View Issue Details

ID: 0015399
Project: CentOS-7
Category: kernel
View Status: public
Last Update: 2021-11-19 11:18
Reporter: toastboy
Assigned To:
Priority: normal
Severity: block
Reproducibility: always
Status: new
Resolution: open
Platform: AWS EC2 m5 instance
OS: CentOS 7
OS Version: 7.5.1804
Product Version: 7.5.1804
Summary: 0015399: xfsaild blocks after a while
Description: On kernel 3.10.0-862.11.6.el7.x86_64 #1 SMP Tue Aug 14 21:49:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

After a few days of operation, the NVMe-backed filesystem fails as follows:

Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: INFO: task xfsaild/dm-0:573 blocked for more than 120 seconds.
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: xfsaild/dm-0 D ffff949ddea08fd0 0 573 2 0x00000080
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: Call Trace:
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffa66a697e>] ? try_to_del_timer_sync+0x5e/0x90
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffa6d18db9>] schedule+0x29/0x70
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffc0473e26>] _xfs_log_force+0x1c6/0x2c0 [xfs]
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffa66d1fe0>] ? wake_up_state+0x20/0x20
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffc047ffdc>] ? xfsaild+0x16c/0x6f0 [xfs]
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffc0473f4c>] xfs_log_force+0x2c/0x70 [xfs]
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffc047fe70>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffc047ffdc>] xfsaild+0x16c/0x6f0 [xfs]
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffc047fe70>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffa66bdf21>] kthread+0xd1/0xe0
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffa66bde50>] ? insert_kthread_work+0x40/0x40
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffa6d255f7>] ret_from_fork_nospec_begin+0x21/0x21
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffa66bde50>] ? insert_kthread_work+0x40/0x40
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: INFO: task repl wr.orker 4:2167 blocked for more than 120 seconds.
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: repl wr.orker 4 D ffff949da2b44f10 0 2167 1 0x00000080
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: Call Trace:
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffa67faec2>] ? kmem_cache_alloc+0x1c2/0x1f0
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffc0470387>] ? kmem_zone_alloc+0x97/0x130 [xfs]
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffa6d18db9>] schedule+0x29/0x70
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffa6d1a6a5>] rwsem_down_write_failed+0x225/0x3a0
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffc0470ac4>] ? xlog_grant_head_check+0x54/0x100 [xfs]
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffc045fabd>] ? xfs_vn_update_time+0xcd/0x150 [xfs]
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffa695f2e7>] call_rwsem_down_write_failed+0x17/0x30
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffa6d17f2d>] down_write+0x2d/0x3d
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffc0462864>] xfs_ilock+0xc4/0x120 [xfs]
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffc045fabd>] xfs_vn_update_time+0xcd/0x150 [xfs]
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffa683bb68>] update_time+0x28/0xd0
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffa6d16d40>] ? bit_wait+0x50/0x50
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffa683bcb0>] file_update_time+0xa0/0xf0
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffc0455bd5>] xfs_file_aio_write_checks+0x185/0x1f0 [xfs]
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffc04561fa>] xfs_file_buffered_aio_write+0xca/0x2c0 [xfs]
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffc0462bf4>] ? xfs_iunlock+0x104/0x130 [xfs]
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffc045657d>] xfs_file_aio_write+0x18d/0x1b0 [xfs]
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffa681e6a3>] do_sync_write+0x93/0xe0
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffa681f180>] vfs_write+0xc0/0x1f0
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffa6820172>] SyS_pwrite64+0x92/0xc0
Oct 19 22:31:03 aws-prod-mongocadb-060211.prod.ucds.io kernel: [<ffffffffa6d2579b>] system_call_fastpath+0x22/0x27

This is highly reminiscent of https://bugs.centos.org/view.php?id=13843.

The filesystem is XFS on an LVM-managed partition.
Steps To Reproduce: Create an LVM-backed XFS partition on an EBS volume attached to an M5 EC2 instance on AWS.
Only 5-series AWS instances exhibit this issue. Their EBS volumes are exposed as /dev/nvmeXnX.
M4 instances do not exhibit this issue; their EBS volumes are exposed as /dev/xvdX.
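
To check which device naming scheme a given instance uses, something like the following sketch may help. It classifies block device names as NVMe-style (/dev/nvmeXnX) or Xen-style (/dev/xvdX) as described above. The function names and the exact regular expressions are illustrative assumptions, not part of any CentOS or AWS tool.

```python
import os
import re

# Assumed naming patterns, based only on the /dev/nvmeXnX and /dev/xvdX
# forms mentioned in this report (optional partition suffixes included).
NVME_RE = re.compile(r"^nvme\d+n\d+(p\d+)?$")
XEN_RE = re.compile(r"^xvd[a-z]+\d*$")


def classify_block_device(name):
    """Return 'nvme', 'xen', or 'other' for a device name or /dev path."""
    base = name.rsplit("/", 1)[-1]
    if NVME_RE.match(base):
        return "nvme"
    if XEN_RE.match(base):
        return "xen"
    return "other"


def local_devices(sys_block="/sys/block"):
    """Classify every device listed under /sys/block, if that directory exists."""
    if not os.path.isdir(sys_block):
        return {}
    return {dev: classify_block_device(dev) for dev in sorted(os.listdir(sys_block))}
```

Calling `local_devices()` on an affected M5 host would be expected to show the EBS volumes as `nvme`, while device-mapper entries such as `dm-0` fall under `other`.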

Tags: 7.5, lvm, xfs

Activities

toastboy

2018-10-25 12:54

reporter   ~0032992

dmesg output from a recently failed node

[832321.822415] INFO: task xfsaild/dm-7:576 blocked for more than 120 seconds.
[832321.823501] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[832321.824670] xfsaild/dm-7 D ffff8fba5b759fa0 0 576 2 0x00000080
[832321.825752] Call Trace:
[832321.826144] [<ffffffff92ea697e>] ? try_to_del_timer_sync+0x5e/0x90
[832321.827055] [<ffffffff93518db9>] schedule+0x29/0x70
[832321.827825] [<ffffffffc0455e26>] _xfs_log_force+0x1c6/0x2c0 [xfs]
[832321.828740] [<ffffffff92ed1fe0>] ? wake_up_state+0x20/0x20
[832321.829594] [<ffffffffc0461fdc>] ? xfsaild+0x16c/0x6f0 [xfs]
[832321.830456] [<ffffffffc0455f4c>] xfs_log_force+0x2c/0x70 [xfs]
[832321.831331] [<ffffffffc0461e70>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[832321.832391] [<ffffffffc0461fdc>] xfsaild+0x16c/0x6f0 [xfs]
[832321.833219] [<ffffffffc0461e70>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[832321.834263] [<ffffffff92ebdf21>] kthread+0xd1/0xe0
[832321.834987] [<ffffffff92ebde50>] ? insert_kthread_work+0x40/0x40
[832321.835881] [<ffffffff935255f7>] ret_from_fork_nospec_begin+0x21/0x21
[832321.836845] [<ffffffff92ebde50>] ? insert_kthread_work+0x40/0x40
[832321.837746] INFO: task mongod:1350 blocked for more than 120 seconds.
[832321.838682] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[832321.839799] mongod D ffff8fba3bf5dee0 0 1350 1 0x00000080
[832321.840869] Call Trace:
[832321.841265] [<ffffffff92ffaec2>] ? kmem_cache_alloc+0x1c2/0x1f0
[832321.842159] [<ffffffffc0452387>] ? kmem_zone_alloc+0x97/0x130 [xfs]
[832321.843085] [<ffffffff93518db9>] schedule+0x29/0x70
[832321.843838] [<ffffffff9351a6a5>] rwsem_down_write_failed+0x225/0x3a0
[832321.844787] [<ffffffffc0452ac4>] ? xlog_grant_head_check+0x54/0x100 [xfs]
[832321.845799] [<ffffffffc0441abd>] ? xfs_vn_update_time+0xcd/0x150 [xfs]
[832321.846764] [<ffffffff9315f2e7>] call_rwsem_down_write_failed+0x17/0x30
[832321.847742] [<ffffffff93517f2d>] down_write+0x2d/0x3d
[832321.848522] [<ffffffffc0444864>] xfs_ilock+0xc4/0x120 [xfs]
[832321.849353] [<ffffffffc0441abd>] xfs_vn_update_time+0xcd/0x150 [xfs]
[832321.850294] [<ffffffff9303bb68>] update_time+0x28/0xd0
[832321.851086] [<ffffffff9303bcb0>] file_update_time+0xa0/0xf0
[832321.851932] [<ffffffffc0437bd5>] xfs_file_aio_write_checks+0x185/0x1f0 [xfs]
[832321.852975] [<ffffffffc04381fa>] xfs_file_buffered_aio_write+0xca/0x2c0 [xfs]
[832321.854025] [<ffffffffc043857d>] xfs_file_aio_write+0x18d/0x1b0 [xfs]
[832321.854977] [<ffffffff9301e6a3>] do_sync_write+0x93/0xe0
[832321.855775] [<ffffffff9301f180>] vfs_write+0xc0/0x1f0
[832321.856551] [<ffffffff93020172>] SyS_pwrite64+0x92/0xc0
[832321.857324] [<ffffffff9352579b>] system_call_fastpath+0x22/0x27
[832321.858236] INFO: task ApplyBa.Journal:2178 blocked for more than 120 seconds.
[832321.859307] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[832321.860456] ApplyBa.Journal D ffff8fba1dea0fd0 0 2178 1 0x00000080
[832321.861530] Call Trace:
[832321.861918] [<ffffffff93518db9>] schedule+0x29/0x70
[832321.862669] [<ffffffffc0456378>] _xfs_log_force_lsn+0x2e8/0x340 [xfs]
[832321.863618] [<ffffffff92ed1fe0>] ? wake_up_state+0x20/0x20
[832321.864445] [<ffffffffc0436b97>] xfs_file_fsync+0x107/0x1e0 [xfs]
[832321.865344] [<ffffffff930531b7>] do_fsync+0x67/0xb0
[832321.866080] [<ffffffff930534c3>] SyS_fdatasync+0x13/0x20
[832321.866871] [<ffffffff9352579b>] system_call_fastpath+0x22/0x27
[832321.867761] INFO: task oneagentloganal:30943 blocked for more than 120 seconds.
[832321.868850] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[832321.869977] oneagentloganal D ffff8fba5eed1fa0 0 30943 28314 0x00000080
[832321.871044] Call Trace:
[832321.871438] [<ffffffff93518db9>] schedule+0x29/0x70
[832321.872165] [<ffffffff9351a3ed>] rwsem_down_read_failed+0x10d/0x1a0
[832321.873101] [<ffffffffc0444934>] ? xfs_ilock_attr_map_shared+0x34/0x40 [xfs]
[832321.874138] [<ffffffff9315f2b8>] call_rwsem_down_read_failed+0x18/0x30
[832321.875106] [<ffffffff93517ee0>] down_read+0x20/0x40
[832321.875867] [<ffffffffc044487c>] xfs_ilock+0xdc/0x120 [xfs]
[832321.876718] [<ffffffffc0444934>] xfs_ilock_attr_map_shared+0x34/0x40 [xfs]
[832321.877740] [<ffffffffc03f5958>] xfs_attr_get+0xd8/0x1a0 [xfs]
[832321.878628] [<ffffffffc0451b0d>] xfs_xattr_get+0x3d/0x80 [xfs]
[832321.879513] [<ffffffff930467b2>] generic_getxattr+0x52/0x70
[832321.880348] [<ffffffff930d2ee0>] get_vfs_caps_from_disk+0x70/0x180
[832321.881266] [<ffffffff92f2bd6d>] audit_copy_inode+0x6d/0xb0
[832321.882091] [<ffffffff92f329ba>] __audit_inode+0x18a/0x3c0
[832321.882909] [<ffffffff9302e7dc>] filename_lookup+0x7c/0xc0
[832321.883759] [<ffffffff93031bc7>] user_path_at_empty+0x67/0xc0
[832321.884615] [<ffffffff92f27b42>] ? from_kgid_munged+0x12/0x20
[832321.885476] [<ffffffff9302511f>] ? cp_new_stat+0x14f/0x180
[832321.886319] [<ffffffff93031c31>] user_path_at+0x11/0x20
[832321.887119] [<ffffffff93024c13>] vfs_fstatat+0x63/0xc0
[832321.887904] [<ffffffff9302517e>] SYSC_newstat+0x2e/0x60
[832321.888701] [<ffffffff9302545e>] SyS_newstat+0xe/0x10
[832321.889472] [<ffffffff9352579b>] system_call_fastpath+0x22/0x27
[832441.890621] INFO: task xfsaild/dm-7:576 blocked for more than 120 seconds.
[832441.891708] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[832441.892864] xfsaild/dm-7 D ffff8fba5b759fa0 0 576 2 0x00000080
[832441.893976] Call Trace:
[832441.894381] [<ffffffff92ea697e>] ? try_to_del_timer_sync+0x5e/0x90
[832441.895362] [<ffffffff93518db9>] schedule+0x29/0x70
[832441.896139] [<ffffffffc0455e26>] _xfs_log_force+0x1c6/0x2c0 [xfs]
[832441.897077] [<ffffffff92ed1fe0>] ? wake_up_state+0x20/0x20
[832441.897918] [<ffffffffc0461fdc>] ? xfsaild+0x16c/0x6f0 [xfs]
[832441.898784] [<ffffffffc0455f4c>] xfs_log_force+0x2c/0x70 [xfs]
[832441.899668] [<ffffffffc0461e70>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[832441.900715] [<ffffffffc0461fdc>] xfsaild+0x16c/0x6f0 [xfs]
[832441.901556] [<ffffffffc0461e70>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[832441.902614] [<ffffffff92ebdf21>] kthread+0xd1/0xe0
[832441.903357] [<ffffffff92ebde50>] ? insert_kthread_work+0x40/0x40
[832441.904261] [<ffffffff935255f7>] ret_from_fork_nospec_begin+0x21/0x21
[832441.905217] [<ffffffff92ebde50>] ? insert_kthread_work+0x40/0x40
[832441.906132] INFO: task mongod:1350 blocked for more than 120 seconds.
[832441.907089] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[832441.908209] mongod D ffff8fba3bf5dee0 0 1350 1 0x00000080
[832441.909287] Call Trace:
[832441.909691] [<ffffffff92ffaec2>] ? kmem_cache_alloc+0x1c2/0x1f0
[832441.910574] [<ffffffffc0452387>] ? kmem_zone_alloc+0x97/0x130 [xfs]
[832441.911510] [<ffffffff93518db9>] schedule+0x29/0x70
[832441.912254] [<ffffffff9351a6a5>] rwsem_down_write_failed+0x225/0x3a0
[832441.913215] [<ffffffffc0452ac4>] ? xlog_grant_head_check+0x54/0x100 [xfs]
[832441.914224] [<ffffffffc0441abd>] ? xfs_vn_update_time+0xcd/0x150 [xfs]
[832441.915187] [<ffffffff9315f2e7>] call_rwsem_down_write_failed+0x17/0x30
[832441.916176] [<ffffffff93517f2d>] down_write+0x2d/0x3d
[832441.916946] [<ffffffffc0444864>] xfs_ilock+0xc4/0x120 [xfs]
[832441.917785] [<ffffffffc0441abd>] xfs_vn_update_time+0xcd/0x150 [xfs]
[832441.918739] [<ffffffff9303bb68>] update_time+0x28/0xd0
[832441.919509] [<ffffffff9303bcb0>] file_update_time+0xa0/0xf0
[832441.920347] [<ffffffffc0437bd5>] xfs_file_aio_write_checks+0x185/0x1f0 [xfs]
[832441.921399] [<ffffffffc04381fa>] xfs_file_buffered_aio_write+0xca/0x2c0 [xfs]
[832441.922469] [<ffffffffc043857d>] xfs_file_aio_write+0x18d/0x1b0 [xfs]
[832441.923425] [<ffffffff9301e6a3>] do_sync_write+0x93/0xe0
[832441.924224] [<ffffffff9301f180>] vfs_write+0xc0/0x1f0
[832441.924990] [<ffffffff93020172>] SyS_pwrite64+0x92/0xc0
[832441.925777] [<ffffffff9352579b>] system_call_fastpath+0x22/0x27
[832441.926650] INFO: task ApplyBa.Journal:2178 blocked for more than 120 seconds.
[832441.927679] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[832441.928816] ApplyBa.Journal D ffff8fba1dea0fd0 0 2178 1 0x00000080
[832441.929884] Call Trace:
[832441.930291] [<ffffffff93518db9>] schedule+0x29/0x70
[832441.931039] [<ffffffffc0456378>] _xfs_log_force_lsn+0x2e8/0x340 [xfs]
[832441.932089] [<ffffffff92ed1fe0>] ? wake_up_state+0x20/0x20
[832441.932960] [<ffffffffc0436b97>] xfs_file_fsync+0x107/0x1e0 [xfs]
[832441.933863] [<ffffffff930531b7>] do_fsync+0x67/0xb0
[832441.934604] [<ffffffff930534c3>] SyS_fdatasync+0x13/0x20
[832441.935443] [<ffffffff9352579b>] system_call_fastpath+0x22/0x27
[832441.936398] INFO: task oneagentloganal:30943 blocked for more than 120 seconds.
[832441.937490] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[832441.938671] oneagentloganal D ffff8fba5eed1fa0 0 30943 28314 0x00000080
[832441.939771] Call Trace:
[832441.940161] [<ffffffff93518db9>] schedule+0x29/0x70
[832441.940896] [<ffffffff9351a3ed>] rwsem_down_read_failed+0x10d/0x1a0
[832441.941839] [<ffffffffc0444934>] ? xfs_ilock_attr_map_shared+0x34/0x40 [xfs]
[832441.942860] [<ffffffff9315f2b8>] call_rwsem_down_read_failed+0x18/0x30
[832441.943810] [<ffffffff93517ee0>] down_read+0x20/0x40
[832441.944559] [<ffffffffc044487c>] xfs_ilock+0xdc/0x120 [xfs]
[832441.945405] [<ffffffffc0444934>] xfs_ilock_attr_map_shared+0x34/0x40 [xfs]
[832441.946420] [<ffffffffc03f5958>] xfs_attr_get+0xd8/0x1a0 [xfs]
[832441.947291] [<ffffffffc0451b0d>] xfs_xattr_get+0x3d/0x80 [xfs]
[832441.948162] [<ffffffff930467b2>] generic_getxattr+0x52/0x70
[832441.949017] [<ffffffff930d2ee0>] get_vfs_caps_from_disk+0x70/0x180
[832441.949944] [<ffffffff92f2bd6d>] audit_copy_inode+0x6d/0xb0
[832441.950771] [<ffffffff92f329ba>] __audit_inode+0x18a/0x3c0
[832441.951583] [<ffffffff9302e7dc>] filename_lookup+0x7c/0xc0
[832441.952399] [<ffffffff93031bc7>] user_path_at_empty+0x67/0xc0
[832441.953282] [<ffffffff92f27b42>] ? from_kgid_munged+0x12/0x20
[832441.954131] [<ffffffff9302511f>] ? cp_new_stat+0x14f/0x180
[832441.954952] [<ffffffff93031c31>] user_path_at+0x11/0x20
[832441.955725] [<ffffffff93024c13>] vfs_fstatat+0x63/0xc0
[832441.956480] [<ffffffff9302517e>] SYSC_newstat+0x2e/0x60
[832441.957253] [<ffffffff9302545e>] SyS_newstat+0xe/0x10
[832441.958014] [<ffffffff9352579b>] system_call_fastpath+0x22/0x27
[832561.958824] INFO: task xfsaild/dm-7:576 blocked for more than 120 seconds.
[832561.959965] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[832561.961151] xfsaild/dm-7 D ffff8fba5b759fa0 0 576 2 0x00000080
[832561.962300] Call Trace:
[832561.962718] [<ffffffff92ea697e>] ? try_to_del_timer_sync+0x5e/0x90
[832561.963687] [<ffffffff93518db9>] schedule+0x29/0x70
[832561.964491] [<ffffffffc0455e26>] _xfs_log_force+0x1c6/0x2c0 [xfs]
[832561.965487] [<ffffffff92ed1fe0>] ? wake_up_state+0x20/0x20
[832561.966377] [<ffffffffc0461fdc>] ? xfsaild+0x16c/0x6f0 [xfs]
[832561.967285] [<ffffffffc0455f4c>] xfs_log_force+0x2c/0x70 [xfs]
[832561.968214] [<ffffffffc0461e70>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[832561.969340] [<ffffffffc0461fdc>] xfsaild+0x16c/0x6f0 [xfs]
[832561.970221] [<ffffffffc0461e70>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[832561.971330] [<ffffffff92ebdf21>] kthread+0xd1/0xe0
[832561.972101] [<ffffffff92ebde50>] ? insert_kthread_work+0x40/0x40
[832561.973057] [<ffffffff935255f7>] ret_from_fork_nospec_begin+0x21/0x21
[832561.974084] [<ffffffff92ebde50>] ? insert_kthread_work+0x40/0x40
[832561.975022] INFO: task mongod:1350 blocked for more than 120 seconds.
[832561.976004] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[832561.977188] mongod D ffff8fba3bf5dee0 0 1350 1 0x00000080
[832561.978309] Call Trace:
[832561.978721] [<ffffffff92ffaec2>] ? kmem_cache_alloc+0x1c2/0x1f0
[832561.979654] [<ffffffffc0452387>] ? kmem_zone_alloc+0x97/0x130 [xfs]
[832561.980636] [<ffffffff93518db9>] schedule+0x29/0x70
[832561.981411] [<ffffffff9351a6a5>] rwsem_down_write_failed+0x225/0x3a0
[832561.982397] [<ffffffffc0452ac4>] ? xlog_grant_head_check+0x54/0x100 [xfs]
[832561.983455] [<ffffffffc0441abd>] ? xfs_vn_update_time+0xcd/0x150 [xfs]
[832561.984439] [<ffffffff9315f2e7>] call_rwsem_down_write_failed+0x17/0x30
[832561.985420] [<ffffffff93517f2d>] down_write+0x2d/0x3d
[832561.986217] [<ffffffffc0444864>] xfs_ilock+0xc4/0x120 [xfs]
[832561.987089] [<ffffffffc0441abd>] xfs_vn_update_time+0xcd/0x150 [xfs]
[832561.988193] [<ffffffff9303bb68>] update_time+0x28/0xd0
[832561.989015] [<ffffffff9303bcb0>] file_update_time+0xa0/0xf0
[832561.989893] [<ffffffffc0437bd5>] xfs_file_aio_write_checks+0x185/0x1f0 [xfs]
[832561.990981] [<ffffffffc04381fa>] xfs_file_buffered_aio_write+0xca/0x2c0 [xfs]
[832561.992112] [<ffffffffc043857d>] xfs_file_aio_write+0x18d/0x1b0 [xfs]
[832561.993131] [<ffffffff9301e6a3>] do_sync_write+0x93/0xe0
[832561.993957] [<ffffffff9301f180>] vfs_write+0xc0/0x1f0
[832561.994749] [<ffffffff93020172>] SyS_pwrite64+0x92/0xc0
[832561.995562] [<ffffffff9352579b>] system_call_fastpath+0x22/0x27
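
Because the same tasks are reported every 120 seconds, it can help to condense the output before comparing nodes. The sketch below pulls the hung-task reports out of dmesg or syslog text and, for each one, reports the first reliable stack frame after `schedule`, skipping frames marked with `?`. The function names and the choice of "first frame after schedule" as the wait site are assumptions made for illustration; this is not a kernel-provided tool.

```python
import re

# Matches both "[832321.822415] INFO: task ..." and syslog-prefixed lines.
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)
# Stack frame lines look like "[<ffffffff93518db9>] schedule+0x29/0x70";
# a "? " before the symbol marks a frame the unwinder considers unreliable.
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\] (?P<unreliable>\? )?(?P<sym>[^\s+]+)\+0x")


def parse_hung_tasks(text):
    """Return one dict per hung-task report, with its reliable frames in order."""
    reports = []
    current = None
    for line in text.splitlines():
        m = HUNG_RE.search(line)
        if m:
            current = {
                "comm": m.group("comm"),
                "pid": int(m.group("pid")),
                "secs": int(m.group("secs")),
                "frames": [],
            }
            reports.append(current)
            continue
        f = FRAME_RE.search(line)
        if f and current is not None and not f.group("unreliable"):
            current["frames"].append(f.group("sym"))
    return reports


def wait_site(report):
    """First reliable frame that is not schedule(), or None if there is none."""
    for sym in report["frames"]:
        if sym != "schedule":
            return sym
    return None
```

Applied to the trace above, this should show `xfsaild/dm-7` waiting in `_xfs_log_force`, `mongod` in `rwsem_down_write_failed`, and `ApplyBa.Journal` in `_xfs_log_force_lsn`.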
HxGao

2021-11-19 09:32

reporter   ~0038743

We are affected too, on CentOS Linux release 7.6.1810 (kernel 3.10.0-957.el7.x86_64).
It happened on multiple mission-critical servers with XFS on LVM.
ManuelWolfshant

2021-11-19 10:28

manager   ~0038744

Both kernels, 3.10.0-957.el7.x86_64 and 3.10.0-862.11.6.el7.x86_64, have been unsupported for many years. Please update your installation (and especially the kernel, as we do not support cherry-picking updates) to the latest available one (currently kernel-3.10.0-1160.42.2.el7.x86_64) and let us know if the problem persists.
In any case, CentOS is a rebuild of the sources used to create RHEL, so please submit your request to Red Hat via bugzilla.redhat.com. If and when Red Hat accepts it, incorporates it into RHEL, and releases a patched version, CentOS will pick it up automatically.
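
A quick way to see whether a host is still behind the kernel named above is to compare release strings numerically. The sketch below is a simplified comparison that only looks at the numeric fields before `.el7`; it is an assumption-laden shortcut, not the full RPM version comparison algorithm, and the helper names are hypothetical.

```python
import re

# Kernel release quoted in the note above as the latest available at the time.
SUGGESTED = "3.10.0-1160.42.2.el7.x86_64"


def kernel_tuple(release):
    """Turn '3.10.0-862.11.6.el7.x86_64' into (3, 10, 0, 862, 11, 6)."""
    head = release.split(".el", 1)[0]  # drop the dist tag and architecture
    return tuple(int(n) for n in re.findall(r"\d+", head))


def is_older(release, baseline=SUGGESTED):
    """True if `release` sorts before `baseline` under this simplified scheme."""
    return kernel_tuple(release) < kernel_tuple(baseline)
```

On a running system, `is_older(platform.release())` (after `import platform`) would check the booted kernel; both kernels mentioned in this issue compare as older than the suggested one.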
tru

2021-11-19 11:18

administrator   ~0038745

Also, use the CentOS AMI images from https://www.centos.org/download/aws-images/

Issue History

Date Modified Username Field Change
2018-10-22 15:14 toastboy New Issue
2018-10-22 15:14 toastboy Tag Attached: 7.5
2018-10-22 15:17 toastboy Tag Attached: xfs
2018-10-25 12:54 toastboy Note Added: 0032992
2021-11-19 08:31 HxGao Tag Attached: lvm
2021-11-19 08:31 HxGao Tag Detached: lvm
2021-11-19 08:31 HxGao Tag Attached: lvm
2021-11-19 09:32 HxGao Note Added: 0038743
2021-11-19 10:28 ManuelWolfshant Note Added: 0038744
2021-11-19 11:18 tru Note Added: 0038745