View Issue Details

ID: 0016711
Project: CentOS-7
Category: kernel
View Status: public
Last Update: 2019-11-09 08:44
Reporter: bruceleeeee
Priority: high
Severity: crash
Reproducibility: random
Status: new
Resolution: open
Platform: x86_64
OS: CentOS Linux 7 (Core)
OS Version: 7.7
Product Version: 7.7-1908
Target Version:
Fixed in Version:
Summary: 0016711: xfsaild blocks after certain time
Description
Problem similar to https://bugs.centos.org/view.php?id=13843

kernel version: 3.10.0-1062.4.1.el7.x86_64

xfsaild becomes blocked after a period of time, and processes that access the file system then grind to a halt. Chatty I/O applications such as auditd are usually hit first. A reboot clears the problem until the next occurrence. The file system is on a mirrored volume group, and neither the volume group nor the individual disks report any problems.
Additional Information
Nov 8 09:23:28 server1 kernel: [<ffffffff9957eb09>] schedule+0x29/0x70
Nov 8 09:23:28 server1 kernel: [<ffffffffc02661ed>] xlog_state_get_iclog_space+0x10d/0x320 [xfs]
Nov 8 09:23:28 server1 kernel: [<ffffffff993bcf1f>] ? get_target_pstate_use_performance+0x8f/0xc0
Nov 8 09:23:28 server1 kernel: [<ffffffff98eda190>] ? wake_up_state+0x20/0x20
Nov 8 09:23:28 server1 kernel: [<ffffffffc02668d9>] xlog_write+0x1a9/0x750 [xfs]
Nov 8 09:23:28 server1 kernel: [<ffffffffc0263827>] ? kmem_zone_alloc+0x97/0x130 [xfs]
Nov 8 09:23:28 server1 kernel: [<ffffffffc0268708>] xlog_cil_push+0x2a8/0x430 [xfs]
Nov 8 09:23:28 server1 kernel: [<ffffffffc02688a5>] xlog_cil_push_work+0x15/0x20 [xfs]
Nov 8 09:23:28 server1 kernel: [<ffffffff98ebd1df>] process_one_work+0x17f/0x440
Nov 8 09:23:28 server1 kernel: [<ffffffff98ebe448>] worker_thread+0x278/0x3c0
Nov 8 09:23:28 server1 kernel: [<ffffffff98ebe1d0>] ? manage_workers.isra.26+0x2a0/0x2a0
Nov 8 09:23:28 server1 kernel: [<ffffffff98ec51b1>] kthread+0xd1/0xe0
Nov 8 09:23:28 server1 kernel: [<ffffffff98ec50e0>] ? insert_kthread_work+0x40/0x40
Nov 8 09:23:28 server1 kernel: [<ffffffff9958bd37>] ret_from_fork_nospec_begin+0x21/0x21
Nov 8 09:23:28 server1 kernel: [<ffffffff98ec50e0>] ? insert_kthread_work+0x40/0x40
Nov 8 09:23:28 server1 kernel: INFO: task kworker/u32:2:23070 blocked for more than 120 seconds.
Nov 8 09:23:28 server1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 8 09:23:28 server1 kernel: kworker/u32:2 D ffff95e9fadea0e0 0 23070 2 0x00000080
Nov 8 09:23:28 server1 kernel: Workqueue: writeback bdi_writeback_workfn (flush-9:125)
Nov 8 09:23:28 server1 kernel: Call Trace:
Nov 8 09:23:28 server1 kernel: [<ffffffff9957eb09>] schedule+0x29/0x70
Nov 8 09:23:28 server1 kernel: [<ffffffff995804f5>] rwsem_down_read_failed+0x105/0x1c0
Nov 8 09:23:28 server1 kernel: [<ffffffffc023c047>] ? xfs_map_blocks+0x87/0x220 [xfs]
Nov 8 09:23:28 server1 kernel: [<ffffffff991913f8>] call_rwsem_down_read_failed+0x18/0x30
Nov 8 09:23:28 server1 kernel: [<ffffffff9957dc90>] down_read+0x20/0x40
Nov 8 09:23:28 server1 kernel: [<ffffffffc0255d69>] xfs_ilock+0xd9/0x120 [xfs]
Nov 8 09:23:28 server1 kernel: [<ffffffffc023c047>] xfs_map_blocks+0x87/0x220 [xfs]
Nov 8 09:23:28 server1 kernel: [<ffffffffc023d1a4>] xfs_do_writepage+0x174/0x550 [xfs]
Nov 8 09:23:28 server1 kernel: [<ffffffff98fc82ac>] write_cache_pages+0x21c/0x470
Nov 8 09:23:28 server1 kernel: [<ffffffffc023d030>] ? xfs_vm_writepages+0xa0/0xa0 [xfs]
Nov 8 09:23:28 server1 kernel: [<ffffffff9914f710>] ? submit_bio+0x70/0x150
Nov 8 09:23:28 server1 kernel: [<ffffffffc023cffb>] xfs_vm_writepages+0x6b/0xa0 [xfs]
Nov 8 09:23:28 server1 kernel: [<ffffffff98fc92f1>] do_writepages+0x21/0x50
Nov 8 09:23:28 server1 kernel: [<ffffffff99077f70>] __writeback_single_inode+0x40/0x260
Nov 8 09:23:28 server1 kernel: [<ffffffff98ec6265>] ? wake_up_bit+0x25/0x30
Nov 8 09:23:28 server1 kernel: [<ffffffff99078b04>] writeback_sb_inodes+0x1c4/0x430
Nov 8 09:23:28 server1 kernel: [<ffffffff99078e0f>] __writeback_inodes_wb+0x9f/0xd0
Nov 8 09:23:28 server1 kernel: [<ffffffff990792f3>] wb_writeback+0x263/0x2f0
Nov 8 09:23:28 server1 kernel: [<ffffffff9906519c>] ? get_nr_inodes+0x4c/0x70
Nov 8 09:23:28 server1 kernel: [<ffffffff99079eeb>] bdi_writeback_workfn+0x2cb/0x460
Nov 8 09:23:28 server1 kernel: [<ffffffff98ebd1df>] process_one_work+0x17f/0x440
Nov 8 09:23:28 server1 kernel: [<ffffffff98ebe2f6>] worker_thread+0x126/0x3c0
Nov 8 09:23:28 server1 kernel: [<ffffffff98ebe1d0>] ? manage_workers.isra.26+0x2a0/0x2a0
Nov 8 09:23:28 server1 kernel: [<ffffffff98ec51b1>] kthread+0xd1/0xe0
Nov 8 09:23:28 server1 kernel: [<ffffffff98ec50e0>] ? insert_kthread_work+0x40/0x40
Nov 8 09:23:28 server1 kernel: [<ffffffff9958bd37>] ret_from_fork_nospec_begin+0x21/0x21
Nov 8 09:23:28 server1 kernel: [<ffffffff98ec50e0>] ? insert_kthread_work+0x40/0x40
Nov 8 09:23:28 server1 kernel: INFO: task dd:27364 blocked for more than 120 seconds.
Nov 8 09:23:28 server1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 8 09:23:28 server1 kernel: dd D ffff95e5cc4820e0 0 27364 1 0x00000080
Nov 8 09:23:28 server1 kernel: Call Trace:
Nov 8 09:23:28 server1 kernel: [<ffffffff9957eb09>] schedule+0x29/0x70
Nov 8 09:23:28 server1 kernel: [<ffffffff9957c491>] schedule_timeout+0x221/0x2d0
Nov 8 09:23:28 server1 kernel: [<ffffffff98ed6619>] ? ttwu_do_wakeup+0x19/0xe0
Nov 8 09:23:28 server1 kernel: [<ffffffff98ed674f>] ? ttwu_do_activate+0x6f/0x80
Nov 8 09:23:28 server1 kernel: [<ffffffff98ed9ed0>] ? try_to_wake_up+0x190/0x390
Nov 8 09:23:28 server1 kernel: [<ffffffff9957eebd>] wait_for_completion+0xfd/0x140
Nov 8 09:23:28 server1 kernel: [<ffffffff98eda190>] ? wake_up_state+0x20/0x20
Nov 8 09:23:28 server1 kernel: [<ffffffff98ebd96a>] flush_work+0x10a/0x1b0
Nov 8 09:23:28 server1 kernel: [<ffffffff98eba680>] ? move_linked_works+0x90/0x90
Nov 8 09:23:28 server1 kernel: [<ffffffffc02690da>] xlog_cil_force_lsn+0x8a/0x210 [xfs]
Nov 8 09:23:28 server1 kernel: [<ffffffffc02674a4>] _xfs_log_force_lsn+0x74/0x310 [xfs]
Nov 8 09:23:28 server1 kernel: [<ffffffff98fbb22f>] ? filemap_fdatawait_range+0x1f/0x30
Nov 8 09:23:28 server1 kernel: [<ffffffff9957dc82>] ? down_read+0x12/0x40
Nov 8 09:23:28 server1 kernel: [<ffffffffc0247a3d>] xfs_file_fsync+0xfd/0x1c0 [xfs]
Nov 8 09:23:28 server1 kernel: [<ffffffff9907e2a7>] do_fsync+0x67/0xb0
Nov 8 09:23:28 server1 kernel: [<ffffffff9907e5b3>] SyS_fdatasync+0x13/0x20
Nov 8 09:23:28 server1 kernel: [<ffffffff9958bede>] system_call_fastpath+0x25/0x2a
Nov 8 09:23:28 server1 kernel: INFO: task exim:27365 blocked for more than 120 seconds.
Nov 8 09:23:28 server1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 8 09:23:28 server1 kernel: exim D ffff95e95a2e5230 0 27365 27362 0x00000080
Nov 8 09:23:28 server1 kernel: Call Trace:
Nov 8 09:23:28 server1 kernel: [<ffffffff98ed56de>] ? resched_curr+0xae/0xc0
Nov 8 09:23:28 server1 kernel: [<ffffffff98ee49a6>] ? check_preempt_wakeup+0x166/0x250
Nov 8 09:23:28 server1 kernel: [<ffffffff9957eb09>] schedule+0x29/0x70
Nov 8 09:23:28 server1 kernel: [<ffffffff9957c491>] schedule_timeout+0x221/0x2d0
Nov 8 09:23:28 server1 kernel: [<ffffffff98ed6619>] ? ttwu_do_wakeup+0x19/0xe0
Nov 8 09:23:28 server1 kernel: [<ffffffff98ed674f>] ? ttwu_do_activate+0x6f/0x80
Nov 8 09:23:28 server1 kernel: [<ffffffff98ed9ed0>] ? try_to_wake_up+0x190/0x390
Nov 8 09:23:28 server1 kernel: [<ffffffff9957eebd>] wait_for_completion+0xfd/0x140
Nov 8 09:23:28 server1 kernel: [<ffffffff98eda190>] ? wake_up_state+0x20/0x20
Nov 8 09:23:28 server1 kernel: [<ffffffff98ebd96a>] flush_work+0x10a/0x1b0
Nov 8 09:23:28 server1 kernel: [<ffffffff98eba680>] ? move_linked_works+0x90/0x90
Nov 8 09:23:28 server1 kernel: [<ffffffffc02690da>] xlog_cil_force_lsn+0x8a/0x210 [xfs]
Nov 8 09:23:28 server1 kernel: [<ffffffffc02674a4>] _xfs_log_force_lsn+0x74/0x310 [xfs]
Nov 8 09:23:28 server1 kernel: [<ffffffff98fbb22f>] ? filemap_fdatawait_range+0x1f/0x30
Nov 8 09:23:28 server1 kernel: [<ffffffff9957dc82>] ? down_read+0x12/0x40
Nov 8 09:23:28 server1 kernel: [<ffffffffc0247a3d>] xfs_file_fsync+0xfd/0x1c0 [xfs]
Nov 8 09:23:28 server1 kernel: [<ffffffff9907e2a7>] do_fsync+0x67/0xb0
Nov 8 09:23:28 server1 kernel: [<ffffffff9907e590>] SyS_fsync+0x10/0x20
Nov 8 09:23:28 server1 kernel: [<ffffffff9958bede>] system_call_fastpath+0x25/0x2a

Nov 8 09:25:28 server1 kernel: INFO: task xfsaild/md125:820 blocked for more than 120 seconds.
Nov 8 09:25:28 server1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 8 09:25:28 server1 kernel: xfsaild/md125 D ffff95e9fbf69070 0 820 2 0x00000000
Nov 8 09:25:28 server1 kernel: Call Trace:
Nov 8 09:25:28 server1 kernel: [<ffffffff9957eb09>] schedule+0x29/0x70
Nov 8 09:25:28 server1 kernel: [<ffffffffc0267136>] _xfs_log_force+0x1c6/0x2a0 [xfs]
Nov 8 09:25:28 server1 kernel: [<ffffffff98eda190>] ? wake_up_state+0x20/0x20
Nov 8 09:25:28 server1 kernel: [<ffffffffc0273550>] ? xfsaild+0x180/0x760 [xfs]
Nov 8 09:25:28 server1 kernel: [<ffffffffc026723c>] xfs_log_force+0x2c/0x70 [xfs]
Nov 8 09:25:28 server1 kernel: [<ffffffffc02733d0>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
Nov 8 09:25:28 server1 kernel: [<ffffffffc0273550>] xfsaild+0x180/0x760 [xfs]
Nov 8 09:25:28 server1 kernel: [<ffffffffc02733d0>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
Nov 8 09:25:28 server1 kernel: [<ffffffff98ec51b1>] kthread+0xd1/0xe0
Nov 8 09:25:28 server1 kernel: [<ffffffff98ec50e0>] ? insert_kthread_work+0x40/0x40
Nov 8 09:25:28 server1 kernel: [<ffffffff9958bd37>] ret_from_fork_nospec_begin+0x21/0x21
Nov 8 09:25:28 server1 kernel: [<ffffffff98ec50e0>] ? insert_kthread_work+0x40/0x40
Nov 8 09:25:28 server1 kernel: INFO: task auditd:1348 blocked for more than 120 seconds.
Nov 8 09:25:28 server1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 8 09:25:28 server1 kernel: auditd D ffffffff9957e465 0 1348 1 0x00000000
Nov 8 09:25:28 server1 kernel: Call Trace:
Nov 8 09:25:28 server1 kernel: [<ffffffff9957eb09>] schedule+0x29/0x70
Nov 8 09:25:28 server1 kernel: [<ffffffff99580245>] rwsem_down_write_failed+0x215/0x3c0
Nov 8 09:25:28 server1 kernel: [<ffffffffc0263ee4>] ? xlog_grant_head_check+0x54/0x100 [xfs]
Nov 8 09:25:28 server1 kernel: [<ffffffffc025307d>] ? xfs_vn_update_time+0xcd/0x150 [xfs]
Nov 8 09:25:28 server1 kernel: [<ffffffff99191427>] call_rwsem_down_write_failed+0x17/0x30
Nov 8 09:25:28 server1 kernel: [<ffffffff9957dcdd>] down_write+0x2d/0x3d
Nov 8 09:25:28 server1 kernel: [<ffffffffc0255d51>] xfs_ilock+0xc1/0x120 [xfs]
Nov 8 09:25:28 server1 kernel: [<ffffffffc025307d>] xfs_vn_update_time+0xcd/0x150 [xfs]
Nov 8 09:25:28 server1 kernel: [<ffffffff990663a8>] update_time+0x28/0xd0
Nov 8 09:25:28 server1 kernel: [<ffffffff98ed2723>] ? __wake_up+0x13/0x20
Nov 8 09:25:28 server1 kernel: [<ffffffff990664f0>] file_update_time+0xa0/0xf0
Nov 8 09:25:28 server1 kernel: [<ffffffffc0248adf>] xfs_file_aio_write_checks+0x16f/0x1c0 [xfs]
Nov 8 09:25:28 server1 kernel: [<ffffffffc024935a>] xfs_file_buffered_aio_write+0xca/0x2c0 [xfs]
Nov 8 09:25:28 server1 kernel: [<ffffffffc02496dd>] xfs_file_aio_write+0x18d/0x1b0 [xfs]
Nov 8 09:25:28 server1 kernel: [<ffffffff99048433>] do_sync_write+0x93/0xe0
Nov 8 09:25:28 server1 kernel: [<ffffffff99048f20>] vfs_write+0xc0/0x1f0
Nov 8 09:25:28 server1 kernel: [<ffffffff99049d3f>] SyS_write+0x7f/0xf0
Nov 8 09:25:28 server1 kernel: [<ffffffff9958bede>] system_call_fastpath+0x25/0x2a
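
For triage, the hung-task reports above can be condensed to one line per task. The following is only a minimal sketch, assuming the syslog layout shown in this report; the names `summarize_hung_tasks` and `first_waiting_frame` are made up for illustration and are not part of any tool. Frames marked with "?" are skipped because the kernel flags them as unreliable, and frames that appear before the first "INFO: task" header (as at the start of the log above, where the header is cut off) are ignored.

```python
import re
from collections import OrderedDict

# Hung-task header, e.g.
# "INFO: task xfsaild/md125:820 blocked for more than 120 seconds."
# The task name may itself contain ':' (kworker/u32:2), so the name match is
# greedy and the pid is the last ":<digits>" before " blocked".
TASK_RE = re.compile(
    r"INFO: task (?P<name>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)

# Reliable stack frame, e.g. "[<ffffffffc02668d9>] xlog_write+0x1a9/0x750 [xfs]".
# Frames printed as "[<addr>] ? symbol+..." do not match and are skipped.
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\] (?P<sym>[A-Za-z0-9_.]+)\+0x")


def summarize_hung_tasks(lines):
    """Return an ordered mapping (task name, pid) -> reliable frame symbols."""
    tasks = OrderedDict()
    current = None
    for line in lines:
        header = TASK_RE.search(line)
        if header:
            current = (header.group("name"), int(header.group("pid")))
            tasks[current] = []
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None:
            tasks[current].append(frame.group("sym"))
    return tasks


def first_waiting_frame(frames, skip=("schedule", "schedule_timeout")):
    """Return the first frame that is not a generic scheduler entry point."""
    for sym in frames:
        if sym not in skip:
            return sym
    return None


# Small excerpt copied from the log in this report.
sample = """\
Nov 8 09:25:28 server1 kernel: INFO: task xfsaild/md125:820 blocked for more than 120 seconds.
Nov 8 09:25:28 server1 kernel: Call Trace:
Nov 8 09:25:28 server1 kernel: [<ffffffff9957eb09>] schedule+0x29/0x70
Nov 8 09:25:28 server1 kernel: [<ffffffffc0267136>] _xfs_log_force+0x1c6/0x2a0 [xfs]
Nov 8 09:25:28 server1 kernel: [<ffffffff98eda190>] ? wake_up_state+0x20/0x20
""".splitlines()

for (name, pid), frames in summarize_hung_tasks(sample).items():
    print(name, pid, first_waiting_frame(frames))
```

Feeding the full log through the same function (for example, the lines of /var/log/messages) gives a quick view of which tasks are stuck behind log forces and which are waiting on inode locks.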
Tags: 3.10.0-1062.1.1.el7.x86_64, 7.7, centos 7, file system

Activities

There are no notes attached to this issue.

Issue History

Date Modified Username Field Change
2019-11-09 08:44 bruceleeeee New Issue
2019-11-09 08:44 bruceleeeee Tag Attached: 3.10.0-1062.1.1.el7.x86_64
2019-11-09 08:44 bruceleeeee Tag Attached: 7.7
2019-11-09 08:44 bruceleeeee Tag Attached: centos 7
2019-11-09 08:44 bruceleeeee Tag Attached: file system