View Issue Details

ID: 0013843
Project: CentOS-7
Category: kernel
View Status: public
Last Update: 2018-09-15 17:32
Reporter: thompsop
Priority: high
Severity: minor
Reproducibility: always
Status: resolved
Resolution: fixed
Platform: x86_64
OS Version: 7.4.1708
Product Version: 7.3.1611
Target Version:
Fixed in Version:
Summary: 0013843: xfsaild blocks after certain time
Description: After updating to vmlinuz-3.10.0-693.2.2.el7.x86_64, we started to have a problem where xfsaild becomes blocked after a period of time and processes wanting to access the file system begin to grind to a halt. It most often hits chatty I/O applications such as auditd first. A reboot solves the problem until the next time. The file system is in a mirrored volume group, and neither the volume group nor the individual disks report any problems. Reverting to vmlinuz-3.10.0-514.26.2.el7.x86_64 seems to make the problem go away.
Steps To Reproduce: Run vmlinuz-3.10.0-693.2.2.el7.x86_64 and wait a variable amount of time, usually less than 12 hours.
Additional Information:
Sep 15 02:23:36 lol1093 kernel: INFO: task touch:9955 blocked for more than 120 seconds.
Sep 15 02:23:36 lol1093 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 15 02:23:36 lol1093 kernel: touch D ffff8801fbffcf10 0 9955 9951 0x00000080
Sep 15 02:23:36 lol1093 kernel: ffff88017306f6f0 0000000000000082 ffff8801fbffcf10 ffff88017306ffd8
Sep 15 02:23:36 lol1093 kernel: ffff88017306ffd8 ffff88017306ffd8 ffff8801fbffcf10 ffff880210e307b0
Sep 15 02:23:36 lol1093 kernel: 7fffffffffffffff 0000000000000002 0000000000000000 ffff8801fbffcf10
Sep 15 02:23:36 lol1093 kernel: Call Trace:
Sep 15 02:23:36 lol1093 kernel: [<ffffffff816a94e9>] schedule+0x29/0x70
Sep 15 02:23:36 lol1093 kernel: [<ffffffff816a6ff9>] schedule_timeout+0x239/0x2c0
Sep 15 02:23:36 lol1093 kernel: [<ffffffff816a8887>] __down_common+0xaa/0x104
Sep 15 02:23:36 lol1093 kernel: [<ffffffff810d1223>] ? find_busiest_group+0x143/0x980
Sep 15 02:23:36 lol1093 kernel: [<ffffffffc0444410>] ? _xfs_buf_find+0x170/0x330 [xfs]
Sep 15 02:23:36 lol1093 kernel: [<ffffffff816a88fe>] __down+0x1d/0x1f
Sep 15 02:23:36 lol1093 kernel: [<ffffffff810b6691>] down+0x41/0x50
Sep 15 02:23:36 lol1093 kernel: [<ffffffffc044420c>] xfs_buf_lock+0x3c/0xd0 [xfs]
Sep 15 02:23:36 lol1093 kernel: [<ffffffffc0444410>] _xfs_buf_find+0x170/0x330 [xfs]
Sep 15 02:23:36 lol1093 kernel: [<ffffffffc04445fa>] xfs_buf_get_map+0x2a/0x240 [xfs]
Sep 15 02:23:36 lol1093 kernel: [<ffffffffc04451a0>] xfs_buf_read_map+0x30/0x160 [xfs]
Sep 15 02:23:36 lol1093 kernel: [<ffffffffc0474ec1>] xfs_trans_read_buf_map+0x211/0x400 [xfs]
Sep 15 02:23:36 lol1093 kernel: [<ffffffffc0434bad>] xfs_read_agi+0x9d/0x110 [xfs]
Sep 15 02:23:36 lol1093 kernel: [<ffffffffc0434c54>] xfs_ialloc_read_agi+0x34/0xd0 [xfs]
Sep 15 02:23:36 lol1093 kernel: [<ffffffffc04356a8>] xfs_dialloc+0xe8/0x280 [xfs]
Sep 15 02:23:36 lol1093 kernel: [<ffffffffc0457581>] xfs_ialloc+0x71/0x530 [xfs]
Sep 15 02:23:36 lol1093 kernel: [<ffffffffc0464b94>] ? xlog_grant_head_check+0x54/0x100 [xfs]
Sep 15 02:23:36 lol1093 kernel: [<ffffffffc0457ab3>] xfs_dir_ialloc+0x73/0x1f0 [xfs]
Sep 15 02:23:36 lol1093 kernel: [<ffffffff816a87b2>] ? down_write+0x12/0x3d
Sep 15 02:23:36 lol1093 kernel: [<ffffffffc045837e>] xfs_create+0x43e/0x6c0 [xfs]
Sep 15 02:23:36 lol1093 kernel: [<ffffffffc045522b>] xfs_vn_mknod+0xbb/0x240 [xfs]
Sep 15 02:23:36 lol1093 kernel: [<ffffffffc04553e3>] xfs_vn_create+0x13/0x20 [xfs]
Sep 15 02:23:36 lol1093 kernel: [<ffffffff8120d60d>] vfs_create+0xcd/0x130
Sep 15 02:23:36 lol1093 kernel: [<ffffffff8121079a>] do_last+0x10ea/0x12c0
Sep 15 02:23:36 lol1093 kernel: [<ffffffff81210a32>] path_openat+0xc2/0x490
Sep 15 02:23:36 lol1093 kernel: [<ffffffff8118295b>] ? unlock_page+0x2b/0x30
Sep 15 02:23:36 lol1093 kernel: [<ffffffff811ad6a6>] ? do_read_fault.isra.44+0xe6/0x130
Sep 15 02:23:36 lol1093 kernel: [<ffffffff81212fcb>] do_filp_open+0x4b/0xb0
Sep 15 02:23:36 lol1093 kernel: [<ffffffff8111f757>] ? __audit_getname+0x97/0xb0
Sep 15 02:23:36 lol1093 kernel: [<ffffffff8122022a>] ? __alloc_fd+0x8a/0x130
Sep 15 02:23:36 lol1093 kernel: [<ffffffff811ffc13>] do_sys_open+0xf3/0x1f0
Sep 15 02:23:36 lol1093 kernel: [<ffffffff816b0325>] ? do_page_fault+0x35/0x90
Sep 15 02:23:36 lol1093 kernel: [<ffffffff811ffd2e>] SyS_open+0x1e/0x20
Sep 15 02:23:36 lol1093 kernel: [<ffffffff816b5009>] system_call_fastpath+0x16/0x1b
Tags: 7.4, file system, upstream, xfs
abrt_hash:
URL:

Activities

thompsop

2017-10-18 11:34

reporter   ~0030400

Here's a similar, recent, but not identical problem
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1681914
thompsop

2017-10-25 21:44

reporter   ~0030468

The problem persists on 3.10.0-693.5.2.el7.x86_64; I must revert to 3.10.0-514.26.2.el7.x86_64, and then the machine will run fine for weeks.
Neil Mukerji

2017-12-03 10:21

reporter   ~0030693

I'd like to add that I'm experiencing this issue on a number (10+) of machines. My experiments also show that the instability started with kernel 3.10.0-693.2.2.el7.x86_64 and persists with 3.10.0-693.5.2.el7.x86_64; regressing to kernel 3.10.0-514.26.2.el7.x86_64 brings stability.

We run software RAID 1 and usually the 10G /tmp partition is the one to lock up. The issue only occurs with SSD drives; our servers with traditional drives never encounter this problem. We use systemd's weekly timer to trim the disks, and we have tried also trimming daily in case that was relevant. Anecdotally it feels like these trims have reduced the frequency of xfsaild lock-ups, but they are still occurring. The only fix we've found so far is to regress the kernel, which isn't okay in the long term.
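
For anyone wanting to check the trim schedule described above, something along these lines should work on a stock CentOS 7 install (a sketch, assuming the fstrim.timer/fstrim.service units shipped with util-linux are what drives the weekly trim):

# systemctl list-timers fstrim.timer   # when the next scheduled trim will run
# systemctl status fstrim.timer
# fstrim -v /tmp                       # one-off manual trim of the affected mount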
thompsop

2017-12-03 18:29

reporter   ~0030694

I have found running the current debug kernel masks the problem.
3.10.0-693.5.2.el7.x86_64.debug is stable for me, for instance.
However if the machine accidentally boots back to the normal kernel the problem returns.
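
For reference, a minimal sketch of installing and booting the matching debug kernel on CentOS 7 (the kernel-debug package provides the .debug variant; take the exact menu entry title from your own grub.cfg rather than the example below):

# yum install kernel-debug
# grep ^menuentry /boot/grub2/grub.cfg | cut -d "'" -f2      # list boot entries
# grub2-set-default 'CentOS Linux (3.10.0-693.5.2.el7.x86_64.debug) 7 (Core)'
# reboot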
gregb

2017-12-15 21:49

reporter   ~0030761

I'm not sure if this is related, but we just had this happen on one of our machines today - a task became blocked by xfsaild, and logins to the machine became blocked at that point too; reboot was required to fix.

This may be an upstream XFS issue -- we were trying a plain vanilla 4.14.6 kernel at the time because of a different XFS issue (the plain vanilla kernel did not fix the other issue, but this lockup happened).
smuku@salesforce.com

2018-01-05 14:14

reporter   ~0030855

We are seeing the exact same behaviour on multiple hosts running this kernel, with processes getting blocked by xfsaild.
gbeeley

2018-01-05 15:13

reporter   ~0030856

I was able to reproduce this (or a similar) issue on demand with 4.14.6 through the following procedure:

0. Be running vanilla 4.14.6.
1. Have a mounted and active XFS filesystem.
2. Go to /sys/devices/system/cpu and disable all but cpu0 by writing "0" to cpuXX/online for each cpu other than cpu0.
3. Re-enable one other cpu by writing "1" to cpuXX/online for that cpu. The selection of which cpu to re-enable *might* depend on hyperthreading and/or NUMA details, but I have not confirmed that yet.
4. Observe system lockup apparently due to xfsaild.
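
A sketch of the above as shell commands (assumes root; cpu1 is just an example of the core being re-enabled, adjust for your topology):

# cd /sys/devices/system/cpu
# for c in cpu[1-9]*; do echo 0 > $c/online; done    # step 2: offline everything except cpu0
# echo 1 > cpu1/online                               # step 3: bring one other cpu back
# dmesg -w | grep -i 'blocked for more'              # step 4: watch for hung-task reports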
joerg

2018-01-22 12:11

reporter   ~0030981

I am witnessing the same failure after updating from
vmlinuz-3.10.0-514.16.1.el7.x86_64
to
vmlinuz-3.10.0-693.11.6.el7.x86_64

We are also running on an mdadm RAID 1:

[284165.322308] INFO: task xfsaild/md127:525 blocked for more than 120 seconds.
[284165.322667] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[284165.323070] xfsaild/md127 D ffff881039386eb0 0 525 2 0x00000000
[284165.323081] Call Trace:
[284165.323100] [<ffffffff8109ae1e>] ? try_to_del_timer_sync+0x5e/0x90
[284165.323115] [<ffffffff816ab6d9>] schedule+0x29/0x70
[284165.323307] [<ffffffffc0385d36>] _xfs_log_force+0x1c6/0x2c0 [xfs]
[284165.323339] [<ffffffff810c6440>] ? wake_up_state+0x20/0x20
[284165.323503] [<ffffffffc0391bbc>] ? xfsaild+0x16c/0x6f0 [xfs]
[284165.323685] [<ffffffffc0385e5c>] xfs_log_force+0x2c/0x70 [xfs]
[284165.323868] [<ffffffffc0391a50>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[284165.324048] [<ffffffffc0391bbc>] xfsaild+0x16c/0x6f0 [xfs]
[284165.324235] [<ffffffffc0391a50>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[284165.324264] [<ffffffff810b252f>] kthread+0xcf/0xe0
[284165.324278] [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
[284165.324294] [<ffffffff816b8798>] ret_from_fork+0x58/0x90
[284165.324310] [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
[284285.326264] INFO: task xfsaild/md127:525 blocked for more than 120 seconds.
[284285.326594] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[284285.326912] xfsaild/md127 D ffff881039386eb0 0 525 2 0x00000000
[284285.326923] Call Trace:
[284285.326942] [<ffffffff8109ae1e>] ? try_to_del_timer_sync+0x5e/0x90
[284285.326956] [<ffffffff816ab6d9>] schedule+0x29/0x70
[284285.327139] [<ffffffffc0385d36>] _xfs_log_force+0x1c6/0x2c0 [xfs]
[284285.327169] [<ffffffff810c6440>] ? wake_up_state+0x20/0x20
[284285.327322] [<ffffffffc0391bbc>] ? xfsaild+0x16c/0x6f0 [xfs]
[284285.327479] [<ffffffffc0385e5c>] xfs_log_force+0x2c/0x70 [xfs]
[284285.327652] [<ffffffffc0391a50>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[284285.327820] [<ffffffffc0391bbc>] xfsaild+0x16c/0x6f0 [xfs]
[284285.327981] [<ffffffffc0391a50>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[284285.327999] [<ffffffff810b252f>] kthread+0xcf/0xe0
[284285.328013] [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
[284285.328027] [<ffffffff816b8798>] ret_from_fork+0x58/0x90
[284285.328044] [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
[284405.330013] INFO: task xfsaild/md127:525 blocked for more than 120 seconds.
[284405.330297] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[284405.330669] xfsaild/md127 D ffff881039386eb0 0 525 2 0x00000000
[284405.330679] Call Trace:
[284405.330697] [<ffffffff8109ae1e>] ? try_to_del_timer_sync+0x5e/0x90
[284405.330710] [<ffffffff816ab6d9>] schedule+0x29/0x70
[284405.330927] [<ffffffffc0385d36>] _xfs_log_force+0x1c6/0x2c0 [xfs]
[284405.330941] [<ffffffff810c6440>] ? wake_up_state+0x20/0x20
[284405.331106] [<ffffffffc0391bbc>] ? xfsaild+0x16c/0x6f0 [xfs]
[284405.331846] [<ffffffffc0385e5c>] xfs_log_force+0x2c/0x70 [xfs]
[284405.332104] [<ffffffffc0391a50>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[284405.332412] [<ffffffffc0391bbc>] xfsaild+0x16c/0x6f0 [xfs]
[284405.332642] [<ffffffffc0391a50>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[284405.332742] [<ffffffff810b252f>] kthread+0xcf/0xe0
[284405.332755] [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
[284405.332770] [<ffffffff816b8798>] ret_from_fork+0x58/0x90
[284405.332782] [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
[284405.332896] INFO: task kworker/12:2:18723 blocked for more than 120 seconds.
[284405.333359] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[284405.333880] kworker/12:2 D ffff881029beaf70 0 18723 2 0x00000000
[284405.334039] Workqueue: xfs-cil/md127 xlog_cil_push_work [xfs]
[284405.334127] Call Trace:
[284405.334143] [<ffffffff816ab6d9>] schedule+0x29/0x70
[284405.334302] [<ffffffffc0384ffd>] xlog_state_get_iclog_space+0x10d/0x320 [xfs]
[284405.334462] [<ffffffff810c6440>] ? wake_up_state+0x20/0x20
[284405.334623] [<ffffffffc0385509>] xlog_write+0x1a9/0x720 [xfs]
[284405.335075] [<ffffffffc0382477>] ? kmem_zone_alloc+0x97/0x130 [xfs]
[284405.335479] [<ffffffffc0387218>] xlog_cil_push+0x2a8/0x430 [xfs]
[284405.335655] [<ffffffffc03873b5>] xlog_cil_push_work+0x15/0x20 [xfs]
[284405.335668] [<ffffffff810aa3ba>] process_one_work+0x17a/0x440
[284405.335682] [<ffffffff810ab1d8>] worker_thread+0x278/0x3c0
[284405.335695] [<ffffffff810aaf60>] ? manage_workers.isra.24+0x2a0/0x2a0
[284405.335707] [<ffffffff810b252f>] kthread+0xcf/0xe0
[284405.335720] [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
[284405.335732] [<ffffffff816b8798>] ret_from_fork+0x58/0x90
[284405.335744] [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
[284405.335755] INFO: task kworker/12:1:20525 blocked for more than 120 seconds.
[284405.336031] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[284405.336324] kworker/12:1 D ffff881029beb774 0 20525 2 0x00000000
[284405.336487] Workqueue: xfs-sync/md127 xfs_log_worker [xfs]
[284405.336493] Call Trace:
[284405.336511] [<ffffffff8150c3a4>] ? md_make_request+0x104/0x290
[284405.336524] [<ffffffff816ab6d9>] schedule+0x29/0x70
[284405.336538] [<ffffffff816a90e9>] schedule_timeout+0x239/0x2c0
[284405.336568] [<ffffffffc000d6d0>] ? ahci_qc_prep+0x150/0x1c0 [libahci]
[284405.336644] [<ffffffffc00ad957>] ? ata_qc_issue+0x177/0x3c0 [libata]
[284405.336661] [<ffffffff8109a5cb>] ? __internal_add_timer+0xab/0x130
[284405.336673] [<ffffffff816aba8d>] wait_for_completion+0xfd/0x140
[284405.336686] [<ffffffff810c6440>] ? wake_up_state+0x20/0x20
[284405.336702] [<ffffffff810ab41d>] flush_work+0xfd/0x190
[284405.336714] [<ffffffff810a7990>] ? move_linked_works+0x90/0x90
[284405.336883] [<ffffffffc0387b3a>] xlog_cil_force_lsn+0x8a/0x210 [xfs]
[284405.336901] [<ffffffff811a3206>] ? next_online_pgdat+0x26/0x60
[284405.337069] [<ffffffffc0385bf5>] _xfs_log_force+0x85/0x2c0 [xfs]
[284405.337092] [<ffffffff810298b1>] ? __switch_to+0x431/0x500
[284405.337255] [<ffffffffc0385ec4>] ? xfs_log_worker+0x24/0x50 [xfs]
[284405.337426] [<ffffffffc0385e5c>] xfs_log_force+0x2c/0x70 [xfs]
[284405.337587] [<ffffffffc0385ec4>] xfs_log_worker+0x24/0x50 [xfs]
[284405.337604] [<ffffffff810aa3ba>] process_one_work+0x17a/0x440
[284405.337617] [<ffffffff810ab086>] worker_thread+0x126/0x3c0
[284405.337631] [<ffffffff810aaf60>] ? manage_workers.isra.24+0x2a0/0x2a0
[284405.337645] [<ffffffff810b252f>] kthread+0xcf/0xe0
[284405.337659] [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
[284405.337675] [<ffffffff816b8798>] ret_from_fork+0x58/0x90
[284405.337687] [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
[284525.338947] INFO: task xfsaild/md127:525 blocked for more than 120 seconds.
[284525.339306] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[284525.339709] xfsaild/md127 D ffff881039386eb0 0 525 2 0x00000000
[284525.339720] Call Trace:
[284525.339739] [<ffffffff8109ae1e>] ? try_to_del_timer_sync+0x5e/0x90
[284525.339753] [<ffffffff816ab6d9>] schedule+0x29/0x70
[284525.339961] [<ffffffffc0385d36>] _xfs_log_force+0x1c6/0x2c0 [xfs]
[284525.339993] [<ffffffff810c6440>] ? wake_up_state+0x20/0x20
[284525.340159] [<ffffffffc0391bbc>] ? xfsaild+0x16c/0x6f0 [xfs]
[284525.340332] [<ffffffffc0385e5c>] xfs_log_force+0x2c/0x70 [xfs]
[284525.340508] [<ffffffffc0391a50>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[284525.340689] [<ffffffffc0391bbc>] xfsaild+0x16c/0x6f0 [xfs]
[284525.340858] [<ffffffffc0391a50>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[284525.340886] [<ffffffff810b252f>] kthread+0xcf/0xe0
[284525.340904] [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
[284525.340920] [<ffffffff816b8798>] ret_from_fork+0x58/0x90
[284525.340937] [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
[284525.340984] INFO: task kworker/12:2:18723 blocked for more than 120 seconds.
[284525.341276] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[284525.341594] kworker/12:2 D ffff881029beaf70 0 18723 2 0x00000000
[284525.341770] Workqueue: xfs-cil/md127 xlog_cil_push_work [xfs]
[284525.341776] Call Trace:
[284525.341791] [<ffffffff816ab6d9>] schedule+0x29/0x70
[284525.341959] [<ffffffffc0384ffd>] xlog_state_get_iclog_space+0x10d/0x320 [xfs]
[284525.341988] [<ffffffff810c6440>] ? wake_up_state+0x20/0x20
[284525.342163] [<ffffffffc0385509>] xlog_write+0x1a9/0x720 [xfs]
[284525.342350] [<ffffffffc0382477>] ? kmem_zone_alloc+0x97/0x130 [xfs]
[284525.342537] [<ffffffffc0387218>] xlog_cil_push+0x2a8/0x430 [xfs]
[284525.342726] [<ffffffffc03873b5>] xlog_cil_push_work+0x15/0x20 [xfs]
[284525.342744] [<ffffffff810aa3ba>] process_one_work+0x17a/0x440
[284525.342759] [<ffffffff810ab1d8>] worker_thread+0x278/0x3c0
[284525.342776] [<ffffffff810aaf60>] ? manage_workers.isra.24+0x2a0/0x2a0
[284525.342789] [<ffffffff810b252f>] kthread+0xcf/0xe0
[284525.342805] [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
[284525.342818] [<ffffffff816b8798>] ret_from_fork+0x58/0x90
[284525.342832] [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
[284525.342861] INFO: task kworker/12:1:20525 blocked for more than 120 seconds.
[284525.343223] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[284525.343546] kworker/12:1 D ffff881029beb774 0 20525 2 0x00000000
[284525.343717] Workqueue: xfs-sync/md127 xfs_log_worker [xfs]
[284525.343722] Call Trace:
[284525.343738] [<ffffffff8150c3a4>] ? md_make_request+0x104/0x290
[284525.343751] [<ffffffff816ab6d9>] schedule+0x29/0x70
[284525.343763] [<ffffffff816a90e9>] schedule_timeout+0x239/0x2c0
[284525.343795] [<ffffffffc000d6d0>] ? ahci_qc_prep+0x150/0x1c0 [libahci]
[284525.343874] [<ffffffffc00ad957>] ? ata_qc_issue+0x177/0x3c0 [libata]
[284525.343901] [<ffffffff8109a5cb>] ? __internal_add_timer+0xab/0x130
[284525.343918] [<ffffffff816aba8d>] wait_for_completion+0xfd/0x140
[284525.343931] [<ffffffff810c6440>] ? wake_up_state+0x20/0x20
[284525.343946] [<ffffffff810ab41d>] flush_work+0xfd/0x190
[284525.343962] [<ffffffff810a7990>] ? move_linked_works+0x90/0x90
[284525.344131] [<ffffffffc0387b3a>] xlog_cil_force_lsn+0x8a/0x210 [xfs]
[284525.344150] [<ffffffff811a3206>] ? next_online_pgdat+0x26/0x60
[284525.344320] [<ffffffffc0385bf5>] _xfs_log_force+0x85/0x2c0 [xfs]
[284525.344335] [<ffffffff810298b1>] ? __switch_to+0x431/0x500
[284525.344505] [<ffffffffc0385ec4>] ? xfs_log_worker+0x24/0x50 [xfs]
[284525.344694] [<ffffffffc0385e5c>] xfs_log_force+0x2c/0x70 [xfs]
[284525.344879] [<ffffffffc0385ec4>] xfs_log_worker+0x24/0x50 [xfs]
[284525.344902] [<ffffffff810aa3ba>] process_one_work+0x17a/0x440
[284525.344920] [<ffffffff810ab086>] worker_thread+0x126/0x3c0
[284525.344935] [<ffffffff810aaf60>] ? manage_workers.isra.24+0x2a0/0x2a0
[284525.344951] [<ffffffff810b252f>] kthread+0xcf/0xe0
[284525.344966] [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
[284525.344979] [<ffffffff816b8798>] ret_from_fork+0x58/0x90
[284525.344996] [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
[284645.346805] INFO: task xfsaild/md127:525 blocked for more than 120 seconds.
[284645.347170] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[284645.347569] xfsaild/md127 D ffff881039386eb0 0 525 2 0x00000000
[284645.347581] Call Trace:
[284645.347599] [<ffffffff8109ae1e>] ? try_to_del_timer_sync+0x5e/0x90
[284645.347613] [<ffffffff816ab6d9>] schedule+0x29/0x70
[284645.347821] [<ffffffffc0385d36>] _xfs_log_force+0x1c6/0x2c0 [xfs]
[284645.347852] [<ffffffff810c6440>] ? wake_up_state+0x20/0x20
[284645.348018] [<ffffffffc0391bbc>] ? xfsaild+0x16c/0x6f0 [xfs]
[284645.348189] [<ffffffffc0385e5c>] xfs_log_force+0x2c/0x70 [xfs]
[284645.348370] [<ffffffffc0391a50>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[284645.348550] [<ffffffffc0391bbc>] xfsaild+0x16c/0x6f0 [xfs]
[284645.348734] [<ffffffffc0391a50>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[284645.348761] [<ffffffff810b252f>] kthread+0xcf/0xe0
[284645.348778] [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
[284645.348794] [<ffffffff816b8798>] ret_from_fork+0x58/0x90
[284645.348807] [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
[284645.348855] INFO: task kworker/12:2:18723 blocked for more than 120 seconds.
[284645.349145] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[284645.349541] kworker/12:2 D ffff881029beaf70 0 18723 2 0x00000000
[284645.349711] Workqueue: xfs-cil/md127 xlog_cil_push_work [xfs]
[284645.349731] Call Trace:
[284645.349751] [<ffffffff816ab6d9>] schedule+0x29/0x70
[284645.349923] [<ffffffffc0384ffd>] xlog_state_get_iclog_space+0x10d/0x320 [xfs]
[284645.349940] [<ffffffff810c6440>] ? wake_up_state+0x20/0x20
[284645.350118] [<ffffffffc0385509>] xlog_write+0x1a9/0x720 [xfs]
[284645.350294] [<ffffffffc0382477>] ? kmem_zone_alloc+0x97/0x130 [xfs]
[284645.350482] [<ffffffffc0387218>] xlog_cil_push+0x2a8/0x430 [xfs]
[284645.350654] [<ffffffffc03873b5>] xlog_cil_push_work+0x15/0x20 [xfs]
[284645.350673] [<ffffffff810aa3ba>] process_one_work+0x17a/0x440
[284645.350687] [<ffffffff810ab1d8>] worker_thread+0x278/0x3c0
[284645.350703] [<ffffffff810aaf60>] ? manage_workers.isra.24+0x2a0/0x2a0
[284645.350730] [<ffffffff810b252f>] kthread+0xcf/0xe0
[284645.350744] [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
[284645.350761] [<ffffffff816b8798>] ret_from_fork+0x58/0x90
[284645.350774] [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
thompsop

2018-01-28 16:05

reporter   ~0031076

Created https://bugzilla.redhat.com/show_bug.cgi?id=1539394
TrevorH

2018-01-28 17:00

manager   ~0031077

Please note that there are possible fixes for this in the latest CentOS 7 kernel. Running rpm -q --changelog kernel-3.10.0-693.17.1.el7.x86_64 shows several possible xfs fixes that might be the solution to this. Please test and provide feedback.
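
For example, to pull out just the xfs-related entries from that changelog (same kernel version as above):

# rpm -q --changelog kernel-3.10.0-693.17.1.el7.x86_64 | grep -i xfs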
thompsop

2018-01-29 14:54

reporter   ~0031084

Copying in a comment from the Red Hat bug. I believe he is specifically referencing gbeeley's steps to reproduce and also joerg's call trace.

--- Comment #2 from Brian Foster <bfoster@redhat.com> ---
The xfsaild traces in the bug referenced above show us waiting on a log force.
That tends to imply we're waiting on I/O. The stacks also show md appears to be
involved. Since you have a reproducer, could you rule out md from the test if
that hasn't been done already?

FWIW, I've been running an fsstress/cpu [on|off]line loop against a dummy
raid1+XFS device on 4.15.0-rc9 for a few minutes and haven't seen any issues so
far. Is 4.14.6 definitely required or has anything newer than that simply not
been checked for the problem?
joerg

2018-01-29 15:20

reporter   ~0031087

I am still on 3.10.0-693.11.6.el7, but we restored our system (KVM hypervisor) with ext4 on the same MD RAID 1 and are now running without issues.

I have a twin system (same hardware, ...) in the same network still running MD RAID 1 with XFS as /, currently without witnessing any problems.
diito

2018-01-29 18:08

reporter   ~0031091

I seem to have this problem with a CentOS 7.4 VM running on a Fedora 27 KVM host. I've switched to the 3.10.0-514.26.2.el7.x86_64 kernel on the VM and that DID NOT resolve this; it only seems to have delayed it a few days at best. The VM itself sits on an XFS partition on an NVMe drive (Samsung 960 PRO M.2), so it's plenty fast enough to keep up with anything. Previously the VM was running on an ext4 partition on an SSD in another Fedora 27 system without issues. Besides that, the previous system was Intel and the new one is AMD. Another CentOS-based VM I migrated that's basically idle all day does NOT have this issue.

Has anyone seen the host impact this in any way? I'm trying to figure out why xfsaild would fail in this manner within the VM on my new host and not my old one, as the VM has not been altered in any way.
thompsop

2018-01-30 02:43

reporter   ~0031100

[94443.457717] [<ffffffff81235765>] do_fsync+0x65/0xa0
[94443.457718] [<ffffffff81235a40>] SyS_fsync+0x10/0x20
[94443.457720] [<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
[94563.461909] INFO: task xfsaild/dm-4:502 blocked for more than 120 seconds.
[94563.463173] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[94563.464414] xfsaild/dm-4 D ffff88020f610fd0 0 502 2 0x00000000
[94563.464416] Call Trace:
[94563.464422] [<ffffffff8109affe>] ? try_to_del_timer_sync+0x5e/0x90
[94563.464425] [<ffffffff816ab8a9>] schedule+0x29/0x70
[94563.464460] [<ffffffffc0464d46>] _xfs_log_force+0x1c6/0x2c0 [xfs]
[94563.464464] [<ffffffff810c6620>] ? wake_up_state+0x20/0x20
[94563.464480] [<ffffffffc0470bcc>] ? xfsaild+0x16c/0x6f0 [xfs]
[94563.464495] [<ffffffffc0464e6c>] xfs_log_force+0x2c/0x70 [xfs]
[94563.464509] [<ffffffffc0470a60>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[94563.464522] [<ffffffffc0470bcc>] xfsaild+0x16c/0x6f0 [xfs]
[94563.464535] [<ffffffffc0470a60>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
[94563.464538] [<ffffffff810b270f>] kthread+0xcf/0xe0
[94563.464540] [<ffffffff810b2640>] ? insert_kthread_work+0x40/0x40
[94563.464543] [<ffffffff816b8798>] ret_from_fork+0x58/0x90
[94563.464544] [<ffffffff810b2640>] ? insert_kthread_work+0x40/0x40
[94563.464550] INFO: task auditd:1002 blocked for more than 120 seconds.
[94563.466275] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[94563.468135] auditd D ffff8802123e6eb0 0 1002 1 0x00000000
[94563.468138] Call Trace:
[94563.468161] [<ffffffffc0461487>] ? kmem_zone_alloc+0x97/0x130 [xfs]
[94563.468165] [<ffffffff816ab8a9>] schedule+0x29/0x70
[94563.468167] [<ffffffff816ad195>] rwsem_down_write_failed+0x225/0x3a0
[94563.468189] [<ffffffffc0461bc4>] ? xlog_grant_head_check+0x54/0x100 [xfs]
[94563.468210] [<ffffffffc0450c9d>] ? xfs_vn_update_time+0xcd/0x150 [xfs]
[94563.468213] [<ffffffff81333fd7>] call_rwsem_down_write_failed+0x17/0x30
[94563.468216] [<ffffffff816aaa8d>] down_write+0x2d/0x3d
[94563.468236] [<ffffffffc04531e1>] xfs_ilock+0xc1/0x120 [xfs]
[94563.468254] [<ffffffffc0450c9d>] xfs_vn_update_time+0xcd/0x150 [xfs]
[94563.468258] [<ffffffff8121efa5>] update_time+0x25/0xd0
[94563.468261] [<ffffffff8121f0f0>] file_update_time+0xa0/0xf0
[94563.468280] [<ffffffffc04473b5>] xfs_file_aio_write_checks+0x185/0x1f0 [xfs]
[94563.468298] [<ffffffffc0447a5a>] xfs_file_buffered_aio_write+0xca/0x2c0 [xfs]
[94563.468316] [<ffffffffc0447ddd>] xfs_file_aio_write+0x18d/0x1b0 [xfs]
[94563.468319] [<ffffffff8120222d>] do_sync_write+0x8d/0xd0
[94563.468323] [<ffffffff81202ced>] vfs_write+0xbd/0x1e0
[94563.468325] [<ffffffff81203aff>] SyS_write+0x7f/0xe0
[94563.468328] [<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
[94563.468330] INFO: task auditd:1003 blocked for more than 120 seconds.
[94563.470184] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[94563.472067] auditd D ffff8802135f0000 0 1003 1 0x00000000
[94563.472086] Call Trace:
[94563.472089] [<ffffffff816ab8a9>] schedule+0x29/0x70
thompsop

2018-01-30 02:44

reporter   ~0031101

After one day of uptime on 3.10.0-693.17.1.el7.x86_64 the problem recurred. No md. I was unable to reproduce it using the CPU offline recipe, unless it took effect some hours after the experiment.
thompsop

2018-01-30 13:54

reporter   ~0031109

From the Red Hat bug, requesting a sosreport, blocked task output, and kdump:

Could you please describe the storage (devices, I/O schedulers, etc.), filesystem (xfs_info) and workload involved? Note that much of the former would be included in a sosreport.

Also, could you provide the complete blocked task output while in this state ('echo w > /proc/sysrq-trigger')? FWIW, a kdump wouldn't hurt as well if kdump is configured.
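
A minimal sketch of capturing that blocked task output while the box is wedged (assumes root and that a console or SSH session is still usable):

# echo 1 > /proc/sys/kernel/sysrq     # enable all sysrq functions if not already enabled
# echo w > /proc/sysrq-trigger        # dump blocked (uninterruptible) tasks to the kernel log
# dmesg > blocked-tasks.txt           # collect the dump from the ring buffer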
thompsop

2018-01-30 15:42

reporter   ~0031110

Bam, bug closed:
--- Comment #13 from Brian Foster <bfoster@redhat.com> ---
All signs point to this as a duplicate of bug 1493585. Feel free to reopen if a
hang variant is reproduced that does not involve md.

*** This bug has been marked as a duplicate of bug 1493585 ***
TrevorH

2018-01-30 16:15

manager   ~0031111

Which is yet another invisible RHEL bug :-(

Does anyone have the ability to read that and add email addresses to the cc list?
thompsop

2018-01-30 16:36

reporter   ~0031113

I asked for exactly that - no reply to date
diito

2018-01-30 18:06

reporter   ~0031119

> Feel free to reopen if a hang variant is reproduced that does not involve md.

My issue does not involve md. It also happens on SSDs.
thompsop

2018-01-30 19:55

reporter   ~0031121

"Bug 1493585 is in the verified state and has already been fixed in
MD in kernel-3.10.0-820.el7."
joerg

2018-01-30 19:55

reporter   ~0031122

Had this output left over from my January hang-up:
# ps aux | grep D
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 124 0.0 0.0 0 0 ? DN Jan09 0:02 [khugepaged]
root 538 0.0 0.0 0 0 ? D Jan09 2:11 [xfsaild/md127]
root 723 0.0 0.0 128428 1492 ? Ds Jan13 0:00 /bin/systemctl status snmpd.service
root 842 0.0 0.0 55452 1024 ? D<sl Jan09 0:08 /sbin/auditd
root 936 0.0 0.0 201248 1112 ? Ssl Jan09 0:00 /usr/sbin/gssproxy -D
root 1358 0.0 0.0 150220 3412 ? Dsl Jan09 3:57 /usr/bin/monit -I
root 1375 0.0 0.0 105996 3728 ? Ss Jan09 0:05 /usr/sbin/sshd -D
root 1852 0.0 0.0 178116 2508 ? S Jan16 0:00 /usr/sbin/CROND -n
root 1864 0.0 0.0 89616 3912 ? S Jan16 0:00 /usr/sbin/sendmail -FCronDaemon -i -odi -oem -oi -t -f root
root 1879 0.0 0.0 89584 3908 ? D Jan16 0:00 /usr/sbin/postdrop -r
thompsop

2018-01-30 20:07

reporter   ~0031123

Regarding diito's items, I would suggest appending the sosreport, blocked task output, and kdump to this ticket, and once they are here I will reopen the Red Hat ticket. Otherwise you could jump through the hoops of getting a Red Hat Bugzilla login and append those items there. Or wait until -820.

I was not completely certain I was correct when I told Red Hat that MD was involved in my hangs. I am not explicitly using the MD RAID tools, but in the blocked task report mdX_raid1 showed up with a surprisingly high wait time for a module I did not believe I was using. If the -820 kernel did not fix me I was going to reopen the ticket.
slayerduck

2018-01-31 12:05

reporter   ~0031129

I'm experiencing the same problems: my entire array got stuck and I had to reset the server. I was on 3.10.0-693.11.6.el7.x86_64 and have now upgraded to 3.10.0-693.17.1.el7.x86_64, but thompsop said it recurred for him, so I don't doubt it's going to happen again for me.

Using an array of SSDs with mdadm.

Jan 31 06:36:01 vita kernel: INFO: task xfsaild/md0:1190 blocked for more than 120 seconds.
Jan 31 06:36:01 vita kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 31 06:36:01 vita kernel: xfsaild/md0 D ffff88203d901fa0 0 1190 2 0x00000000
Jan 31 06:36:01 vita kernel: Call Trace:
Jan 31 06:36:01 vita kernel: [<ffffffff816ab6d9>] schedule+0x29/0x70
Jan 31 06:36:01 vita kernel: [<ffffffffc02c7d36>] _xfs_log_force+0x1c6/0x2c0 [xfs]
Jan 31 06:36:01 vita kernel: [<ffffffff810c6440>] ? wake_up_state+0x20/0x20
Jan 31 06:36:01 vita kernel: [<ffffffffc02d3bbc>] ? xfsaild+0x16c/0x6f0 [xfs]
Jan 31 06:36:01 vita kernel: [<ffffffffc02c7e5c>] xfs_log_force+0x2c/0x70 [xfs]
Jan 31 06:36:01 vita kernel: [<ffffffffc02d3a50>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
Jan 31 06:36:01 vita kernel: [<ffffffffc02d3bbc>] xfsaild+0x16c/0x6f0 [xfs]
Jan 31 06:36:01 vita kernel: [<ffffffffc02d3a50>] ? xfs_trans_ail_cursor_first+0x90/0x90 [xfs]
Jan 31 06:36:01 vita kernel: [<ffffffff810b252f>] kthread+0xcf/0xe0
Jan 31 06:36:01 vita kernel: [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
Jan 31 06:36:01 vita kernel: [<ffffffff816b8798>] ret_from_fork+0x58/0x90
Jan 31 06:36:01 vita kernel: [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
slayerduck

2018-01-31 12:26

reporter   ~0031130

Sorry for the double post, but is there an ETA for the release of 3.10.0-820.el7, or somewhere I can grab it?
justin2net

2018-02-23 02:44

reporter   ~0031294

I'm seeing this on 3.10.0-514.26.2 as well, like @diito. The hypervisor is FC27 and the VM has an XFS system partition on a qemu image that is sitting on an NVMe SSD.
justin2net

2018-02-23 02:45

reporter   ~0031295

(I should add that I also tried kernel -830 and still experienced the same issue.)
alphacc

2018-02-23 10:06

developer   ~0031299

@thompsop

We have a similar issue too. We don't have a reproducer yet.

I am trying -830 too, but there is no mention of 1493585 in the changelog. Any other pointer, as the bugzilla is private?

$ rpm -qp --changelog kernel-3.10.0-830.el7.x86_64.rpm | grep 1493585
thompsop

2018-02-27 12:49

reporter   ~0031334

Have not tried the -830 kernel yet myself; still using the latest debug kernel. I'd imagine they would need this from the -830 version:
sosreport, blocked task output, kdump:

Redhat: Could you please describe the storage (devices, I/O schedulers, etc.), filesystem (xfs_info) and workload involved? Note that much of the former would be included in a sosreport.

Redhat: Also, could you provide the complete blocked task output while in this state ('echo w > /proc/sysrq-trigger')? FWIW, a kdump wouldn't hurt as well if kdump is configured.
justin2net

2018-03-06 00:38

reporter   ~0031360

@thompsop

Thanks for your reply. Could you request that Red Hat make their Bugzilla thread public? Please let them know there are other users who are experiencing this (link them to here).

The system is completely hung when in this state, so only a hard reset brings it back to life. In other words, I can't run 'echo w > /proc/sysrq-trigger' as you suggest when it is hung...

I changed the hypervisor kernel to the latest in the 4.16 rc series, which seems to have delayed the "hanging" until after a few days of uptime.

@diito

Are you still experiencing this? Can you provide some more information about the system? Hypervisor, hypervisor kernel, QEMU image format, etc... Thanks.
thompsop

2018-03-06 02:00

reporter   ~0031361

@justin2net
I mentioned to Red Hat that a public bug would be a great benefit. They promptly answered that the bug was likely logged originally by a big, widely recognized company with a fancy paid support contract, one that does not want possibly proprietary information contained in the non-confidential parts of the bug to be exposed to the whole world.

That said, I am waiting for the CentOS build of the supposedly fixed version to drop so I can test it, and if I find it wanting I will report that back to the Red Hat bug. I am not certain what the delay is on a CentOS 7 -830 kernel.

Since 2017 I have been running the debug variant of whatever the current kernel is, which you can choose to install manually; it has extra code compiled into the image that is supposed to help identify problems like this deadlock. In this case, though, the debug code seems to mask the problem, so my machine essentially never crashes as long as the debug kernel is running, which is an acceptable workaround for my purposes. If I go back to the normal kernel I will crash within hours. I'm not clear whether anyone other than me has tried this, but for me it has taken the urgency off the resolution of this bug.
justin2net

2018-03-06 06:21

reporter   ~0031363

@thompsop

Thanks for the info! I have installed 3.10.0-693.17.1.el7.x86_64.debug. Let's hope it masks the problem like in your case. Looking at https://access.redhat.com/solutions/44983, I don't know which option would mask it.

I'm actually on RHEL 7 (self-support dev license) and -830 is available there, so I have tried it, but with no resolution of the hanging...
alphacc

2018-03-06 10:29

developer   ~0031365

@justin2net can you get a reproducer? Do you manage to trigger it?
I am trying to get a reproducer but no luck yet. It can take several days to show up.
thompsop

2018-03-06 17:45

reporter   ~0031370

I updated notes in the Redhat bug about -830 not helping and also seeming to miss the relevant fix.

@justin2net
If you have a Redhat dev license you should able to get a login to bugzilla to have more access to some of the details of the case there. It's a different login than the access.redhat.com site apparently but still should be accessible to you.
toracat

2018-03-06 17:52

manager   ~0031371

Most kernel-related bugs get marked 'private' automatically. Only those in the CC list can access them. If you are the submitter, you should be able to add email addresses to the list for those who want to have access.
justin2net

2018-03-08 06:37

reporter   ~0031378

@alphacc

No, I have not been able to come up with a recipe that reproduces the bug. It almost invariably occurs after a long period of idle, though.
thompsop

2018-03-11 00:18

reporter   ~0031407

'echo w > /proc/sysrq-trigger' output from 3.10.0-693.21.1.el7.x86_64
vmcore from same
thompsop

2018-03-11 14:14

reporter   ~0031408

Transcribed from the other bug:

--- Comment #19 from Eric Sandeen <esandeen@redhat.com> ---

-830 is not missing the relevant fix. The patchset fixed several bugs and not
all were listed in the changelog, but -830 does indeed have the md patchset.

The report in the centos bug which says "830 didn't fix it for me" does not
contain nearly enough information to know whether it is the same underlying
problem; that reporter provided no dmesg, traces, logs etc to know what they
hit.

When you have a chance to try the -830 kernel for yourself, with md in the mix,
feel free to re-open this bug if it still fails for you.
alphacc

2018-03-12 09:56

developer   ~0031410

I have a support contract, so I opened a Red Hat case with all the details.
I'll update this bug with info I can share when I have updates from Red Hat.

As thompsop said, the above-mentioned patches are part of -830.
We are still trying to get a reproducer.
thompsop

2018-03-12 16:37

reporter   ~0031417

I fumbled around for a while on the Red Hat site and was not able to locate an -830 kernel accessible to me, although I do have the free developer login to redhat.com. If someone can point me to where I can get the bits and pieces for a Red Hat x86_64 -830 kernel as .rpms, and it is as easy as rpm-installing it on my CentOS 7 system, I have no objection to trying to run the kernel on my machine.
justin2net

2018-03-14 21:57

reporter   ~0031443

@thompsop

Did you mean to attach the output of 'echo w > /proc/sysrq-trigger' and the vmcore in your previous message?

I've been running 3.10.0-693.17.1.el7.x86_64.debug for the past 8 days without any hanging...yet. Thanks for the recommendation.

@alphacc

Please keep us up to date with the RH Bugzilla you opened. If you find a reproducer, I can try it on my system with a variety of kernel versions. Thanks.
sles

2018-04-04 04:51

reporter   ~0031557

"Bug 1493585 is in the verified state and has already been fixed in
MD in kernel-3.10.0-820.el7."

This bug is not in md; I have this problem over a VDO device.
diito

2018-04-04 21:37

reporter   ~0031563

This issue exists for me on a non-mdraid SSD volume. I've not found any of the CentOS kernel versions, new or old, to be helpful in mitigating it to date; they all crash after a few days at most. If I install the kernel-ml kernel things are far more stable, and I've gone as long as a month without issue, but it still will eventually crash with the same error too. This only seems to affect my CentOS 7 VMs running on a Fedora 27 host.
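
In case it helps anyone else trying the same workaround: kernel-ml comes from the ELRepo kernel repository, and installing it looks roughly like this (a sketch; check elrepo.org for the current elrepo-release package URL, the one below is an assumption):

# yum install https://www.elrepo.org/elrepo-release-7.el7.elrepo.noarch.rpm
# yum --enablerepo=elrepo-kernel install kernel-ml
# grub2-set-default 0    # if GRUB_DEFAULT=saved, boot the newest entry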
slayerduck

2018-04-25 12:23

reporter   ~0031677

Just got a brand new Supermicro server with Xeon Silver 4114 CPUs, and I still have the exact same issue, now confirmed over 3 different generations of servers. All of my servers run SSDs in mdadm arrays. For my latest server I went to the CentOS vault and downloaded 3.10.0-514.26.2.el7.x86_64, and that seems to fix it. Attached are logs from my latest server. I haven't tested the debug kernel yet; have you crashed at all since using it, @justin2net?
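
For anyone else wanting to pin that older kernel the same way, pulling it from the vault looks roughly like this (a sketch; the exact vault path is an assumption and may differ by point release, so verify it on vault.centos.org first):

# yum install http://vault.centos.org/7.3.1611/updates/x86_64/Packages/kernel-3.10.0-514.26.2.el7.x86_64.rpm
# grub2-set-default 'CentOS Linux (3.10.0-514.26.2.el7.x86_64) 7 (Core)'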

yuuno.txt (26,373 bytes)
Apr 25 01:01:47 yuuno kernel: INFO: task kworker/u898:2:16685 blocked for more than 120 seconds.
Apr 25 01:01:47 yuuno kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 25 01:01:47 yuuno kernel: kworker/u898:2  D ffff88203a9f0000     0 16685      2 0x00000080
Apr 25 01:01:47 yuuno kernel: Workqueue: writeback bdi_writeback_workfn (flush-9:0)
Apr 25 01:01:47 yuuno kernel: Call Trace:
Apr 25 01:01:47 yuuno kernel: [<ffffffff816ab8a9>] schedule+0x29/0x70
Apr 25 01:01:47 yuuno kernel: [<ffffffff8151df05>] bitmap_startwrite+0x1f5/0x210
Apr 25 01:01:47 yuuno kernel: [<ffffffff810b3690>] ? wake_up_atomic_t+0x30/0x30
Apr 25 01:01:47 yuuno kernel: [<ffffffffc05482c1>] add_stripe_bio+0x3f1/0x770 [raid456]
Apr 25 01:01:47 yuuno kernel: [<ffffffffc054c701>] raid5_make_request+0x211/0xce0 [raid456]
Apr 25 01:01:47 yuuno kernel: [<ffffffff810b3690>] ? wake_up_atomic_t+0x30/0x30
Apr 25 01:01:47 yuuno kernel: [<ffffffffc0249487>] ? kmem_zone_alloc+0x97/0x130 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffff8150c644>] md_make_request+0x104/0x290
Apr 25 01:01:47 yuuno kernel: [<ffffffff812fb015>] generic_make_request+0x105/0x310
Apr 25 01:01:47 yuuno kernel: [<ffffffff812fb290>] submit_bio+0x70/0x150
Apr 25 01:01:47 yuuno kernel: [<ffffffffc0222661>] xfs_submit_ioend.isra.16+0x61/0xe0 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffffc0222821>] xfs_vm_writepages+0xd1/0xe0 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffff81190efe>] do_writepages+0x1e/0x40
Apr 25 01:01:47 yuuno kernel: [<ffffffff8122fa80>] __writeback_single_inode+0x40/0x220
Apr 25 01:01:47 yuuno kernel: [<ffffffff812306c4>] writeback_sb_inodes+0x1c4/0x490
Apr 25 01:01:47 yuuno kernel: [<ffffffff81230a2f>] __writeback_inodes_wb+0x9f/0xd0
Apr 25 01:01:47 yuuno kernel: [<ffffffff81231263>] wb_writeback+0x263/0x2f0
Apr 25 01:01:47 yuuno kernel: [<ffffffff8123170b>] bdi_writeback_workfn+0x2cb/0x460
Apr 25 01:01:47 yuuno kernel: [<ffffffff810aa59a>] process_one_work+0x17a/0x440
Apr 25 01:01:47 yuuno kernel: [<ffffffff810ab266>] worker_thread+0x126/0x3c0
Apr 25 01:01:47 yuuno kernel: [<ffffffff810ab140>] ? manage_workers.isra.24+0x2a0/0x2a0
Apr 25 01:01:47 yuuno kernel: [<ffffffff810b270f>] kthread+0xcf/0xe0
Apr 25 01:01:47 yuuno kernel: [<ffffffff810b2640>] ? insert_kthread_work+0x40/0x40
Apr 25 01:01:47 yuuno kernel: [<ffffffff816b8798>] ret_from_fork+0x58/0x90
Apr 25 01:01:47 yuuno kernel: [<ffffffff810b2640>] ? insert_kthread_work+0x40/0x40
Apr 25 01:01:47 yuuno kernel: INFO: task rsync:16810 blocked for more than 120 seconds.
Apr 25 01:01:47 yuuno kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 25 01:01:47 yuuno kernel: rsync           D ffff881fd135dee0     0 16810  16807 0x00000080
Apr 25 01:01:47 yuuno kernel: Call Trace:
Apr 25 01:01:47 yuuno kernel: [<ffffffffc0249487>] ? kmem_zone_alloc+0x97/0x130 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffff816ab8a9>] schedule+0x29/0x70
Apr 25 01:01:47 yuuno kernel: [<ffffffff816a92b9>] schedule_timeout+0x239/0x2c0
Apr 25 01:01:47 yuuno kernel: [<ffffffffc02599f8>] ? _xfs_trans_bjoin+0x48/0x60 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffffc021c4a7>] ? xfs_iext_bno_to_ext+0xa7/0x1a0 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffff816aab47>] __down_common+0xaa/0x104
Apr 25 01:01:47 yuuno kernel: [<ffffffffc02294a0>] ? _xfs_buf_find+0x170/0x330 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffff816aabbe>] __down+0x1d/0x1f
Apr 25 01:01:47 yuuno kernel: [<ffffffff810b8411>] down+0x41/0x50
Apr 25 01:01:47 yuuno kernel: [<ffffffffc022929c>] xfs_buf_lock+0x3c/0xd0 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffffc02294a0>] _xfs_buf_find+0x170/0x330 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffffc022968a>] xfs_buf_get_map+0x2a/0x240 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffffc022a230>] xfs_buf_read_map+0x30/0x160 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffffc0259ef1>] xfs_trans_read_buf_map+0x211/0x400 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffffc0208b74>] xfs_da_read_buf+0xd4/0x100 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffffc020e486>] xfs_dir3_data_read+0x26/0x60 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffffc02130e4>] xfs_dir2_node_addname_int+0x454/0x820 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffffc020a096>] ? xfs_da3_node_lookup_int+0x2c6/0x2f0 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffffc0214774>] xfs_dir2_node_addname+0xd4/0x160 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffffc020bf55>] xfs_dir_createname+0x1d5/0x1f0 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffffc023d4fd>] xfs_create+0x58d/0x6c0 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffffc023a25b>] xfs_vn_mknod+0xbb/0x240 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffffc023a413>] xfs_vn_create+0x13/0x20 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffff8120f6cd>] vfs_create+0xcd/0x130
Apr 25 01:01:47 yuuno kernel: [<ffffffff8121285a>] do_last+0x10ea/0x12c0
Apr 25 01:01:47 yuuno kernel: [<ffffffff81212af2>] path_openat+0xc2/0x490
Apr 25 01:01:47 yuuno kernel: [<ffffffff8121508b>] do_filp_open+0x4b/0xb0
Apr 25 01:01:47 yuuno kernel: [<ffffffff8122233a>] ? __alloc_fd+0x8a/0x130
Apr 25 01:01:47 yuuno kernel: [<ffffffff81201bc3>] do_sys_open+0xf3/0x1f0
Apr 25 01:01:47 yuuno kernel: [<ffffffff81201cde>] SyS_open+0x1e/0x20
Apr 25 01:01:47 yuuno kernel: [<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
Apr 25 01:01:47 yuuno kernel: INFO: task rsync:16937 blocked for more than 120 seconds.
Apr 25 01:01:47 yuuno kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 25 01:01:47 yuuno kernel: rsync           D ffff88103a201fa0     0 16937  16934 0x00000080
Apr 25 01:01:47 yuuno kernel: Call Trace:
Apr 25 01:01:47 yuuno kernel: [<ffffffff816ab8a9>] schedule+0x29/0x70
Apr 25 01:01:47 yuuno kernel: [<ffffffff816a92b9>] schedule_timeout+0x239/0x2c0
Apr 25 01:01:47 yuuno kernel: [<ffffffffc021c4a7>] ? xfs_iext_bno_to_ext+0xa7/0x1a0 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffff816aab47>] __down_common+0xaa/0x104
Apr 25 01:01:47 yuuno kernel: [<ffffffffc02294a0>] ? _xfs_buf_find+0x170/0x330 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffff816aabbe>] __down+0x1d/0x1f
Apr 25 01:01:47 yuuno kernel: [<ffffffff810b8411>] down+0x41/0x50
Apr 25 01:01:47 yuuno kernel: [<ffffffffc022929c>] xfs_buf_lock+0x3c/0xd0 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffffc02294a0>] _xfs_buf_find+0x170/0x330 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffffc022968a>] xfs_buf_get_map+0x2a/0x240 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffffc022a230>] xfs_buf_read_map+0x30/0x160 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffffc0259ef1>] xfs_trans_read_buf_map+0x211/0x400 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffffc0208b74>] xfs_da_read_buf+0xd4/0x100 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffffc0208bc3>] xfs_da3_node_read+0x23/0xd0 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffffc0209e3e>] xfs_da3_node_lookup_int+0x6e/0x2f0 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffffc02146ea>] xfs_dir2_node_addname+0x4a/0x160 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffffc020bf55>] xfs_dir_createname+0x1d5/0x1f0 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffffc023eff7>] xfs_rename+0x6f7/0x900 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffff8120e448>] ? inode_permission+0x18/0x50
Apr 25 01:01:47 yuuno kernel: [<ffffffff81210311>] ? link_path_walk+0x81/0x8b0
Apr 25 01:01:47 yuuno kernel: [<ffffffffc0238ea0>] xfs_vn_rename+0xb0/0xf0 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffff81213890>] vfs_rename+0x5a0/0x930
Apr 25 01:01:47 yuuno kernel: [<ffffffffc0238df0>] ? xfs_vn_follow_link+0xa0/0xa0 [xfs]
Apr 25 01:01:47 yuuno kernel: [<ffffffff81214b93>] SYSC_renameat2+0x503/0x5a0
Apr 25 01:01:47 yuuno kernel: [<ffffffff81205b91>] ? __sb_end_write+0x31/0x60
Apr 25 01:01:47 yuuno kernel: [<ffffffff812230df>] ? mnt_drop_write+0x1f/0x30
Apr 25 01:01:47 yuuno kernel: [<ffffffff812249f4>] ? mntput+0x24/0x40
Apr 25 01:01:47 yuuno kernel: [<ffffffff8120d57e>] ? path_put+0x1e/0x30
Apr 25 01:01:47 yuuno kernel: [<ffffffff81215a7e>] SyS_renameat2+0xe/0x10
Apr 25 01:01:47 yuuno kernel: [<ffffffff81215abe>] SyS_rename+0x1e/0x20
Apr 25 01:01:47 yuuno kernel: [<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
Apr 25 01:03:47 yuuno kernel: INFO: task kworker/u898:2:16685 blocked for more than 120 seconds.
Apr 25 01:03:47 yuuno kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 25 01:03:47 yuuno kernel: kworker/u898:2  D ffff88203a9f0000     0 16685      2 0x00000080
Apr 25 01:03:47 yuuno kernel: Workqueue: writeback bdi_writeback_workfn (flush-9:0)
Apr 25 01:03:47 yuuno kernel: Call Trace:
Apr 25 01:03:47 yuuno kernel: [<ffffffff816ab8a9>] schedule+0x29/0x70
Apr 25 01:03:47 yuuno kernel: [<ffffffff8151df05>] bitmap_startwrite+0x1f5/0x210
Apr 25 01:03:47 yuuno kernel: [<ffffffff810b3690>] ? wake_up_atomic_t+0x30/0x30
Apr 25 01:03:47 yuuno kernel: [<ffffffffc05482c1>] add_stripe_bio+0x3f1/0x770 [raid456]
Apr 25 01:03:47 yuuno kernel: [<ffffffffc054c701>] raid5_make_request+0x211/0xce0 [raid456]
Apr 25 01:03:47 yuuno kernel: [<ffffffff810b3690>] ? wake_up_atomic_t+0x30/0x30
Apr 25 01:03:47 yuuno kernel: [<ffffffffc0249487>] ? kmem_zone_alloc+0x97/0x130 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffff8150c644>] md_make_request+0x104/0x290
Apr 25 01:03:47 yuuno kernel: [<ffffffff812fb015>] generic_make_request+0x105/0x310
Apr 25 01:03:47 yuuno kernel: [<ffffffff812fb290>] submit_bio+0x70/0x150
Apr 25 01:03:47 yuuno kernel: [<ffffffffc0222661>] xfs_submit_ioend.isra.16+0x61/0xe0 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffffc0222821>] xfs_vm_writepages+0xd1/0xe0 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffff81190efe>] do_writepages+0x1e/0x40
Apr 25 01:03:47 yuuno kernel: [<ffffffff8122fa80>] __writeback_single_inode+0x40/0x220
Apr 25 01:03:47 yuuno kernel: [<ffffffff812306c4>] writeback_sb_inodes+0x1c4/0x490
Apr 25 01:03:47 yuuno kernel: [<ffffffff81230a2f>] __writeback_inodes_wb+0x9f/0xd0
Apr 25 01:03:47 yuuno kernel: [<ffffffff81231263>] wb_writeback+0x263/0x2f0
Apr 25 01:03:47 yuuno kernel: [<ffffffff8123170b>] bdi_writeback_workfn+0x2cb/0x460
Apr 25 01:03:47 yuuno kernel: [<ffffffff810aa59a>] process_one_work+0x17a/0x440
Apr 25 01:03:47 yuuno kernel: [<ffffffff810ab266>] worker_thread+0x126/0x3c0
Apr 25 01:03:47 yuuno kernel: [<ffffffff810ab140>] ? manage_workers.isra.24+0x2a0/0x2a0
Apr 25 01:03:47 yuuno kernel: [<ffffffff810b270f>] kthread+0xcf/0xe0
Apr 25 01:03:47 yuuno kernel: [<ffffffff810b2640>] ? insert_kthread_work+0x40/0x40
Apr 25 01:03:47 yuuno kernel: [<ffffffff816b8798>] ret_from_fork+0x58/0x90
Apr 25 01:03:47 yuuno kernel: [<ffffffff810b2640>] ? insert_kthread_work+0x40/0x40
Apr 25 01:03:47 yuuno kernel: INFO: task rsync:16810 blocked for more than 120 seconds.
Apr 25 01:03:47 yuuno kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 25 01:03:47 yuuno kernel: rsync           D ffff881fd135dee0     0 16810  16807 0x00000080
Apr 25 01:03:47 yuuno kernel: Call Trace:
Apr 25 01:03:47 yuuno kernel: [<ffffffffc0249487>] ? kmem_zone_alloc+0x97/0x130 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffff816ab8a9>] schedule+0x29/0x70
Apr 25 01:03:47 yuuno kernel: [<ffffffff816a92b9>] schedule_timeout+0x239/0x2c0
Apr 25 01:03:47 yuuno kernel: [<ffffffffc02599f8>] ? _xfs_trans_bjoin+0x48/0x60 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffffc021c4a7>] ? xfs_iext_bno_to_ext+0xa7/0x1a0 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffff816aab47>] __down_common+0xaa/0x104
Apr 25 01:03:47 yuuno kernel: [<ffffffffc02294a0>] ? _xfs_buf_find+0x170/0x330 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffff816aabbe>] __down+0x1d/0x1f
Apr 25 01:03:47 yuuno kernel: [<ffffffff810b8411>] down+0x41/0x50
Apr 25 01:03:47 yuuno kernel: [<ffffffffc022929c>] xfs_buf_lock+0x3c/0xd0 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffffc02294a0>] _xfs_buf_find+0x170/0x330 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffffc022968a>] xfs_buf_get_map+0x2a/0x240 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffffc022a230>] xfs_buf_read_map+0x30/0x160 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffffc0259ef1>] xfs_trans_read_buf_map+0x211/0x400 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffffc0208b74>] xfs_da_read_buf+0xd4/0x100 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffffc020e486>] xfs_dir3_data_read+0x26/0x60 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffffc02130e4>] xfs_dir2_node_addname_int+0x454/0x820 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffffc020a096>] ? xfs_da3_node_lookup_int+0x2c6/0x2f0 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffffc0214774>] xfs_dir2_node_addname+0xd4/0x160 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffffc020bf55>] xfs_dir_createname+0x1d5/0x1f0 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffffc023d4fd>] xfs_create+0x58d/0x6c0 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffffc023a25b>] xfs_vn_mknod+0xbb/0x240 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffffc023a413>] xfs_vn_create+0x13/0x20 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffff8120f6cd>] vfs_create+0xcd/0x130
Apr 25 01:03:47 yuuno kernel: [<ffffffff8121285a>] do_last+0x10ea/0x12c0
Apr 25 01:03:47 yuuno kernel: [<ffffffff81212af2>] path_openat+0xc2/0x490
Apr 25 01:03:47 yuuno kernel: [<ffffffff8121508b>] do_filp_open+0x4b/0xb0
Apr 25 01:03:47 yuuno kernel: [<ffffffff8122233a>] ? __alloc_fd+0x8a/0x130
Apr 25 01:03:47 yuuno kernel: [<ffffffff81201bc3>] do_sys_open+0xf3/0x1f0
Apr 25 01:03:47 yuuno kernel: [<ffffffff81201cde>] SyS_open+0x1e/0x20
Apr 25 01:03:47 yuuno kernel: [<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
Apr 25 01:03:47 yuuno kernel: INFO: task rsync:16937 blocked for more than 120 seconds.
Apr 25 01:03:47 yuuno kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 25 01:03:47 yuuno kernel: rsync           D ffff88103a201fa0     0 16937  16934 0x00000080
Apr 25 01:03:47 yuuno kernel: Call Trace:
Apr 25 01:03:47 yuuno kernel: [<ffffffff816ab8a9>] schedule+0x29/0x70
Apr 25 01:03:47 yuuno kernel: [<ffffffff816a92b9>] schedule_timeout+0x239/0x2c0
Apr 25 01:03:47 yuuno kernel: [<ffffffffc021c4a7>] ? xfs_iext_bno_to_ext+0xa7/0x1a0 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffff816aab47>] __down_common+0xaa/0x104
Apr 25 01:03:47 yuuno kernel: [<ffffffffc02294a0>] ? _xfs_buf_find+0x170/0x330 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffff816aabbe>] __down+0x1d/0x1f
Apr 25 01:03:47 yuuno kernel: [<ffffffff810b8411>] down+0x41/0x50
Apr 25 01:03:47 yuuno kernel: [<ffffffffc022929c>] xfs_buf_lock+0x3c/0xd0 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffffc02294a0>] _xfs_buf_find+0x170/0x330 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffffc022968a>] xfs_buf_get_map+0x2a/0x240 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffffc022a230>] xfs_buf_read_map+0x30/0x160 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffffc0259ef1>] xfs_trans_read_buf_map+0x211/0x400 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffffc0208b74>] xfs_da_read_buf+0xd4/0x100 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffffc0208bc3>] xfs_da3_node_read+0x23/0xd0 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffffc0209e3e>] xfs_da3_node_lookup_int+0x6e/0x2f0 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffffc02146ea>] xfs_dir2_node_addname+0x4a/0x160 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffffc020bf55>] xfs_dir_createname+0x1d5/0x1f0 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffffc023eff7>] xfs_rename+0x6f7/0x900 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffff8120e448>] ? inode_permission+0x18/0x50
Apr 25 01:03:47 yuuno kernel: [<ffffffff81210311>] ? link_path_walk+0x81/0x8b0
Apr 25 01:03:47 yuuno kernel: [<ffffffffc0238ea0>] xfs_vn_rename+0xb0/0xf0 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffff81213890>] vfs_rename+0x5a0/0x930
Apr 25 01:03:47 yuuno kernel: [<ffffffffc0238df0>] ? xfs_vn_follow_link+0xa0/0xa0 [xfs]
Apr 25 01:03:47 yuuno kernel: [<ffffffff81214b93>] SYSC_renameat2+0x503/0x5a0
Apr 25 01:03:47 yuuno kernel: [<ffffffff81205b91>] ? __sb_end_write+0x31/0x60
Apr 25 01:03:47 yuuno kernel: [<ffffffff812230df>] ? mnt_drop_write+0x1f/0x30
Apr 25 01:03:47 yuuno kernel: [<ffffffff812249f4>] ? mntput+0x24/0x40
Apr 25 01:03:47 yuuno kernel: [<ffffffff8120d57e>] ? path_put+0x1e/0x30
Apr 25 01:03:47 yuuno kernel: [<ffffffff81215a7e>] SyS_renameat2+0xe/0x10
Apr 25 01:03:47 yuuno kernel: [<ffffffff81215abe>] SyS_rename+0x1e/0x20
Apr 25 01:03:47 yuuno kernel: [<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
Apr 25 01:03:47 yuuno kernel: INFO: task xfs_db:17136 blocked for more than 120 seconds.
Apr 25 01:03:47 yuuno kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 25 01:03:47 yuuno kernel: xfs_db          D ffff88101aa65ee0     0 17136  17135 0x00000080
Apr 25 01:03:47 yuuno kernel: Call Trace:
Apr 25 01:03:47 yuuno kernel: [<ffffffff816ab8a9>] schedule+0x29/0x70
Apr 25 01:03:47 yuuno kernel: [<ffffffff816a92b9>] schedule_timeout+0x239/0x2c0
Apr 25 01:03:47 yuuno kernel: [<ffffffff810a847c>] ? __queue_work+0x13c/0x3c0
Apr 25 01:03:47 yuuno kernel: [<ffffffff810a87ca>] ? __queue_delayed_work+0xaa/0x1a0
Apr 25 01:03:47 yuuno kernel: [<ffffffff810a8b31>] ? try_to_grab_pending+0xb1/0x160
Apr 25 01:03:47 yuuno kernel: [<ffffffff816abc5d>] wait_for_completion+0xfd/0x140
Apr 25 01:03:47 yuuno kernel: [<ffffffff810c6620>] ? wake_up_state+0x20/0x20
Apr 25 01:03:47 yuuno kernel: [<ffffffff8122fdf0>] writeback_inodes_sb_nr+0x90/0xd0
Apr 25 01:03:47 yuuno kernel: [<ffffffff8122fe55>] writeback_inodes_sb+0x25/0x30
Apr 25 01:03:47 yuuno kernel: [<ffffffff81235805>] sync_filesystem+0x35/0xb0
Apr 25 01:03:47 yuuno kernel: [<ffffffff812407f4>] fsync_bdev+0x24/0x60
Apr 25 01:03:47 yuuno kernel: [<ffffffff8130aa58>] blkdev_ioctl+0x6a8/0x980
Apr 25 01:03:47 yuuno kernel: [<ffffffff81187357>] ? mempool_free_slab+0x17/0x20
Apr 25 01:03:47 yuuno kernel: [<ffffffff8123fc91>] block_ioctl+0x41/0x50
Apr 25 01:03:47 yuuno kernel: [<ffffffff8121730d>] do_vfs_ioctl+0x33d/0x540
Apr 25 01:03:47 yuuno kernel: [<ffffffff812fe72a>] ? blkdev_issue_flush+0xda/0x110
Apr 25 01:03:47 yuuno kernel: [<ffffffff812175b1>] SyS_ioctl+0xa1/0xc0
Apr 25 01:03:47 yuuno kernel: [<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
Apr 25 01:05:47 yuuno kernel: INFO: task kworker/u898:2:16685 blocked for more than 120 seconds.
Apr 25 01:05:47 yuuno kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 25 01:05:47 yuuno kernel: kworker/u898:2  D ffff88203a9f0000     0 16685      2 0x00000080
Apr 25 01:05:47 yuuno kernel: Workqueue: writeback bdi_writeback_workfn (flush-9:0)
Apr 25 01:05:47 yuuno kernel: Call Trace:
Apr 25 01:05:47 yuuno kernel: [<ffffffff816ab8a9>] schedule+0x29/0x70
Apr 25 01:05:47 yuuno kernel: [<ffffffff8151df05>] bitmap_startwrite+0x1f5/0x210
Apr 25 01:05:47 yuuno kernel: [<ffffffff810b3690>] ? wake_up_atomic_t+0x30/0x30
Apr 25 01:05:47 yuuno kernel: [<ffffffffc05482c1>] add_stripe_bio+0x3f1/0x770 [raid456]
Apr 25 01:05:47 yuuno kernel: [<ffffffffc054c701>] raid5_make_request+0x211/0xce0 [raid456]
Apr 25 01:05:47 yuuno kernel: [<ffffffff810b3690>] ? wake_up_atomic_t+0x30/0x30
Apr 25 01:05:47 yuuno kernel: [<ffffffffc0249487>] ? kmem_zone_alloc+0x97/0x130 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffff8150c644>] md_make_request+0x104/0x290
Apr 25 01:05:47 yuuno kernel: [<ffffffff812fb015>] generic_make_request+0x105/0x310
Apr 25 01:05:47 yuuno kernel: [<ffffffff812fb290>] submit_bio+0x70/0x150
Apr 25 01:05:47 yuuno kernel: [<ffffffffc0222661>] xfs_submit_ioend.isra.16+0x61/0xe0 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffffc0222821>] xfs_vm_writepages+0xd1/0xe0 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffff81190efe>] do_writepages+0x1e/0x40
Apr 25 01:05:47 yuuno kernel: [<ffffffff8122fa80>] __writeback_single_inode+0x40/0x220
Apr 25 01:05:47 yuuno kernel: [<ffffffff812306c4>] writeback_sb_inodes+0x1c4/0x490
Apr 25 01:05:47 yuuno kernel: [<ffffffff81230a2f>] __writeback_inodes_wb+0x9f/0xd0
Apr 25 01:05:47 yuuno kernel: [<ffffffff81231263>] wb_writeback+0x263/0x2f0
Apr 25 01:05:47 yuuno kernel: [<ffffffff8123170b>] bdi_writeback_workfn+0x2cb/0x460
Apr 25 01:05:47 yuuno kernel: [<ffffffff810aa59a>] process_one_work+0x17a/0x440
Apr 25 01:05:47 yuuno kernel: [<ffffffff810ab266>] worker_thread+0x126/0x3c0
Apr 25 01:05:47 yuuno kernel: [<ffffffff810ab140>] ? manage_workers.isra.24+0x2a0/0x2a0
Apr 25 01:05:47 yuuno kernel: [<ffffffff810b270f>] kthread+0xcf/0xe0
Apr 25 01:05:47 yuuno kernel: [<ffffffff810b2640>] ? insert_kthread_work+0x40/0x40
Apr 25 01:05:47 yuuno kernel: [<ffffffff816b8798>] ret_from_fork+0x58/0x90
Apr 25 01:05:47 yuuno kernel: [<ffffffff810b2640>] ? insert_kthread_work+0x40/0x40
Apr 25 01:05:47 yuuno kernel: INFO: task rsync:16810 blocked for more than 120 seconds.
Apr 25 01:05:47 yuuno kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 25 01:05:47 yuuno kernel: rsync           D ffff881fd135dee0     0 16810  16807 0x00000080
Apr 25 01:05:47 yuuno kernel: Call Trace:
Apr 25 01:05:47 yuuno kernel: [<ffffffffc0249487>] ? kmem_zone_alloc+0x97/0x130 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffff816ab8a9>] schedule+0x29/0x70
Apr 25 01:05:47 yuuno kernel: [<ffffffff816a92b9>] schedule_timeout+0x239/0x2c0
Apr 25 01:05:47 yuuno kernel: [<ffffffffc02599f8>] ? _xfs_trans_bjoin+0x48/0x60 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffffc021c4a7>] ? xfs_iext_bno_to_ext+0xa7/0x1a0 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffff816aab47>] __down_common+0xaa/0x104
Apr 25 01:05:47 yuuno kernel: [<ffffffffc02294a0>] ? _xfs_buf_find+0x170/0x330 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffff816aabbe>] __down+0x1d/0x1f
Apr 25 01:05:47 yuuno kernel: [<ffffffff810b8411>] down+0x41/0x50
Apr 25 01:05:47 yuuno kernel: [<ffffffffc022929c>] xfs_buf_lock+0x3c/0xd0 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffffc02294a0>] _xfs_buf_find+0x170/0x330 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffffc022968a>] xfs_buf_get_map+0x2a/0x240 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffffc022a230>] xfs_buf_read_map+0x30/0x160 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffffc0259ef1>] xfs_trans_read_buf_map+0x211/0x400 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffffc0208b74>] xfs_da_read_buf+0xd4/0x100 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffffc020e486>] xfs_dir3_data_read+0x26/0x60 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffffc02130e4>] xfs_dir2_node_addname_int+0x454/0x820 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffffc020a096>] ? xfs_da3_node_lookup_int+0x2c6/0x2f0 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffffc0214774>] xfs_dir2_node_addname+0xd4/0x160 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffffc020bf55>] xfs_dir_createname+0x1d5/0x1f0 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffffc023d4fd>] xfs_create+0x58d/0x6c0 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffffc023a25b>] xfs_vn_mknod+0xbb/0x240 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffffc023a413>] xfs_vn_create+0x13/0x20 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffff8120f6cd>] vfs_create+0xcd/0x130
Apr 25 01:05:47 yuuno kernel: [<ffffffff8121285a>] do_last+0x10ea/0x12c0
Apr 25 01:05:47 yuuno kernel: [<ffffffff81212af2>] path_openat+0xc2/0x490
Apr 25 01:05:47 yuuno kernel: [<ffffffff8121508b>] do_filp_open+0x4b/0xb0
Apr 25 01:05:47 yuuno kernel: [<ffffffff8122233a>] ? __alloc_fd+0x8a/0x130
Apr 25 01:05:47 yuuno kernel: [<ffffffff81201bc3>] do_sys_open+0xf3/0x1f0
Apr 25 01:05:47 yuuno kernel: [<ffffffff81201cde>] SyS_open+0x1e/0x20
Apr 25 01:05:47 yuuno kernel: [<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
Apr 25 01:05:47 yuuno kernel: INFO: task rsync:16937 blocked for more than 120 seconds.
Apr 25 01:05:47 yuuno kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr 25 01:05:47 yuuno kernel: rsync           D ffff88103a201fa0     0 16937  16934 0x00000080
Apr 25 01:05:47 yuuno kernel: Call Trace:
Apr 25 01:05:47 yuuno kernel: [<ffffffff816ab8a9>] schedule+0x29/0x70
Apr 25 01:05:47 yuuno kernel: [<ffffffff816a92b9>] schedule_timeout+0x239/0x2c0
Apr 25 01:05:47 yuuno kernel: [<ffffffffc021c4a7>] ? xfs_iext_bno_to_ext+0xa7/0x1a0 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffff816aab47>] __down_common+0xaa/0x104
Apr 25 01:05:47 yuuno kernel: [<ffffffffc02294a0>] ? _xfs_buf_find+0x170/0x330 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffff816aabbe>] __down+0x1d/0x1f
Apr 25 01:05:47 yuuno kernel: [<ffffffff810b8411>] down+0x41/0x50
Apr 25 01:05:47 yuuno kernel: [<ffffffffc022929c>] xfs_buf_lock+0x3c/0xd0 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffffc02294a0>] _xfs_buf_find+0x170/0x330 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffffc022968a>] xfs_buf_get_map+0x2a/0x240 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffffc022a230>] xfs_buf_read_map+0x30/0x160 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffffc0259ef1>] xfs_trans_read_buf_map+0x211/0x400 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffffc0208b74>] xfs_da_read_buf+0xd4/0x100 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffffc0208bc3>] xfs_da3_node_read+0x23/0xd0 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffffc0209e3e>] xfs_da3_node_lookup_int+0x6e/0x2f0 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffffc02146ea>] xfs_dir2_node_addname+0x4a/0x160 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffffc020bf55>] xfs_dir_createname+0x1d5/0x1f0 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffffc023eff7>] xfs_rename+0x6f7/0x900 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffff8120e448>] ? inode_permission+0x18/0x50
Apr 25 01:05:47 yuuno kernel: [<ffffffff81210311>] ? link_path_walk+0x81/0x8b0
Apr 25 01:05:47 yuuno kernel: [<ffffffffc0238ea0>] xfs_vn_rename+0xb0/0xf0 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffff81213890>] vfs_rename+0x5a0/0x930
Apr 25 01:05:47 yuuno kernel: [<ffffffffc0238df0>] ? xfs_vn_follow_link+0xa0/0xa0 [xfs]
Apr 25 01:05:47 yuuno kernel: [<ffffffff81214b93>] SYSC_renameat2+0x503/0x5a0
Apr 25 01:05:47 yuuno kernel: [<ffffffff81205b91>] ? __sb_end_write+0x31/0x60
Apr 25 01:05:47 yuuno kernel: [<ffffffff812230df>] ? mnt_drop_write+0x1f/0x30
Apr 25 01:05:47 yuuno kernel: [<ffffffff812249f4>] ? mntput+0x24/0x40
Apr 25 01:05:47 yuuno kernel: [<ffffffff8120d57e>] ? path_put+0x1e/0x30
Apr 25 01:05:47 yuuno kernel: [<ffffffff81215a7e>] SyS_renameat2+0xe/0x10
Apr 25 01:05:47 yuuno kernel: [<ffffffff81215abe>] SyS_rename+0x1e/0x20
Apr 25 01:05:47 yuuno kernel: [<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
yuuno.txt (26,373 bytes)
Yoh

2018-05-05 20:04

reporter   ~0031720

Hello.

I ran into the same problem on my server as well. I have two servers: one works correctly, the other hangs within a few days.
The server that hangs reports the following file system information:
[~]# xfs_info /
meta-data=/dev/md125             isize=256    agcount=4, agsize=1310720 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0 spinodes=0
data     =                       bsize=4096   blocks=5242880, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=0
log      =internal               bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

The server that does not hang has the following information:
[~]# xfs_info /
meta-data=/dev/md125             isize=512    agcount=4, agsize=1310720 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=0 spinodes=0
data     =                       bsize=4096   blocks=5242880, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal               bsize=4096   blocks=2560, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

Maybe the problem is related to isize or crc? Could others check this information on their own systems?
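For comparison, a quick way to pull just those fields (assuming the filesystem in question is mounted at / and xfsprogs is installed; this grep is only my suggestion, not taken from the logs above) is:

[~]# xfs_info / | grep -E 'isize|crc|ftype'

which prints the meta-data, crc and naming lines, so the isize, crc and ftype values of different servers can be compared at a glance.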
thompsop

2018-05-10 00:52

reporter   ~0031738

1) 3.10.0-862.el7.x86_64 has been released for CentOS

2) xfs_info /

meta-data=/dev/mapper/centos-root isize=256    agcount=4, agsize=3276800 blks
         =                        sectsz=512   attr=2, projid32bit=1
         =                        crc=0        finobt=0 spinodes=0
data     =                        bsize=4096   blocks=13107200, imaxpct=25
         =                        sunit=0      swidth=0 blks
naming   =version 2               bsize=4096   ascii-ci=0 ftype=0
log      =internal                bsize=4096   blocks=6400, version=2
         =                        sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                    extsz=4096   blocks=0, rtextents=0
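3) For anyone who wants to try it, pulling in the new kernel should just be the usual procedure (assuming the standard CentOS updates repo is enabled):

[~]# yum update kernel
[~]# reboot

then check uname -r after the reboot to confirm 3.10.0-862 is actually running.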
slayerduck

2018-05-10 15:06

reporter   ~0031751

Thanks for the update, thompsop. Let us know if the new kernel fixes the problem for you. I'll leave my production servers on the old kernel until it's verified to be resolved. I currently don't have any development servers on hand that I can test it with.
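In case it helps anyone else holding back, one way to keep yum from pulling the newer kernel onto production machines in the meantime (just a suggestion, not something tested in this thread) is:

[~]# echo 'exclude=kernel*' >> /etc/yum.conf

and to remove that line again once the fix is confirmed. Alternatively, grub2-set-default can be used to keep booting the known-good 3.10.0-514 kernel even if a newer one gets installed.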
thompsop

2018-05-29 18:39

reporter   ~0031945

No further issues with 3.10.0-862.3.2.el7.x86_64.
slayerduck

2018-09-15 13:26

reporter   ~0032724

Late update, but I also stopped experiencing any issues on 3.10.0-862.3.2.el7.x86_64. This bug report can be closed now, I think?
toracat

2018-09-15 17:32

manager   ~0032726

Closing as 'resolved' as per the reports.

Issue History

Date Modified Username Field Change
2017-09-18 14:19 thompsop New Issue
2017-10-18 11:34 thompsop Note Added: 0030400
2017-10-25 21:44 thompsop Note Added: 0030468
2017-12-03 10:21 Neil Mukerji Note Added: 0030693
2017-12-03 18:29 thompsop Note Added: 0030694
2017-12-15 21:49 gregb Note Added: 0030761
2018-01-05 14:14 smuku@salesforce.com Note Added: 0030855
2018-01-05 15:13 gbeeley Note Added: 0030856
2018-01-05 16:08 thompsop Tag Attached: xfs
2018-01-05 16:09 thompsop Tag Attached: upstream
2018-01-05 16:10 thompsop Tag Attached: 7.4
2018-01-05 16:10 thompsop Tag Attached: file system
2018-01-22 12:11 joerg Note Added: 0030981
2018-01-28 16:05 thompsop Note Added: 0031076
2018-01-28 17:00 TrevorH Note Added: 0031077
2018-01-29 14:54 thompsop Note Added: 0031084
2018-01-29 15:20 joerg Note Added: 0031087
2018-01-29 18:08 diito Note Added: 0031091
2018-01-30 02:43 thompsop Note Added: 0031100
2018-01-30 02:44 thompsop Note Added: 0031101
2018-01-30 13:54 thompsop Note Added: 0031109
2018-01-30 15:42 thompsop Note Added: 0031110
2018-01-30 16:15 TrevorH Note Added: 0031111
2018-01-30 16:36 thompsop Note Added: 0031113
2018-01-30 18:06 diito Note Added: 0031119
2018-01-30 19:55 thompsop Note Added: 0031121
2018-01-30 19:55 joerg Note Added: 0031122
2018-01-30 20:07 thompsop Note Added: 0031123
2018-01-31 12:05 slayerduck Note Added: 0031129
2018-01-31 12:26 slayerduck Note Added: 0031130
2018-02-23 02:44 justin2net Note Added: 0031294
2018-02-23 02:45 justin2net Note Added: 0031295
2018-02-23 10:06 alphacc Note Added: 0031299
2018-02-27 12:49 thompsop Note Added: 0031334
2018-03-06 00:38 justin2net Note Added: 0031360
2018-03-06 02:00 thompsop Note Added: 0031361
2018-03-06 06:21 justin2net Note Added: 0031363
2018-03-06 10:29 alphacc Note Added: 0031365
2018-03-06 17:45 thompsop Note Added: 0031370
2018-03-06 17:52 toracat Note Added: 0031371
2018-03-08 06:37 justin2net Note Added: 0031378
2018-03-11 00:18 thompsop Note Added: 0031407
2018-03-11 14:14 thompsop Note Added: 0031408
2018-03-12 09:56 alphacc Note Added: 0031410
2018-03-12 16:37 thompsop Note Added: 0031417
2018-03-14 21:57 justin2net Note Added: 0031443
2018-04-04 04:51 sles Note Added: 0031557
2018-04-04 21:37 diito Note Added: 0031563
2018-04-25 12:23 slayerduck File Added: yuuno.txt
2018-04-25 12:23 slayerduck Note Added: 0031677
2018-05-05 20:04 Yoh Note Added: 0031720
2018-05-10 00:52 thompsop Note Added: 0031738
2018-05-10 15:06 slayerduck Note Added: 0031751
2018-05-29 18:39 thompsop Note Added: 0031945
2018-09-15 13:26 slayerduck Note Added: 0032724
2018-09-15 17:32 toracat Status new => resolved
2018-09-15 17:32 toracat Resolution open => fixed
2018-09-15 17:32 toracat Note Added: 0032726