View Issue Details

ID: 0017623
Project: CentOS-7
Category: kernel
View Status: public
Last Update: 2020-07-27 13:50
Reporter: petrilak.m
Priority: urgent
Severity: block
Reproducibility: always
Status: new
Resolution: open
Platform: x86-64
OS: CentOS
OS Version: 7.8.2003
Product Version: 7.8-2003
Target Version:
Fixed in Version:
Summary: 0017623: xfs filesystem hung up
Description: I run a script that processes about 23 thousand files. The program converts each file to another format via a temporary file located in the data directory. After each conversion the temporary file is deleted with the rm command.
After some time, in some iteration, the rm process hangs in state D+.

If the temporary file is created in /tmp (a ramdisk), there is no problem.
The normal disk is an XFS filesystem on a disk array connected via Fibre Channel.

The problem also affects operations other than deletion: some of the data-processing programs hang as well.
The D+ state persists for a very long time (more than 24 hours).
iotop shows all data transfer rates as zero.

I have a physical computer with only one virtual machine (oVirt virtualization).
The data disk is attached to the virtual machine as a "direct LUN".

I ran xfs_repair with some arguments to repair the filesystem, but it did not help.

I think there is a problem in the XFS filesystem driver.

If you need any logs or dumps, please let me know.

iostat output:
Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s avgrq-sz avgqu-sz await r_await w_await svctm %util
sdd 0.00 0.00 0.00 0.00 0.00 0.00 0.00 22.00 0.00 0.00 0.00 0.00 100.00
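
The iostat row above shows requests queued on sdd (avgqu-sz 22.00) while the read and write rates are zero. A minimal Python sketch of how such a row could be checked automatically follows. The helper names `parse_iostat_row` and `looks_stalled` are hypothetical, and the "queued but idle" rule is only a heuristic assumed here, not an official criterion.

```python
# Column names copied from the iostat header line in this report.
HEADER = ("Device: rrqm/s wrqm/s r/s w/s rMB/s wMB/s "
          "avgrq-sz avgqu-sz await r_await w_await svctm %util").split()


def parse_iostat_row(line):
    """Map one device row onto the header names; numeric fields become floats."""
    fields = line.split()
    if len(fields) != len(HEADER):
        raise ValueError("unexpected column count: %d" % len(fields))
    row = {"device": fields[0]}
    for name, value in zip(HEADER[1:], fields[1:]):
        row[name] = float(value)
    return row


def looks_stalled(row):
    """Heuristic (an assumption): requests are queued but nothing completes."""
    no_throughput = row["r/s"] == 0 and row["w/s"] == 0
    return no_throughput and row["avgqu-sz"] > 0


# The device row exactly as pasted in the report.
row = parse_iostat_row(
    "sdd 0.00 0.00 0.00 0.00 0.00 0.00 0.00 22.00 0.00 0.00 0.00 0.00 100.00"
)
print(row["device"], "stalled" if looks_stalled(row) else "ok")
```

A row that trips this check only suggests that I/O is stuck somewhere below the filesystem or inside it; it does not say which layer is at fault.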

ps aufx:
fischer+ 4541 0.0 0.0 125976 2304 pts/1 Ss Jul24 0:00 | \_ bash
fischer+ 4617 0.0 0.0 129824 2136 pts/1 S+ Jul24 0:00 | | \_ /usr/bin/perl ./pl/run_domains.pl
fischer+ 4618 0.0 0.0 113292 1204 pts/1 S+ Jul24 0:00 | | \_ sh -c /mnt/data/DisALEXI/CZECH/code/Programs/preprocess_inputs.pl >& /mnt/data/DisALEXI/CZECH/domains/DryPan_2001-2020/messages/preprocess_inputs.ou
fischer+ 4620 0.0 0.0 129960 2360 pts/1 S+ Jul24 0:00 | | \_ /usr/bin/perl /mnt/data/DisALEXI/CZECH/code/Programs/preprocess_inputs.pl
fischer+ 77528 0.0 0.0 113292 1204 pts/1 S+ Jul26 0:00 | | \_ sh -c /mnt/data/DisALEXI/CZECH/code/SysPrograms/bin/modlai_daily_smooth.csh MCD15A3H 2000210 2020180 >& /mnt/data/DisALEXI/CZECH/domains/Dry
fischer+ 77529 0.0 0.0 137404 3064 pts/1 S+ Jul26 0:16 | | \_ /bin/csh -f /mnt/data/DisALEXI/CZECH/code/SysPrograms/bin/modlai_daily_smooth.csh MCD15A3H 2000210 2020180
fischer+ 114454 6.0 0.0 64660 58072 pts/1 D+ Jul26 138:44 | | \_ /mnt/data/DisALEXI/CZECH/code/SysPrograms/bin/modlai_daily_smooth.exe lai_input.2000210-2020180.txt
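
To pick the stuck process out of a long listing like the one above, the STAT column can be filtered for values starting with "D" (uninterruptible sleep). The following Python sketch does this on pasted `ps aux`-style lines. The function name `uninterruptible` is hypothetical, and the code assumes the usual column order USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND, with START printed as a single token as it is here.

```python
def uninterruptible(ps_lines):
    """Return (pid, stat, command) for rows whose STAT starts with 'D'."""
    hits = []
    for line in ps_lines:
        # Split into at most 11 fields so the command keeps its spaces.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue
        stat = fields[7]
        if stat.startswith("D"):
            # Drop the tree-drawing characters that the "f" option adds.
            command = fields[10].lstrip("|\\_ ")
            hits.append((int(fields[1]), stat, command))
    return hits


# Two rows copied from the listing in this report.
sample = [
    r"fischer+ 4541 0.0 0.0 125976 2304 pts/1 Ss Jul24 0:00 | \_ bash",
    r"fischer+ 114454 6.0 0.0 64660 58072 pts/1 D+ Jul26 138:44 | | \_ "
    r"/mnt/data/DisALEXI/CZECH/code/SysPrograms/bin/modlai_daily_smooth.exe "
    r"lai_input.2000210-2020180.txt",
]
for pid, stat, command in uninterruptible(sample):
    print(pid, stat, command)
```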


Below is the dmesg log captured after the program got stuck in the D+ state:
[133922.264189] INFO: task kworker/u480:2:67137 blocked for more than 120 seconds.
[133922.265249] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[133922.266034] kworker/u480:2 D ffff8b1201583150 0 67137 2 0x00000080
[133922.266770] Workqueue: writeback bdi_writeback_workfn (flush-253:2)
[133922.267526] Call Trace:
[133922.268275] [<ffffffffb1ce0628>] ? __enqueue_entity+0x78/0x80
[133922.269039] [<ffffffffb2385da9>] schedule+0x29/0x70
[133922.269780] [<ffffffffb23838b1>] schedule_timeout+0x221/0x2d0
[133922.270534] [<ffffffffb1ccc5dd>] ? down_trylock+0x2d/0x40
[133922.271308] [<ffffffffc086821f>] ? xfs_buf_trylock+0x1f/0xc0 [xfs]
[133922.272054] [<ffffffffb23851b7>] __down_common+0xaa/0x104
[133922.272791] [<ffffffffc0868500>] ? _xfs_buf_find.isra.9+0x170/0x330 [xfs]
[133922.273551] [<ffffffffb238522e>] __down+0x1d/0x1f
[133922.274297] [<ffffffffb1ccc631>] down+0x41/0x50
[133922.275053] [<ffffffffc08682fc>] xfs_buf_lock+0x3c/0xd0 [xfs]
[133922.275784] [<ffffffffc0868500>] _xfs_buf_find.isra.9+0x170/0x330 [xfs]
[133922.276551] [<ffffffffc0868755>] xfs_buf_get_map+0x35/0x250 [xfs]
[133922.277323] [<ffffffffc0868f20>] xfs_buf_read_map+0x30/0x160 [xfs]
[133922.278089] [<ffffffffc0899d49>] xfs_trans_read_buf_map+0xe9/0x2c0 [xfs]
[133922.278821] [<ffffffffc083e934>] xfs_btree_read_buf_block.constprop.33+0xa4/0xe0 [xfs]
[133922.279589] [<ffffffffc0842a45>] xfs_btree_lookup_get_block+0x95/0x1a0 [xfs]
[133922.280364] [<ffffffffc0842cff>] xfs_btree_lookup+0xdf/0x420 [xfs]
[133922.281130] [<ffffffffc083418f>] xfs_bmbt_lookup_eq+0x2f/0x40 [xfs]
[133922.281863] [<ffffffffc08371e4>] xfs_bmap_add_extent_delay_real+0x864/0x11e0 [xfs]
[133922.282628] [<ffffffffc0839624>] ? xfs_bmap_btalloc+0x2d4/0x7e0 [xfs]
[133922.283385] [<ffffffffb1e28852>] ? kmem_cache_alloc+0x1c2/0x1f0
[133922.284137] [<ffffffffc083b7c8>] xfs_bmapi_write+0x7f8/0xb70 [xfs]
[133922.284854] [<ffffffffc0888d67>] ? kmem_zone_alloc+0x97/0x130 [xfs]
[133922.285593] [<ffffffffc0877cf2>] xfs_iomap_write_allocate+0x182/0x380 [xfs]
[133922.286333] [<ffffffffc0861446>] xfs_map_blocks+0x1a6/0x220 [xfs]
[133922.287071] [<ffffffffc0862484>] xfs_do_writepage+0x174/0x550 [xfs]
[133922.287766] [<ffffffffb1dca1bc>] write_cache_pages+0x21c/0x470
[133922.288501] [<ffffffffc0862310>] ? xfs_vm_writepages+0xa0/0xa0 [xfs]
[133922.289232] [<ffffffffb1f546a0>] ? submit_bio+0x70/0x150
[133922.290707] [<ffffffffc08622db>] xfs_vm_writepages+0x6b/0xa0 [xfs]
[133922.291879] [<ffffffffb1dcb211>] do_writepages+0x21/0x50
[133922.292870] [<ffffffffb1e7cc30>] __writeback_single_inode+0x40/0x260
[133922.293816] [<ffffffffb1cc7745>] ? wake_up_bit+0x25/0x30
[133922.294726] [<ffffffffb1e7d7c4>] writeback_sb_inodes+0x1c4/0x430
[133922.295624] [<ffffffffb1e7dacf>] __writeback_inodes_wb+0x9f/0xd0
[133922.296642] [<ffffffffb1e7dfb3>] wb_writeback+0x263/0x2f0
[133922.297562] [<ffffffffb1e7eaac>] bdi_writeback_workfn+0x1cc/0x460
[133922.298441] [<ffffffffb1cbe6bf>] process_one_work+0x17f/0x440
[133922.299294] [<ffffffffb1cbf7d6>] worker_thread+0x126/0x3c0
[133922.300148] [<ffffffffb1cbf6b0>] ? manage_workers.isra.26+0x2a0/0x2a0
[133922.300967] [<ffffffffb1cc6691>] kthread+0xd1/0xe0
[133922.301757] [<ffffffffb1cc65c0>] ? insert_kthread_work+0x40/0x40
[133922.302615] [<ffffffffb2392d24>] ret_from_fork_nospec_begin+0xe/0x21
[133922.303442] [<ffffffffb1cc65c0>] ? insert_kthread_work+0x40/0x40
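
For anyone triaging several such reports, the trace above can be reduced to the blocked task and its reliable stack frames. The following Python sketch shows one way to do it, assuming the line format seen in this log. The regular expressions and the name `summarize` are hypothetical, and frames marked with "?" are skipped because the kernel prints them as unverified stack entries.

```python
import re

# "INFO: task <comm>:<pid> blocked for more than <n> seconds."
HUNG = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\."
)
# "[<addr>] [? ]symbol+0xoff/0xsize [module]"
FRAME = re.compile(
    r"\[<[0-9a-f]+>\] (?P<maybe>\? )?(?P<sym>[\w.]+)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(?P<mod>\w+)\])?"
)


def summarize(lines):
    """Return ((comm, pid, secs), [(symbol, module_or_None), ...])."""
    task = None
    frames = []
    for line in lines:
        m = HUNG.search(line)
        if m:
            task = (m.group("comm"), int(m.group("pid")), int(m.group("secs")))
            continue
        m = FRAME.search(line)
        if m and not m.group("maybe"):
            frames.append((m.group("sym"), m.group("mod")))
    return task, frames


# A few lines copied from the dmesg output in this report.
log = [
    "[133922.264189] INFO: task kworker/u480:2:67137 blocked for more than 120 seconds.",
    "[133922.268275] [<ffffffffb1ce0628>] ? __enqueue_entity+0x78/0x80",
    "[133922.269039] [<ffffffffb2385da9>] schedule+0x29/0x70",
    "[133922.271308] [<ffffffffc086821f>] ? xfs_buf_trylock+0x1f/0xc0 [xfs]",
    "[133922.274297] [<ffffffffb1ccc631>] down+0x41/0x50",
    "[133922.275053] [<ffffffffc08682fc>] xfs_buf_lock+0x3c/0xd0 [xfs]",
]
task, frames = summarize(log)
print(task)
print(frames)
```

On the excerpt used here, the first reliable frame inside the xfs module is xfs_buf_lock, which matches the trace above: the writeback worker is waiting for a buffer lock.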
Steps To Reproduce: The hang occurs on every run of my computation program. I have tried running it 10 times, and it hung every time.
Tags: xfs
abrt_hash
URL

Activities

TrevorH

2020-07-27 13:48

manager   ~0037419

What is your current kernel version, as reported by `uname -a`?
petrilak.m

2020-07-27 13:50

reporter   ~0037420

Linux disalexi.local.czechglobe.cz 3.10.0-1127.13.1.el7.x86_64 #1 SMP Tue Jun 23 15:46:38 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Issue History

Date Modified Username Field Change
2020-07-27 13:46 petrilak.m New Issue
2020-07-27 13:46 petrilak.m Tag Attached: xfs
2020-07-27 13:48 TrevorH Note Added: 0037419
2020-07-27 13:50 petrilak.m Note Added: 0037420