2017-12-11 13:11 UTC

ID: 0014099
Project: CentOS-6
Category: e2fsprogs
View Status: public
Last Update: 2017-11-08 21:32
Reporter: pgraydon
Priority: normal
Severity: crash
Reproducibility: random
Status: new
Resolution: open
Platform: x86_64
OS: CentOS
OS Version: 6.9
Product Version: 6.9
Target Version:
Fixed in Version:
Summary: 0014099: mkfs.ext4 prone to crashing on large NVME drives
Description:
We are seeing issues when formatting large NVMe partitions (typically > 512 GiB) on CentOS 6.9: the system freezes and stops responding. mkfs.ext4 appears to get blocked by something, and eventually we get a kernel panic.

[root@centos-6 ~]# uname -r
2.6.32-642.13.1.el6.x86_64

[root@centos-6 ~]# modinfo /lib/modules/2.6.32-642.13.1.el6.x86_64/kernel/drivers/block/nvme.ko
filename: /lib/modules/2.6.32-642.13.1.el6.x86_64/kernel/drivers/block/nvme.ko
version: 0.10
license: GPL
author: Matthew Wilcox <willy@linux.intel.com>
srcversion: 38BF2C912186C6289DEF773
alias: pci:v*d*sv*sd*bc01sc08i02*
depends:
vermagic: 2.6.32-642.13.1.el6.x86_64 SMP mod_unload modversions
parm: admin_timeout:timeout in seconds for admin commands (byte)
parm: io_timeout:timeout in seconds for I/O (byte)
parm: retry_time:time in seconds to retry failed I/O (byte)
parm: shutdown_timeout:timeout in seconds for controller shutdown (byte)
parm: nvme_major:int
parm: nvme_char_major:int
parm: use_threaded_interrupts:int



CentOS release 6.8 (Final)
Kernel 2.6.32-642.13.1.el6.x86_64 on an x86_64

centos-6.9 login: nvme0n1: p1
INFO: task mkfs.ext4:4304 blocked for more than 120 seconds.
      Not tainted 2.6.32-642.13.1.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mkfs.ext4 D 0000000000000006 0 4304 4278 0x00000080
 ffff880f2432bcc8 0000000000000086 00000000ffffffff 000000000f28a4d1
 000000003fffffff ffff880f2207af30 00000000000105b8 ffffffffa602dd0f
 000000003163c68d ffffffff81aa8340 ffff880f220265f8 ffff880f2432bfd8
Call Trace:
 [<ffffffff8112e3f0>] ? sync_page+0x0/0x50
 [<ffffffff81549213>] io_schedule+0x73/0xc0
 [<ffffffff8112e42d>] sync_page+0x3d/0x50
 [<ffffffff81549cff>] __wait_on_bit+0x5f/0x90
 [<ffffffff8112e663>] wait_on_page_bit+0x73/0x80
 [<ffffffff810a6920>] ? wake_bit_function+0x0/0x50
 [<ffffffff811447a5>] ? pagevec_lookup_tag+0x25/0x40
 [<ffffffff8112ea8b>] wait_on_page_writeback_range+0xfb/0x190
 [<ffffffff8112ec58>] filemap_write_and_wait_range+0x78/0x90
 [<ffffffff811cc96e>] vfs_fsync_range+0x7e/0x100
 [<ffffffff811cca5d>] vfs_fsync+0x1d/0x20
 [<ffffffff811cca9e>] do_fsync+0x3e/0x60
 [<ffffffff811ccaf0>] sys_fsync+0x10/0x20
 [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
INFO: task mkfs.ext4:4304 blocked for more than 120 seconds.
      Not tainted 2.6.32-642.13.1.el6.x86_64 #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
mkfs.ext4 D 0000000000000006 0 4304 4278 0x00000080
 ffff880f2432bcc8 0000000000000086 00000000ffffffff 000000000f28a4d1
 000000003fffffff ffff880f2207af30 00000000000105b8 ffffffffa602dd0f
 000000003163c68d ffffffff81aa8340 ffff880f220265f8 ffff880f2432bfd8
Call Trace:
 [<ffffffff8112e3f0>] ? sync_page+0x0/0x50
 [<ffffffff81549213>] io_schedule+0x73/0xc0
 [<ffffffff8112e42d>] sync_page+0x3d/0x50
 [<ffffffff81549cff>] __wait_on_bit+0x5f/0x90
 [<ffffffff8112e663>] wait_on_page_bit+0x73/0x80
 [<ffffffff810a6920>] ? wake_bit_function+0x0/0x50
 [<ffffffff811447a5>] ? pagevec_lookup_tag+0x25/0x40
 [<ffffffff8112ea8b>] wait_on_page_writeback_range+0xfb/0x190
 [<ffffffff8112ec58>] filemap_write_and_wait_range+0x78/0x90
 [<ffffffff811cc96e>] vfs_fsync_range+0x7e/0x100
 [<ffffffff811cca5d>] vfs_fsync+0x1d/0x20
 [<ffffffff811cca9e>] do_fsync+0x3e/0x60
 [<ffffffff811ccaf0>] sys_fsync+0x10/0x20
 [<ffffffff8100b0d2>] system_call_fastpath+0x16/0x1b
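
For anyone triaging similar console captures, here is a minimal sketch (Python, not part of the original report) that pulls the hung-task reports out of a log like the one above. The function name `parse_hung_tasks` and the returned dict layout are illustrative assumptions; the regular expressions are written only against the message format shown in this trace and may need adjusting for other kernels.

```python
import re

# Header emitted by the kernel hung-task detector, e.g.
# "INFO: task mkfs.ext4:4304 blocked for more than 120 seconds."
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\."
)

# One stack frame, e.g. " [<ffffffff81549213>] io_schedule+0x73/0xc0".
# A leading "? " marks a frame the kernel could not verify.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?P<unreliable>\? )?(?P<sym>[\w.]+)\+0x"
)


def parse_hung_tasks(log_text):
    """Return one dict per hung-task report found in log_text.

    Hypothetical helper: each dict holds the task name, pid, the
    reported timeout in seconds, and the reliable call-trace symbols
    in the order they were printed.
    """
    reports = []
    current = None
    for line in log_text.splitlines():
        header = HUNG_RE.search(line)
        if header:
            current = {
                "comm": header.group("comm"),
                "pid": int(header.group("pid")),
                "secs": int(header.group("secs")),
                "frames": [],
            }
            reports.append(current)
            continue
        frame = FRAME_RE.search(line)
        # Skip frames before the first header and frames marked "?".
        if frame and current is not None and not frame.group("unreliable"):
            current["frames"].append(frame.group("sym"))
    return reports
```

Fed the console output above, this should yield two reports for mkfs.ext4 (pid 4304), which makes it easier to see that both stalls sit in the same fsync writeback path.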


Specifically, we're seeing this on BM.DenseIO1.36 and BM.HighIO1.36 shapes (based on Oracle X5-2Cs) and on our VM.DenseIO shapes. In the latter case, the NVMe drives are directly attached to a VM running under a hypervisor.

The NVMe drives in question are Samsung PM1725, but I don't believe that's particularly relevant.
Additional Information:
We only notice this with ext4.

We have tracked this down to the version of e2fsprogs. If we install version 1.42.8-1.0.3.el6 from the Oracle Linux 6.9 repository, we're able to format the drive reliably, so it looks like there's a bug somewhere in e2fsprogs that has been fixed but needs to be backported?
Tags: No tags attached.
Attached Files:


Notes

~0030534

tru (administrator)

Your kernel is out of date, and the rest of your installed packages probably need to be updated as well.
Current versions: kernel-2.6.32-696.13.2.el6.x86_64 and e2fsprogs-1.41.12-23.el6.x86_64 (maybe the fix is already rolled in).

~0030535

pgraydon (reporter)

Sorry. I copied the text from the first tests we did. We were also seeing it against 2.6.32-696.10.1.el6.x86_64 and e2fsprogs-1.41.12-23.el6.x86_64.

I've just tried upgrading to the latest kernel, 2.6.32-696.13.2.el6.x86_64, but I still see it.

~0030538

tru (administrator)

Thanks for the update. Could you please file an RFE at bugzilla.redhat.com for RHEL6 asking for a backport?

http://public-yum.oracle.com/repo/OracleLinux/OL6/9/base/x86_64/getPackageSource/e2fsprogs-1.42.8-1.0.3.el6.src.rpm

changelog:
* Tue Feb 28 2017 Lans Hung <lans.hung@oracle.com> 1.42.8-1.0.3.el6
- merge from branches-6/rhel6-u9/ (Revision 598)
- add pathces: [Orabug 25120]
      e2fsprogs-1.42.12-e2fsck-impossible-dirblocks.patch
      e2fsprogs-1.42.9-check_if_mounted-return.patch
- already applied patches:
      e2fsprogs-1.42.8-resize-flex_bg-no-resize_inode.patch

* Thu Oct 16 2014 Vaughan Cao <vaughan.cao@oracle.com> 1.42.8-1.0.2.el6
- add patches: [Orabug 19831079]
      e2fsprogs-1.41.12-e2fsck-blocks-past-eof-1.patch,
      e2fsprogs-1.41.12-tune2fs-remove-dirty-journal-1.patch,
      e2fsprogs-1.41.12-mke2fs-rev-check-1.patch,
      e2fsprogs-1.41.12-e2image-mounted-1.patch.
- Following bugs should be resolved.
  Disallow e2image on rw-mounted fs w/o force flag (#1097061)
  Allow tune2fs to remove a dirty journal with -ff (#1040122)
  Add enable_periodic_fsck to mke2fs.conf. default is no. (#1052409)
  Disallow too-high revision during mke2fs (#1093446)
  Fix e2fsck false positive when blocks exist past EOF (#994615)

* Wed Nov 06 2013 Vaughan Cao <vaughan.cao@oracle.com> 1.42.8-1.0.1.el6
- add e2fsprogs-1.41.12-resize2fs-fix-parents.patch [Orabug 17747231]

* Wed Aug 21 2013 Jingdong Lu <jingdong.lu@oracle.com> 1.42.8-1.el6
- update to 1.42.8-1
- add patch e2fsprogs-1.42.8-f_extent_oobounds.patch
- remove r_1024_small_bg test failing resize2fs tests

* Thu Jun 27 2013 Jingdong Lu <jingdong.lu@oracle.com> 1.42.7-1.el6
- update to 1.42.7-1.el6 and remove patches from redhat
- add files which need to be packaged

* Tue Jun 18 2013 Eric Sandeen <sandeen@redhat.com> 1.41.12-14.2
- Further enhance e2fsck detection of invalid extent trees (#974193)

~0030544

pgraydon (reporter)

I'll file a bug upstream, thank you for your help.

Just a quick addendum: while users have told us that this affects bare metal machines, I've been unable to replicate it there, and we're now uncertain whether it does. I have been able to replicate it very reliably with VMs.

With VMs, the NVMe drives are presented to the VM via direct PCI passthrough, so I wouldn't necessarily expect the behaviour to be particularly different, but there's always a chance.

Issue History
Date Modified     Username  Field
2017-11-07 19:51  pgraydon  New Issue
2017-11-07 23:34  tru       Note Added: 0030534
2017-11-08 00:25  pgraydon  Note Added: 0030535
2017-11-08 10:45  tru       Note Added: 0030538
2017-11-08 21:32  pgraydon  Note Added: 0030544