| ID | Project | Category | View Status | Date Submitted | Last Update | ||||
|---|---|---|---|---|---|---|---|---|---|
| 0004089 | CentOS-5 | kernel | public | 2009-12-22 22:36 | 2010-01-14 00:33 | ||||
| Reporter | antonl | ||||||||
| Priority | normal | Severity | block | Reproducibility | always | ||||
| Status | resolved | Resolution | fixed | ||||||
| Product Version | 5.4 | ||||||||
| Target Version | Fixed in Version | ||||||||
| Summary | 0004089: After updating from 5.2 to 5.4, the XFS module crashes "on the fly", causing the mounted RAID to disappear | ||||||||
| Description | Configuration: 10 SATA drives of 1.5 TB each, combined into RAID 5 (~15 TB), x86_64. Because ext3 does not support a filesystem this large, the CentOS Plus kernel was used with XFS enabled. Yesterday, after upgrading from 5.2 to 5.4 (and with it the kernel), I received complaints that people could no longer access the central storage. A reboot restores functionality for a few hours (4-6), and then the "access denied" error returns. The reason is this:
Dec 22 02:18:36 athena kernel: 00000000: 79 1c be c8 90 5f fd b9 69 92 e8 96 9d c7 50 76 y.ŸÈ._ý¹i.è..ÇPv
Dec 22 02:18:36 athena kernel: Filesystem "md0": XFS internal error xfs_da_do_buf(2) at line 2112 of file fs/xfs/xfs_da_btree.c. Caller 0xffffffff88532826
Dec 22 02:18:36 athena kernel:
Dec 22 02:18:36 athena kernel: Call Trace:
Dec 22 02:18:36 athena kernel: [<ffffffff88532725>] :xfs:xfs_da_do_buf+0x503/0x5b1
Dec 22 02:18:36 athena kernel: [<ffffffff88532826>] :xfs:xfs_da_read_buf+0x16/0x1b
Dec 22 02:18:36 athena kernel: [<ffffffff88532826>] :xfs:xfs_da_read_buf+0x16/0x1b
Dec 22 02:18:36 athena kernel: [<ffffffff88537b04>] :xfs:xfs_dir2_leaf_getdents+0x354/0x5ec
Dec 22 02:18:36 athena kernel: [<ffffffff88537b04>] :xfs:xfs_dir2_leaf_getdents+0x354/0x5ec
Dec 22 02:18:36 athena kernel: [<ffffffff88560d84>] :xfs:xfs_hack_filldir+0x0/0x5b
Dec 22 02:18:36 athena kernel: [<ffffffff88560d84>] :xfs:xfs_hack_filldir+0x0/0x5b
Dec 22 02:18:36 athena kernel: [<ffffffff88534860>] :xfs:xfs_readdir+0xa7/0xb6
Dec 22 02:18:36 athena kernel: [<ffffffff88561419>] :xfs:xfs_file_readdir+0xff/0x14c
Dec 22 02:18:36 athena kernel: [<ffffffff80025689>] filldir+0x0/0xb7
Dec 22 02:18:36 athena kernel: [<ffffffff80025689>] filldir+0x0/0xb7
Dec 22 02:18:36 athena kernel: [<ffffffff8003527d>] vfs_readdir+0x77/0xa9
Dec 22 02:18:36 athena kernel: [<ffffffff80038b32>] sys_getdents+0x75/0xbd
Dec 22 02:18:36 athena kernel: [<ffffffff8005d229>] tracesys+0x71/0xe0
Dec 22 02:18:36 athena kernel: [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Dec 22 02:18:36 athena kernel:
Dec 22 02:18:36 athena kernel: 00000000: 79 1c be c8 90 5f fd b9 69 92 e8 96 9d c7 50 76 y.ŸÈ._ý¹i.è..ÇPv
Dec 22 02:18:36 athena kernel: Filesystem "md0": XFS internal error xfs_da_do_buf(2) at line 2112 of file fs/xfs/xfs_da_btree.c. Caller 0xffffffff88532826
Dec 22 02:18:36 athena kernel:
Dec 22 02:18:36 athena kernel: Call Trace:
Dec 22 02:18:36 athena kernel: [<ffffffff88532725>] :xfs:xfs_da_do_buf+0x503/0x5b1
Dec 22 02:18:36 athena kernel: [<ffffffff88532826>] :xfs:xfs_da_read_buf+0x16/0x1b
Dec 22 02:18:36 athena kernel: [<ffffffff88532826>] :xfs:xfs_da_read_buf+0x16/0x1b
Dec 22 02:18:36 athena kernel: [<ffffffff88537b04>] :xfs:xfs_dir2_leaf_getdents+0x354/0x5ec
Dec 22 02:18:36 athena kernel: [<ffffffff88537b04>] :xfs:xfs_dir2_leaf_getdents+0x354/0x5ec
Dec 22 02:18:36 athena kernel: [<ffffffff88560d84>] :xfs:xfs_hack_filldir+0x0/0x5b
Dec 22 02:18:36 athena kernel: [<ffffffff88560d84>] :xfs:xfs_hack_filldir+0x0/0x5b
Dec 22 02:18:36 athena kernel: [<ffffffff88534860>] :xfs:xfs_readdir+0xa7/0xb6
Dec 22 02:18:36 athena kernel: [<ffffffff88561419>] :xfs:xfs_file_readdir+0xff/0x14c
Dec 22 02:18:36 athena kernel: [<ffffffff80025689>] filldir+0x0/0xb7
Dec 22 02:18:36 athena kernel: [<ffffffff80025689>] filldir+0x0/0xb7
Dec 22 02:18:36 athena kernel: [<ffffffff8003527d>] vfs_readdir+0x77/0xa9
Dec 22 02:18:36 athena kernel: [<ffffffff80038b32>] sys_getdents+0x75/0xbd
Dec 22 02:18:36 athena kernel: [<ffffffff8005d229>] tracesys+0x71/0xe0
Dec 22 02:18:36 athena kernel: [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Dec 22 02:18:36 athena kernel:
Dec 22 02:18:36 athena kernel: 00000000: 79 1c be c8 90 5f fd b9 69 92 e8 96 9d c7 50 76 y.ŸÈ._ý¹i.è..ÇPv
Dec 22 02:18:36 athena kernel: Filesystem "md0": XFS internal error xfs_da_do_buf(2) at line 2112 of file fs/xfs/xfs_da_btree.c. Caller 0xffffffff88532826
Dec 22 02:18:36 athena kernel:
Dec 22 02:18:36 athena kernel: Call Trace:
Dec 22 02:18:36 athena kernel: [<ffffffff88532725>] :xfs:xfs_da_do_buf+0x503/0x5b1
Dec 22 02:18:36 athena kernel: [<ffffffff88532826>] :xfs:xfs_da_read_buf+0x16/0x1b
Dec 22 02:18:36 athena kernel: [<ffffffff88532826>] :xfs:xfs_da_read_buf+0x16/0x1b
Dec 22 02:18:36 athena kernel: [<ffffffff88537b04>] :xfs:xfs_dir2_leaf_getdents+0x354/0x5ec
Dec 22 02:18:36 athena kernel: [<ffffffff88537b04>] :xfs:xfs_dir2_leaf_getdents+0x354/0x5ec
Dec 22 02:18:36 athena kernel: [<ffffffff88560d84>] :xfs:xfs_hack_filldir+0x0/0x5b
Dec 22 02:18:36 athena kernel: [<ffffffff88560d84>] :xfs:xfs_hack_filldir+0x0/0x5b
Dec 22 02:18:36 athena kernel: [<ffffffff88534860>] :xfs:xfs_readdir+0xa7/0xb6
Dec 22 02:18:36 athena kernel: [<ffffffff88561419>] :xfs:xfs_file_readdir+0xff/0x14c
Dec 22 02:18:36 athena kernel: [<ffffffff80025689>] filldir+0x0/0xb7
Dec 22 02:18:36 athena kernel: [<ffffffff80025689>] filldir+0x0/0xb7
Dec 22 02:18:36 athena kernel: [<ffffffff8003527d>] vfs_readdir+0x77/0xa9
Dec 22 02:18:36 athena kernel: [<ffffffff80038b32>] sys_getdents+0x75/0xbd
Dec 22 02:18:36 athena kernel: [<ffffffff8005d229>] tracesys+0x71/0xe0
Dec 22 02:18:36 athena kernel: [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Dec 22 02:18:36 athena kernel:
Dec 22 02:18:36 athena kernel: 00000000: 79 1c be c8 90 5f fd b9 69 92 e8 96 9d c7 50 76 y.ŸÈ._ý¹i.è..ÇPv
Dec 22 02:18:36 athena kernel: Filesystem "md0": XFS internal error xfs_da_do_buf(2) at line 2112 of file fs/xfs/xfs_da_btree.c. Caller 0xffffffff88532826
Dec 22 02:18:36 athena kernel:
Dec 22 02:18:36 athena kernel: Call Trace:
Dec 22 02:18:36 athena kernel: [<ffffffff88532725>] :xfs:xfs_da_do_buf+0x503/0x5b1
Dec 22 02:18:36 athena kernel: [<ffffffff88532826>] :xfs:xfs_da_read_buf+0x16/0x1b
Dec 22 02:18:36 athena kernel: [<ffffffff88532826>] :xfs:xfs_da_read_buf+0x16/0x1b
Dec 22 02:18:36 athena kernel: [<ffffffff885388d5>] :xfs:xfs_dir2_leaf_addname+0x3ae/0x761
Dec 22 02:18:36 athena kernel: [<ffffffff885388d5>] :xfs:xfs_dir2_leaf_addname+0x3ae/0x761
Dec 22 02:18:36 athena kernel: [<ffffffff88534e67>] :xfs:xfs_dir_createname+0x132/0x14e
Dec 22 02:18:36 athena kernel: [<ffffffff8855a66b>] :xfs:xfs_create+0x2be/0x45c
Dec 22 02:18:36 athena kernel: [<ffffffff8851fd3f>] :xfs:xfs_attr_get+0x8e/0x9f
Dec 22 02:18:36 athena kernel: [<ffffffff88563e50>] :xfs:xfs_vn_mknod+0x144/0x215
Dec 22 02:18:36 athena kernel: [<ffffffff8003a5ce>] vfs_create+0xe6/0x158
Dec 22 02:18:36 athena kernel: [<ffffffff8001aeed>] open_namei+0x19d/0x6d5
Dec 22 02:18:36 athena kernel: [<ffffffff8002732f>] do_filp_open+0x1c/0x38
Dec 22 02:18:36 athena kernel: [<ffffffff80019d02>] do_sys_open+0x44/0xbe
Dec 22 02:18:36 athena kernel: [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Dec 22 02:18:36 athena kernel:
Dec 22 02:18:36 athena kernel: Filesystem "md0": XFS internal error xfs_trans_cancel at line 1164 of file fs/xfs/xfs_trans.c. Caller 0xffffffff8855a783
Dec 22 02:18:36 athena kernel:
Dec 22 02:18:36 athena kernel: Call Trace:
Dec 22 02:18:36 athena kernel: [<ffffffff88555b2f>] :xfs:xfs_trans_cancel+0x55/0xfa
Dec 22 02:18:36 athena kernel: [<ffffffff8855a783>] :xfs:xfs_create+0x3d6/0x45c
Dec 22 02:18:36 athena kernel: [<ffffffff8851fd3f>] :xfs:xfs_attr_get+0x8e/0x9f
Dec 22 02:18:36 athena kernel: [<ffffffff88563e50>] :xfs:xfs_vn_mknod+0x144/0x215
Dec 22 02:18:36 athena kernel: [<ffffffff8003a5ce>] vfs_create+0xe6/0x158
Dec 22 02:18:36 athena kernel: [<ffffffff8001aeed>] open_namei+0x19d/0x6d5
Dec 22 02:18:36 athena kernel: [<ffffffff8002732f>] do_filp_open+0x1c/0x38
Dec 22 02:18:36 athena kernel: [<ffffffff80019d02>] do_sys_open+0x44/0xbe
Dec 22 02:18:36 athena kernel: [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Dec 22 02:18:36 athena kernel:
Dec 22 02:18:36 athena kernel: xfs_force_shutdown(md0,0x8) called from line 1165 of file fs/xfs/xfs_trans.c. Return address = 0xffffffff88555b48 | ||||||||
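A quick way to see which XFS error sites dominate a log like the one in the description is to count them. This is only a minimal sketch, assuming the messages keep the `XFS internal error <function> at line <N> of file <path>.` wording seen in this report; the function name `summarize_xfs_errors` is hypothetical and not part of any XFS tooling.

```shell
# Hypothetical helper: count distinct "XFS internal error" sites in a
# syslog stream read from stdin.
# Output: "<count> <function> <file>:<line>", one line per distinct site.
summarize_xfs_errors() {
    sed -n 's/.*XFS internal error \([^ ]*\) at line \([0-9]*\) of file \([^ ]*\)\..*/\1 \3:\2/p' |
        sort | uniq -c | sed 's/^ *//'
}
```

It could be run as `summarize_xfs_errors < /var/log/messages`, where the log path is an assumption about where syslog writes kernel messages on the affected host.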
| Tags | No tags attached. | ||||||||
| Attached Files |
| ||||||||
Notes |
|
|
antonl (reporter) 2009-12-23 12:47 |
Kernel version: 2.6.18-164.9.1.el5.centos.plus #1 SMP Wed Dec 16 11:24:24 EST 2009 x86_64 x86_64 x86_64 GNU/Linux. The last "problem-free" kernel was Linux version 2.6.18-92.1.22.el5.centos.plus |
|
toracat (manager) 2009-12-23 13:06 |
Under CentOS 5.2, the xfs kernel module was provided through an external package, kmod-xfs. As of CentOS 5.4 (kernel >= 2.6.18-164), xfs is enabled in the kernel itself and is also a newer version. Could you show us the output of these two commands?
rpm -qa kmod\*
ls -l `find /lib/modules -name xfs.ko` |
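The build threshold mentioned in this note can be checked mechanically. The sketch below assumes an EL5-style release string of the form `2.6.18-<build>...`; the function name `el5_kernel_has_builtin_xfs` is hypothetical, and the number 164 is taken from the note rather than from any package metadata.

```shell
# Hypothetical helper: succeed if an EL5 kernel release string is at or
# past build 164, the point where (per the note above) xfs ships in-kernel.
# Returns 0 for in-kernel xfs, 1 for older builds, 2 for unrecognized input.
el5_kernel_has_builtin_xfs() {
    case "$1" in
        2.6.18-[0-9]*) ;;
        *) return 2 ;;
    esac
    rel=${1#2.6.18-}      # e.g. "164.9.1.el5.centos.plus"
    build=${rel%%.*}      # e.g. "164"
    case "$build" in
        *[!0-9]*) return 2 ;;
    esac
    [ "$build" -ge 164 ]
}
```

For the running kernel, one might call `el5_kernel_has_builtin_xfs "$(uname -r)" && echo "xfs is in-kernel"`.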
|
antonl (reporter) 2009-12-23 13:52 |
Currently booted under 2.6.18-92.1.22.el5.centos.plus
[root@athena ~]# rpm -qa kmod\*
kmod-xfs-0.4-2
[root@athena ~]# ls -l `find /lib/modules -name xfs.ko`
-rwxr--r-- 1 root root 694704 Dec 16 13:01 /lib/modules/2.6.18-164.9.1.el5.centos.plus/kernel/fs/xfs/xfs.ko
lrwxrwxrwx 1 root root 48 Dec 21 05:32 /lib/modules/2.6.18-164.9.1.el5.centos.plus/weak-updates/xfs/xfs.ko -> /lib/modules/2.6.18-92.1.13.el5/extra/xfs/xfs.ko
-rw-r--r-- 1 root root 697232 Oct 3 2008 /lib/modules/2.6.18-92.1.13.el5/extra/xfs/xfs.ko
lrwxrwxrwx 1 root root 48 Mar 12 2009 /lib/modules/2.6.18-92.1.22.el5.centos.plus/weak-updates/xfs/xfs.ko -> /lib/modules/2.6.18-92.1.13.el5/extra/xfs/xfs.ko
lrwxrwxrwx 1 root root 48 Mar 12 2009 /lib/modules/2.6.18-92.el5/weak-updates/xfs/xfs.ko -> /lib/modules/2.6.18-92.1.13.el5/extra/xfs/xfs.ko
Hope it helps. |
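With several `xfs.ko` copies in play, it can help to label each path by where it lives. The sketch below is an illustration only: the function name `classify_xfs_modules` is hypothetical, and the meaning attached to each directory is an assumption drawn from the listing in this note.

```shell
# Hypothetical helper: label each xfs.ko path read from stdin by where it
# lives under /lib/modules. The directory meanings are assumptions based on
# the listing above: kernel/ = shipped with that kernel, extra/ = installed
# by a kmod package, weak-updates/ = symlink to another kernel's kmod build.
classify_xfs_modules() {
    while IFS= read -r path; do
        case "$path" in
            */kernel/fs/xfs/xfs.ko) kind=in-kernel ;;
            */weak-updates/*)       kind=weak-update ;;
            */extra/*)              kind=kmod ;;
            *)                      kind=unknown ;;
        esac
        printf '%s\t%s\n' "$kind" "$path"
    done
}
```

A typical call would be `find /lib/modules -name xfs.ko | classify_xfs_modules`. The labels only describe location; they do not say which copy a given kernel will actually load.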
|
toracat (manager) 2009-12-23 15:35 |
Possibly related to https://bugzilla.redhat.com/show_bug.cgi?id=512552 . You may want to try dzickus' test kernel referenced in that bugzilla: http://people.redhat.com/dzickus/el5/ |
|
toracat (manager) 2009-12-23 15:48 |
If that test kernel fixes the problem you are seeing and if the patch is not going to be included in the upstream kernel for a while, we can provide it in the centosplus kernel. |
|
antonl (reporter) 2009-12-23 16:03 |
I just installed it. Do you want me to remove the XFS module?
[root@athena ~]# ls -l `find /lib/modules -name xfs.ko`
-rwxr--r-- 1 root root 694832 Dec 15 21:55 /lib/modules/2.6.18-182.el5/kernel/fs/xfs/xfs.ko
lrwxrwxrwx 1 root root 48 Dec 23 10:57 /lib/modules/2.6.18-182.el5/weak-updates/xfs/xfs.ko -> /lib/modules/2.6.18-92.1.13.el5/extra/xfs/xfs.ko
-rw-r--r-- 1 root root 697232 Oct 3 2008 /lib/modules/2.6.18-92.1.13.el5/extra/xfs/xfs.ko
lrwxrwxrwx 1 root root 48 Mar 12 2009 /lib/modules/2.6.18-92.1.22.el5.centos.plus/weak-updates/xfs/xfs.ko -> /lib/modules/2.6.18-92.1.13.el5/extra/xfs/xfs.ko
lrwxrwxrwx 1 root root 48 Mar 12 2009 /lib/modules/2.6.18-92.el5/weak-updates/xfs/xfs.ko -> /lib/modules/2.6.18-92.1.13.el5/extra/xfs/xfs.ko |
|
toracat (manager) 2009-12-23 16:16 |
You can leave it for now. The test kernel should be using the in-kernel xfs module. Check with: /sbin/modinfo xfs |
|
antonl (reporter) 2009-12-23 16:20 |
[root@athena ~]# modinfo xfs
filename: /lib/modules/2.6.18-182.el5/kernel/fs/xfs/xfs.ko
license: GPL
description: SGI XFS with ACLs, security attributes, large block/inode numbers, no debug enabled
author: Silicon Graphics, Inc.
srcversion: DE0AE7E45DF5E1EA03F6EC6
depends:
vermagic: 2.6.18-182.el5 SMP mod_unload gcc-4.1
module_sig: 883f3504b284765a0558d65b054fb5711254140a09b3827f295ecd199b12d268c9b24353b74efdb0a09270ef3f8498244c1fb5ef669ea5d17058b4a
I will let you know if the issue is resolved in a couple of days. |
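The field that matters in this output is `filename`, since it shows whether the in-kernel module or the old kmod copy is being resolved. As a small sketch, assuming the usual `key: value` layout that modinfo prints, a single field can be pulled out like this; the function name `modinfo_field` is hypothetical.

```shell
# Hypothetical helper: print the value of one field from modinfo-style
# "key: value" text read from stdin (first match only).
modinfo_field() {
    awk -v key="$1" '
        index($0, key ":") == 1 {
            value = substr($0, length(key) + 2)
            sub(/^[ \t]+/, "", value)
            print value
            exit
        }'
}
```

For example, `/sbin/modinfo xfs | modinfo_field filename` should print just the module path. Where the installed modinfo supports it, `modinfo -F filename xfs` gives the same field directly.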
|
toracat (manager) 2009-12-23 16:32 |
Hope that fixes the problem. |
|
antonl (reporter) 2009-12-25 23:02 |
I consider the problem fixed. At least, I did not see a crash even during an 8-hour upload. If the problem returns, I will report it here. The current kernel is build 182 from http://people.redhat.com/dzickus/el5/ |
|
toracat (manager) 2009-12-26 01:49 |
Good news. Keep us posted. I will see if I can get the patch(es) into the centosplus kernel. |
|
toracat (manager) 2009-12-26 12:10 |
It is this patch that fixes the issue (it appeared in test kernel -179 and newer):
2312 Dec 10 17:56 linux-2.6-md-raid5-mark-cancelled-readahead-bios-with-eio.patch
From: Eric Sandeen <sandeen@redhat.com>
Date: Tue, 1 Dec 2009 23:24:14 -0500
Subject: [md] raid5: mark cancelled readahead bios with -EIO
Message-id: <4B15A59E.6040602@redhat.com>
Patchwork-id: 21627
O-Subject: [PATCH RHEL5.5] md raid5: mark cancelled readahead bios with -EIO error
Bugzilla: 512552
RH-Acked-by: Doug Ledford <dledford@redhat.com>
This is for bug 512552 - Can't write to XFS mount during raid5 resync |
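One possible way to tell whether an installed kernel already carries this patch is to search its package changelog for the Bugzilla number. This is a sketch under an assumption the thread does not confirm, namely that the kernel package changelog cites Bugzilla IDs; the function name `changelog_mentions_bz` is hypothetical.

```shell
# Hypothetical helper: succeed if changelog text on stdin references the
# given Bugzilla ID as a whole number (not as part of a longer number).
changelog_mentions_bz() {
    grep -Eq "(^|[^0-9])$1([^0-9]|\$)"
}
```

It could be used as `rpm -q --changelog kernel | changelog_mentions_bz 512552 && echo "fix present"`. A miss does not prove the patch is absent, only that the changelog does not mention the ID.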
|
toracat (manager) 2009-12-28 16:16 |
The patch referenced in note 10611 will be added to the next centosplus update if: (1) it fixes the issue reported here. (2) it does not appear in the distro kernel until 5.5. I have built the cplus kernel with the patch and made it available from: http://centos.toracat.org/kernel/centos5/centosplus-testing/x86_64/ The name is kernel-2.6.18-164.9.1.kvmmd.el5.ayplus . It was built on top of the previous test cplus kernel with kvm fixes (see bug #4058). So, please test if you can. |
|
antonl (reporter) 2009-12-28 17:54 |
I am going to install your kernel on a computer with almost identical configuration (it was not upgraded yet due to the failure of the first one). Let us give it a week for testing. |
|
antonl (reporter) 2009-12-28 18:12 |
[root@vstorage ~]# uname -a Linux vstorage 2.6.18-164.9.1.kvmmd.el5.ayplus #1 SMP Sat Dec 26 12:28:00 PST 2009 x86_64 x86_64 x86_64 GNU/Linux |
|
toracat (manager) 2009-12-28 19:00 |
@antonl, Thanks for testing. We look forward to seeing the result in a week or so. Hope that is all the fix needed. |
|
antonl (reporter) 2010-01-09 03:44 |
I consider the issue resolved. Since my last note both servers worked fine without any problems reported. Thank you toracat for the quick resolution! |
|
toracat (manager) 2010-01-09 04:53 |
Thanks for the good news. As if we were waiting for this moment ... the patch is now included in the centosplus kernel update released today ( kernel-2.6.18-164.10.1.el5.centos.plus ). The cplus kernel will continue to provide the fix until the patch finally appears in the distro kernel (possibly in CentOS 5.5). |
|
toracat (manager) 2010-01-14 00:33 |
Changing the status to "resolved". |
Issue History |
| Date Modified | Username | Field | Change |
|---|---|---|---|
| 2009-12-22 22:36 | antonl | New Issue | |
| 2009-12-23 12:47 | antonl | Note Added: 0010583 | |
| 2009-12-23 13:06 | toracat | Note Added: 0010584 | |
| 2009-12-23 13:52 | antonl | Note Added: 0010585 | |
| 2009-12-23 15:35 | toracat | Note Added: 0010586 | |
| 2009-12-23 15:41 | toracat | Status | new => acknowledged |
| 2009-12-23 15:41 | toracat | Category | CentOS-5-Plus => kernel |
| 2009-12-23 15:48 | toracat | Note Added: 0010587 | |
| 2009-12-23 16:03 | antonl | Note Added: 0010588 | |
| 2009-12-23 16:16 | toracat | Note Added: 0010589 | |
| 2009-12-23 16:20 | antonl | Note Added: 0010590 | |
| 2009-12-23 16:32 | toracat | Note Added: 0010591 | |
| 2009-12-25 23:02 | antonl | Note Added: 0010609 | |
| 2009-12-26 01:49 | toracat | Note Added: 0010610 | |
| 2009-12-26 12:10 | toracat | Note Added: 0010611 | |
| 2009-12-28 16:16 | toracat | Note Added: 0010617 | |
| 2009-12-28 17:54 | antonl | Note Added: 0010619 | |
| 2009-12-28 18:12 | antonl | Note Added: 0010620 | |
| 2009-12-28 19:00 | toracat | Note Added: 0010621 | |
| 2010-01-09 03:44 | antonl | Note Added: 0010726 | |
| 2010-01-09 04:53 | toracat | Note Added: 0010727 | |
| 2010-01-14 00:33 | toracat | Note Added: 0010767 | |
| 2010-01-14 00:33 | toracat | Status | acknowledged => resolved |
| 2010-01-14 00:33 | toracat | Resolution | open => fixed |


