CentOS Bug Tracker

Viewing Issue Simple Details
ID: 0003869
Category: [CentOS-5] kernel-PAE
Severity: crash
Reproducibility: have not tried
Date Submitted: 2009-09-28 14:34
Last Update: 2010-02-04 21:13
Reporter: TrevorH
View Status: public
Assigned To:
Priority: normal
Resolution: fixed
Status: resolved
Product Version: 5.3
Summary: 0003869: kernel panic at kmap_atomic+0x72/0xbb
Description: Upgraded to kernel-PAE-2.6.18-164.el5 last week. Got a nasty kernel panic this week, first in 2 years on same hardware.

esi: 0000000f edi: f76f2040 ebp: c0014c88 esp: e2103e10
ds: 007b es: 007b ss: 0068
Process httpd (pid: 19860, ti=e2103000 task.ti=e2103000)
Call Trace:
 [<c0461457>] __handle_mm_fault+0x103/0xcfe
 [<c045fe50>] do_wp_page+0x60f/0x665
 [<c05b0436>] do_sock_read+0xbe/0xf7
 [<c04c3d94>] avc_has_perm+0x3a/0x44
 [<c0461fa4>] __handle_mm_fault+0xc50/0xcfe
 [<c06184c5>] do_page_fault+0x2d9/0x607
 [<c06181ec>] do_page_fault+0x0/0x607
 [<c0405a89>] error_code+0x39/0x40
 [<c0497ba6>] sys_epoll_wait+0x257/0x38d
 [<c041e847>] default_wake_function+0x0/0xc
 [<c0404f17>] syscall_call+0x7/0xb
 =======================
Code: ff ff 6b 50 10 1b c7 04 24 00 f0 ff ff 8d 14 16 8d 42 44 c1 e2 03 c1 e2 03 c1 e0 0c 29 d5 29 04 24 83 7d 00 00 8b 45 04 75 04 85 c0 74 08 <0f> 0b 2b 00 a9 73 63 c0 89 c8 8b 35 38 f2 78 c0 2b 05 10 cf 7b
EIP: [<c041cca9>] kmap_atomic+0x72/0xbb SS:ESP: 0068:e2103e10
 <0> Kernel panic - not syncing: Fatal exception

Transcribed so Google can find it, but an image is attached in case I made transcription errors above!
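
For anyone comparing their own panic against the transcription above, a small parser can pull the symbol, offset and function size out of each `[<address>] symbol+0xoff/0xsize` line. This is only a sketch: the names `Frame`, `parse_frame` and `parse_trace` are made up for illustration, and the pattern assumes the 2.6.18-era i386 oops layout shown in this report.

```python
import re
from typing import List, NamedTuple, Optional


class Frame(NamedTuple):
    """One decoded oops line (illustrative type, not part of any tool)."""
    address: int   # absolute kernel text address
    symbol: str    # function name
    offset: int    # offset into the function
    size: int      # total size of the function


# Matches e.g. "[<c041cca9>] kmap_atomic+0x72/0xbb"
FRAME_RE = re.compile(
    r"\[<([0-9a-f]+)>\]\s+([A-Za-z_][\w.]*)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)


def parse_frame(line: str) -> Optional[Frame]:
    """Return the frame found on this line, or None if there is none."""
    match = FRAME_RE.search(line)
    if match is None:
        return None
    address, symbol, offset, size = match.groups()
    return Frame(int(address, 16), symbol, int(offset, 16), int(size, 16))


def parse_trace(text: str) -> List[Frame]:
    """Decode every line of an oops that carries a frame, in order."""
    frames = (parse_frame(line) for line in text.splitlines())
    return [frame for frame in frames if frame is not None]
```

Feeding it the `EIP:` line from this report yields the symbol `kmap_atomic` with offset 0x72 and size 0xbb, which makes it easier to check whether two panics stopped at the same place.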
Additional Information: Have left 2.6.18-164 in for now to see if the same problem recurs.

Selinux is in permissive mode on this server.

Hardware: Dell PE 860 with dual core Xeon 3050 @ 2.13GHz, 4GB RAM.
Tags: No tags attached.
Attached Files:
kernel-panic.png (100,128 bytes) 2009-09-28 14:34
Untitled.png (187,839 bytes) 2009-10-22 09:00
panic.JPG (92,834 bytes) 2009-12-04 23:40

-  Notes
(0010103)
highking (reporter)
2009-10-22 08:56

We seem to have the same problem with kernel-2.6.18-164.2.1.el5 (no PAE) on two virtual machines on our VMware ESX-platform.

Those machines never had any problems before the updates we installed last week (before they were running kernel-2.6.18-128.el5).
(0010104)
TrevorH (reporter)
2009-10-22 09:06

I have also had a 2nd occurrence of the same crash and it seems to be triggered by the same thing. It panics a minute or two after my cron job to back up our 40GB Cyrus IMAP mail store starts. This uses LVM to create a snapshot and then backs up the snapshot. Not sure if this panic is to do with the snapshotting or with lots of access to files.
(0010105)
highking (reporter)
2009-10-22 09:16

Oops... while about 98% of our Linux servers run CentOS, the two servers on which we have seen those panics are exactly the ones that run RHEL...

I think we need to bother Red Hat with this issue instead of the CentOS guys. Sorry...
(0010325)
jfchevrette (reporter)
2009-11-09 15:38

We have the same problem with VMware ESX 3.5 VMs running kernel-2.6.18-164.6.1.el5.

VMs crash when I/O-intensive tasks are run. In our case, this occurs when a backup snapshot is made and sent to a remote server using CDP technology and a kernel module from a product called R1Soft.

I checked for an upstream bug report but didn't find any.

Anyone?
(0010388)
chenull (reporter)
2009-11-20 17:21

Same issue here, CentOS 2.6.18-164.el5PAE
CPU: Intel(R) Xeon(R) CPU E5430 @ 2.66GHz

I have 12 blades running the same hardware (Supermicro superblades, same RAM) + same OS + same software (some cPanel, some OpenVZ). This issue only happens on two machines. On one machine, it happens nearly once every 3 days. The 10 other machines are just fine, no kernel picnic, eh, panic :D

CPU temperature + voltage seem normal
(0010389)
toracat (developer)
2009-11-20 17:45

>Oops... while about 98% of our Linux servers run CentOS, exactly the two servers
>on which we have seen those panics run RHEL...
>Think we need to bother Red Hat with this issue instead of the CentOS guys.
>Sorry...

Could you please file a report at http://bugzilla.redhat.com ? If this has already been done, providing a link here would be helpful.
(0010411)
TrevorH (reporter)
2009-11-23 13:32

I could not find an existing bug upstream but that doesn't mean there isn't one!

I am not sure if this problem is due to the use of LVM snapshots or large amounts of I/O on the server so it would be helpful if those also suffering the crash could eliminate one or other or both of these from the equation.
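
One way to separate the two suspects might be to generate a lot of file reads without taking any snapshot and see whether the panic still occurs. The sketch below does only that: it walks a directory tree and reads every regular file. The function name `read_tree` and the chunk size are illustrative choices, and it assumes that plain sequential reads are a fair stand-in for the backup's I/O, which may not hold.

```python
import os
from typing import Tuple


def read_tree(root: str, chunk_size: int = 1 << 20) -> Tuple[int, int]:
    """Read every regular file under root; return (files_read, bytes_read).

    Symlinks and special files (FIFOs, devices, sockets) are skipped so the
    walk cannot block or loop. Unreadable files are skipped silently.
    """
    files_read = 0
    bytes_read = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as handle:
                    while True:
                        chunk = handle.read(chunk_size)
                        if not chunk:
                            break
                        bytes_read += len(chunk)
            except OSError:
                continue
            files_read += 1
    return files_read, bytes_read
```

Pointing `read_tree` at the live mail store (not at a snapshot) exercises the file-access path on its own. Files already in the page cache will not reach the disk, so a second run shortly after the first is a weaker test.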
(0010412)
amf (reporter)
2009-11-23 13:39

We have seen something very similar to this, with the R1Soft product, as mentioned in #0010325.

I would suggest, from the reading I've done, it's not LVM specific.

You may want to read through the following thread on the R1Soft forums: http://tinyurl.com/yehncfq

Making the changes to the timeouts as suggested has mitigated the problem for us.
(0010419)
gvard (reporter)
2009-11-25 18:28

Hello,

The timeout settings mentioned in the R1Soft forum didn't help me; they made the problem disappear for about a week and then I had another kernel panic. Unfortunately I had to downgrade to 2.6.18-127, as mentioned in the R1Soft forum, until this problem goes away.
(0010438)
amf (reporter)
2009-12-02 11:03

Same here actually, machines which were okay with the workaround settings are now beginning to show recurrences of the problems.

Does anyone know if there's a RH BZ for this, or know if RH are aware?
(0010460)
minadreapta (reporter)
2009-12-04 23:45

i added a picture too, maybe it helps.
(0010479)
minadreapta (reporter)
2009-12-08 23:44

did anyone find a solution to this? it's happening to some machines every 2 or 3 days now, and it's very annoying.
thanks.
(0010481)
highking (reporter)
2009-12-09 08:21

@toracat: I have not filed a bug, but did open a support ticket and sent RH a crashdump. Yesterday I received a test kernel which should resolve this issue.
I have installed that kernel on 2 machines that panicked quite often, but this was yesterday... can't say if it really did the trick after just one day.

(kernel-2.6.18-175.el5.bz541956.i686.rpm, but I don't think RH would like it if I paste the full download url here... ;-))
(0010482)
amf (reporter)
2009-12-09 11:19

@highking: thanks for the feedback from your experiences. If this works out for you, would you be able to ask RH when the fix is expected to make it into the official kernel?
(0010496)
highking (reporter)
2009-12-10 15:08

I guess if it really fixes the issue it will be in an updated kernel anyway. I will ask though.

Both machines running this kernel have not panicked for 2 days now, but the panics only happened once every one or two weeks, so it's hard to tell yet if the issue really is fixed.
(0010498)
toracat (developer)
2009-12-10 15:57

@highking,

Thanks for opening a ticket with RH and reporting back with the response. Hope the fix gets incorporated in a future kernel update soon.
(0010499)
minadreapta (reporter)
2009-12-10 16:33

on one of my CentOS 5.4 machines (2.6.18-164.6.1.el5PAE) it happens every 3 or 4 days, sometimes 2 days in a row...
on 2 other machines with the same OS and kernel it happens every week or two.
(0010526)
highking (reporter)
2009-12-14 12:33

Just got this update to my ticket (and my question whether or not this kernel will be supplied as an update):

As I mentioned earlier provided kernel package is a test and will be replaced by a Errata kernel with release of Red Hat Enterprise Linux 5.5 release. I would also like to mention that this fix is also under review for Red Hat Enterprise Linux 5.4.z which means it can be provided by a minor update.
(0010548)
minadreapta (reporter)
2009-12-18 09:16

i've just upgraded the kernel to 2.6.18-164.9.1.el5

is anyone using this kernel still having problems?
(0010596)
toracat (developer)
2009-12-24 16:18

People who are experiencing the problem should try upstream's test kernel -178 or newer. See:

https://bugzilla.redhat.com/show_bug.cgi?id=541956

for details. Actually, the latest test kernel is at:

http://people.redhat.com/jwilson/el5/
(0010691)
gvard (reporter)
2010-01-06 16:14

Greetings from Greece,

Has anyone tried the kernel toracat suggested and seen if the problem is solved?
(0010892)
jfchevrette (reporter)
2010-01-29 15:52

It looks like the latest kernel has the patch that fixes this issue

kernel-2.6.18-164.11.1.el5

- [mm] call vfs_check_frozen after unlocking the spinlock (Amerigo Wang) [548370 541956]

We have been running this kernel for the past few days and no problems so far.
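
To check a fleet of machines against that build, the release string can be compared numerically with 2.6.18-164.11.1.el5. The following is a rough sketch, not rpm's own version comparison: `build_key` and `at_or_after_fixed_build` are illustrative names, and the pattern assumes EL-style release strings such as those quoted in this report.

```python
import re
from typing import Tuple

# Build reported in this thread as carrying the fix.
FIXED_BUILD = "2.6.18-164.11.1.el5"

# "<version>-<build>.elN", e.g. "2.6.18-164.6.1.el5PAE"
_RELEASE_RE = re.compile(r"(\d+(?:\.\d+)*)-(\d+(?:\.\d+)*)\.el\d+")

BuildKey = Tuple[Tuple[int, ...], Tuple[int, ...]]


def build_key(release: str) -> BuildKey:
    """Turn a kernel release or package name into a sortable key."""
    match = _RELEASE_RE.search(release)
    if match is None:
        raise ValueError("not an EL-style kernel release: %r" % release)
    version, build = match.groups()
    return (
        tuple(int(part) for part in version.split(".")),
        tuple(int(part) for part in build.split(".")),
    )


def at_or_after_fixed_build(release: str) -> bool:
    """True if release sorts at or after FIXED_BUILD."""
    return build_key(release) >= build_key(FIXED_BUILD)
```

On a live system the string could come from `platform.release()`. The check only orders builds: an older build sorting before the fixed one does not prove it is affected, since machines on kernel-2.6.18-128.el5 were reported as fine earlier in this thread.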
(0010894)
toracat (developer)
2010-01-29 16:11

To other people experiencing this issue, could you try the latest kernel and see if that solves the problem for you?
(0010895)
highking (reporter)
2010-01-29 18:19

Well, in the support ticket I had at RH they told me that 2.6.18-164.11.1 should indeed solve the issue. I didn't have time to test it on the machines that had this problem in the first place though.
(0010896)
minadreapta (reporter)
2010-01-29 18:36

i've been using 2.6.18-164.11.1 since Wed Jan 20 08:16:13 EST 2010 and no problems so far (for the last 9 days).
(0010897)
toracat (developer)
2010-01-29 18:56

I think we have enough "evidence" that the fix is indeed in kernel-2.6.18-164.11.1.el5, so we can mark this ticket "resolved". I am going to leave this open for now just in case people want to add more notes.
(0010931)
toracat (developer)
2010-02-04 21:13

Closing as "resolved". I really wanted to hear from the OP. Thanks everyone for helping/reporting.

- Issue History
Date Modified Username Field Change
2009-09-28 14:34 TrevorH New Issue
2009-09-28 14:34 TrevorH Assigned To => kbsingh@karan.org
2009-09-28 14:34 TrevorH File Added: kernel-panic.png
2009-10-22 08:56 highking Note Added: 0010103
2009-10-22 09:00 highking File Added: Untitled.png
2009-10-22 09:06 TrevorH Note Added: 0010104
2009-10-22 09:16 highking Note Added: 0010105
2009-11-09 15:38 jfchevrette Note Added: 0010325
2009-11-09 20:19 amf Issue Monitored: amf
2009-11-20 17:21 chenull Note Added: 0010388
2009-11-20 17:22 chenull Issue Monitored: chenull
2009-11-20 17:45 toracat Note Added: 0010389
2009-11-23 13:32 TrevorH Note Added: 0010411
2009-11-23 13:39 amf Note Added: 0010412
2009-11-25 18:28 gvard Note Added: 0010419
2009-11-25 18:33 gvard Issue Monitored: gvard
2009-12-02 11:03 amf Note Added: 0010438
2009-12-04 23:40 minadreapta File Added: panic.JPG
2009-12-04 23:44 minadreapta Issue Monitored: minadreapta
2009-12-04 23:45 minadreapta Note Added: 0010460
2009-12-08 23:44 minadreapta Note Added: 0010479
2009-12-09 08:21 highking Note Added: 0010481
2009-12-09 11:19 amf Note Added: 0010482
2009-12-10 15:08 highking Note Added: 0010496
2009-12-10 15:57 toracat Note Added: 0010498
2009-12-10 16:33 minadreapta Note Added: 0010499
2009-12-14 12:33 highking Note Added: 0010526
2009-12-18 09:16 minadreapta Note Added: 0010548
2009-12-24 16:18 toracat Note Added: 0010596
2009-12-24 16:21 toracat Assigned To kbsingh@karan.org => toracat
2009-12-24 16:21 toracat Status new => acknowledged
2010-01-06 16:14 gvard Note Added: 0010691
2010-01-29 15:52 jfchevrette Note Added: 0010892
2010-01-29 16:11 toracat Note Added: 0010894
2010-01-29 16:11 toracat Status acknowledged => feedback
2010-01-29 18:19 highking Note Added: 0010895
2010-01-29 18:36 minadreapta Note Added: 0010896
2010-01-29 18:56 toracat Note Added: 0010897
2010-01-29 18:56 toracat Status feedback => confirmed
2010-01-29 18:56 toracat Resolution open => fixed
2010-01-29 18:56 toracat Fixed in Version => 5.4
2010-02-04 21:13 toracat Note Added: 0010931
2010-02-04 21:13 toracat Status confirmed => resolved

