View Issue Details

ID: 0009209
Project: CentOS-6
Category: kernel
View Status: public
Last Update: 2018-03-25 16:11
Reporter: c64whiz
Priority: high
Severity: crash
Reproducibility: always
Status: resolved
Resolution: fixed
Product Version:
Target Version:
Fixed in Version: 6.8
Summary: 0009209: Kernel Panic at boot after 6.6 Final upgrade to 6.7
Description: For a few days, my 6.6 (Final) system had been telling me there were several updates to perform, which led me to believe it was a version upgrade (6.6 to 6.7), which it was. I upgraded via "yum update", and during the process I was warned that the new "/etc/sysctl.conf" had been installed as "/etc/sysctl.conf.rpmnew".

After the update, I "reboot"ed and the system crashed just after the grub menu stating "BUG: unable to handle kernel NULL pointer dereference at 00000018" (full error screen can be seen in attached photo.)

The kernel version that was installed is "2.6.32-573.1.1.el6.i686". I *am* successfully able to boot the system using the previous kernel "2.6.32-504.30.3.el6.i686".
Steps To Reproduce: Nothing special needed; just reboot the machine and do not select a different kernel from the grub menu.
Additional Information: I've inspected the /etc/sysctl.conf file, and the only differences from my earlier version were some IPv6 settings; nothing else changed.

Tags: No tags attached.

Relationships

has duplicate | 0009374 | closed | Issue Tracker | Kernel Panic at boot after 6.6 Final upgrade to 6.7

Activities

c64whiz

2015-08-10 00:15

reporter  

IMG_20150809_165440.jpg (894,920 bytes)
Soruk42

2015-08-10 21:02

reporter  

oops-2.6.32-573.1.1.jpg (105,552 bytes)
Soruk42

2015-08-10 21:05

reporter   ~0023835

Added file "oops-2.6.32-573.1.1.jpg".

I'm seeing this issue too. And, like the OP, I can boot my system with my previous kernel - 2.6.32-504.23.4 - in my case I'm running x86-64 whereas the OP is running i686.

(Apologies for the non-standard font, used an 8x8 font to get the info that was otherwise scrolling off screen.)
Soruk42

2015-08-10 21:17

reporter   ~0023836

Breakthrough - adding "i915.enabled_execlists=0" to my kernel boot parameters is allowing my system to boot with my new kernel.

Tip found at the bottom of this page https://wiki.archlinux.org/index.php/Intel_graphics - was this backported?
aguimont

2015-08-10 21:19

reporter   ~0023837

I also ran into this issue and found that 2.6.32-573.1.1 was only partially installed (I was missing /boot/initramfs-2.6.32-573.1.1.el6.x86_64.img)

After running "yum reinstall kernel" I was able to boot into 2.6.32-573.1.1.
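
A partially installed kernel like this can be spotted before rebooting. The following is a minimal Python sketch, assuming the EL6 naming seen above (a `vmlinuz-<version>` file paired with `initramfs-<version>.img`). The helper name `kernels_missing_initramfs` is hypothetical, and it takes a plain list of file names so it can be tried without touching /boot.

```python
def kernels_missing_initramfs(boot_files):
    """Return kernel versions that have a vmlinuz-* file but no
    matching initramfs-<version>.img in the same listing."""
    names = set(boot_files)
    missing = []
    for name in sorted(names):
        if name.startswith("vmlinuz-"):
            version = name[len("vmlinuz-"):]
            if f"initramfs-{version}.img" not in names:
                missing.append(version)
    return missing


# Example listing modelled on the situation described in this note.
listing = [
    "vmlinuz-2.6.32-504.23.4.el6.x86_64",
    "initramfs-2.6.32-504.23.4.el6.x86_64.img",
    "vmlinuz-2.6.32-573.1.1.el6.x86_64",
]
print(kernels_missing_initramfs(listing))
```

On a real machine the list would come from something like `os.listdir("/boot")`. Any version it reports is a candidate for "yum reinstall kernel".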
Soruk42

2015-08-10 21:53

reporter   ~0023838

And it gets more bizarre. With that parameter, my system boots but dmesg shows the i915 driver complaining it's an unknown parameter, and without it it crashes.

UPDATE: That's because according to modinfo, it's i915.enable_execlists - and with the correctly entered parameter, it crashes.

Maybe it's the side-effect of the broken parameter causing the module to not load, thus allowing the system to boot.

UPDATE 2: I've rebuilt my initramfs to remove all references to i915 and blacklist it in /etc/modprobe.d - but for quick workarounds putting a duff i915 entry in the kernel boot parameters seems to be enough to stop the module from loading (and crashing the machine).
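
To see which `i915.*` entries on a command line are misspelled (and so, per the behaviour reported above, would stop the module from loading), a rough Python sketch follows. The helper `unknown_module_params` is hypothetical, and the set of valid names has to be supplied by the caller, for example from what modinfo reports. Only `enable_execlists` is taken from this thread.

```python
def unknown_module_params(cmdline, module, known_params):
    """Return names of '<module>.<name>=...' options on a kernel command
    line whose <name> is not in known_params."""
    prefix = module + "."
    unknown = []
    for token in cmdline.split():
        if token.startswith(prefix):
            name = token[len(prefix):].split("=", 1)[0]
            # The kernel treats '-' and '_' in parameter names alike.
            if name.replace("-", "_") not in known_params:
                unknown.append(name)
    return unknown


cmdline = "ro nomodeset i915.enabled_execlists=0 rhgb quiet"
print(unknown_module_params(cmdline, "i915", {"enable_execlists"}))
```

This only compares names; it does not load anything or predict whether the driver itself will crash.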
pete@ucwh.net

2015-08-13 03:11

reporter   ~0023861

Soruk42's solution given at https://bugs.centos.org/view.php?id=9209#c23836 also works for i686 machines.

I used this kernel command line in grub.conf:

ro root=/dev/mapper/vg_mpilo-lv_root nomodeset rd_NO_LUKS LANG=en_US.UTF-8 rd_LVM_LV=vg_mpilo/lv_root rd_NO_MD SYSFONT=latarcyrheb-sun16 rd_NO_DM KEYBOARDTYPE=pc KEYTABLE=us i915.enabled_execlists=0 rd_LVM_LV=vg_mpilo/lv_swap rhgb quiet

and the machine boots fine. Machine/OS are:

[me@mpilo ~]$ uname -a
Linux mpilo.ucwh.net 2.6.32-573.1.1.el6.i686 #1 SMP Sat Jul 25 14:00:46 UTC 2015 i686 i686 i386 GNU/Linux
Soruk42

2015-08-13 13:30

reporter   ~0023867

It's not that config option as such that's fixing it - as my follow-up message shows that option name is actually wrong, so this unrecognised option is causing the i915 module to fail to load, and by not loading it doesn't then crash your machine.

I've changed mine to the rather tongue-in-cheek setting of "i915.crash_on_boot=0". :-)
peterqz

2015-08-13 18:59

reporter   ~0023876

Same issue here.
CentOS 6.7, 32-bit.
Kernel 2.6.32-573.1.1.
It was unable to boot successfully. It only boots once i915.enabled_execlists=0 is added to the kernel boot parameters, at which point dmesg reports "enabled_execlists=0" as an unknown parameter.

Reinstalling the kernel doesn't work for me.

In the config-2.6.32-573.1.1.el6 file I found:
> # CONFIG_DRM_GMA500 is not set (maybe it's not built into the kernel)

Also, lsmod on kernel 2.6.32-573.1.1 shows that i915 is not present, while on kernel 2.6.32-504.30.3 it is.

Maybe this issue should be reported to Bugzilla.

I've attached the differences between the kernel configs.
peterqz

2015-08-13 18:59

reporter  

differents.txt (2,048 bytes)
3,4c3,4
< # Linux kernel version: 2.6.32-504.30.3.el6.i686
< # Wed Jul 15 10:50:51 2015
---
> # Linux kernel version: 2.6.32-573.1.1.el6.i686
> # Sat Jul 25 13:55:24 2015
14a15
> CONFIG_CLOCKSOURCE_VALIDATE_LAST_CYCLE=y
247a249
> CONFIG_GENERIC_CLOCKEVENTS_MIN_ADJUST=y
1133a1136
> CONFIG_WIRELESS_EXT_SYSFS=y
1143c1146
< # CONFIG_MAC80211_RC_DEFAULT_PID is not set
---
> # CONFIG_MAC80211_RC_MINSTREL_VHT is not set
1647a1651
> CONFIG_DM_SWITCH=m
1782d1785
< CONFIG_FORCEDETH_NAPI=y
1826d1828
< CONFIG_R8169_VLAN=y
1846d1847
< CONFIG_CHELSIO_T3_DEPENDS=y
1848d1848
< CONFIG_CHELSIO_T4_DEPENDS=y
1849a1850
> CONFIG_CHELSIO_T4VF=m
1860c1861,1862
< # CONFIG_I40E_DCB is not set
---
> CONFIG_I40E_DCB=y
> # CONFIG_I40E_FCOE is not set
1935c1937,1941
< # CONFIG_ATH9K_LEGACY_RATE_CONTROL is not set
---
> # CONFIG_ATH9K_DYNACK is not set
> CONFIG_ATH9K_WOW=y
> CONFIG_ATH9K_RFKILL=y
> # CONFIG_ATH9K_CHANNEL_CONTEXT is not set
> CONFIG_ATH9K_PCOEM=y
1953a1960
> CONFIG_IWLWIFI_LEDS=y
1956a1964,1965
> # CONFIG_IWLWIFI_BCAST_FILTERING is not set
> # CONFIG_IWLWIFI_UAPSD is not set
1963d1971
< # CONFIG_IWLWIFI_P2P is not set
1980d1987
< # CONFIG_B43_BCMA_EXTRA is not set
1981a1989,1991
> CONFIG_B43_BUSES_BCMA_AND_SSB=y
> # CONFIG_B43_BUSES_BCMA is not set
> # CONFIG_B43_BUSES_SSB is not set
1987a1998
> CONFIG_B43_PHY_G=y
2008a2020
> # CONFIG_WL18XX is not set
2031a2044
> CONFIG_RT2800USB_RT3573=y
2035a2049
> CONFIG_RT2800_LIB_MMIO=m
2481c2495
< CONFIG_SERIAL_8250_NR_UARTS=32
---
> CONFIG_SERIAL_8250_NR_UARTS=64
2854a2869
> # CONFIG_SSB_DRIVER_GPIO is not set
2863a2879
> # CONFIG_BCMA_HOST_SOC is not set
2874a2891
> CONFIG_MFD_RTSX_USB=m
3325a3343
> # CONFIG_DRM_GMA500 is not set
3594a3613
> CONFIG_SND_HDA_I915=y
3635a3655,3656
> CONFIG_SND_USB_HIFACE=m
> CONFIG_SND_BCD2000=m
3900a3922
> CONFIG_MMC_REALTEK_USB=m
4557a4580,4581
> # CONFIG_PRIO_TREE_TEST is not set
> # CONFIG_INTERVAL_TREE_TEST is not set
4571a4596
> # CONFIG_TEST_STRING_HELPERS is not set
4785a4811
> # CONFIG_CRC32_SELFTEST is not set
4802a4829
> CONFIG_INTERVAL_TREE=y
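
A line-based diff like the one above shifts with line numbers. Comparing the two config files option by option can be easier to read. The following is a minimal Python sketch, assuming the usual kernel config syntax visible in the attachment (`CONFIG_X=value` and `# CONFIG_X is not set`). The function names and the "not set" / "absent" labels are illustrative choices, not part of any tool.

```python
import re

_NOT_SET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_kernel_config(text):
    """Map each option name to its value, or to 'not set'."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        match = _NOT_SET.match(line)
        if match:
            options[match.group(1)] = "not set"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def config_changes(old_text, new_text):
    """Return {option: (old_state, new_state)} for options that differ."""
    old = parse_kernel_config(old_text)
    new = parse_kernel_config(new_text)
    changes = {}
    for name in sorted(set(old) | set(new)):
        before = old.get(name, "absent")
        after = new.get(name, "absent")
        if before != after:
            changes[name] = (before, after)
    return changes


# A few lines taken from the attachment above.
old_cfg = "CONFIG_SERIAL_8250_NR_UARTS=32\n# CONFIG_I40E_DCB is not set\nCONFIG_R8169_VLAN=y\n"
new_cfg = "CONFIG_SERIAL_8250_NR_UARTS=64\nCONFIG_I40E_DCB=y\n# CONFIG_DRM_GMA500 is not set\n"
for option, (before, after) in config_changes(old_cfg, new_cfg).items():
    print(option, before, "->", after)
```

On a real system the two texts would be read from the config-* files under /boot for the two kernel versions.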
Soruk42

2015-08-13 20:52

reporter   ~0023877

@peterqz The reason i915 isn't present when you've booted the new kernel with that parameter is, since that parameter is actually incorrect, the module fails to load with that unknown parameter error. By not loading, it's thus not crashing your machine (hence my rather tongue-in-cheek entry above yours).
dead_paulie

2015-08-14 14:03

reporter   ~0023889

Seems to be a problem with 2.6.32-573.3.1, too.
divaddrof

2015-08-15 16:38

reporter   ~0023901

2.6.32-573.3.1 crashes on reboot with a message about something tainted - sorry, I missed it.

Rebooted using previous (2.6.32-504.30.3) and erased and installed - think I will just move it to the bottom of grub.conf until the next update.

This server runs headless - so it's a real pain when everything just dies.

David
jeegee

2015-08-15 19:29

reporter   ~0023903

Same here, kernel panic with 2.6.32-573.1.1.
Went back to which is running fine.
Didn't try 2.6.32-573.3.1 yet.
bgpc

2015-08-15 20:02

reporter   ~0023904

Same here
bgpc

2015-08-15 20:05

reporter  

IMG_7785.JPG (485,467 bytes)
bgpc

2015-08-15 20:09

reporter   ~0023906

Sorry, too fast on the add.
Same issue here.
Saw the panic on boot with 573.1.1 and reinstalled the kernel; no help.
Installed the latest release, 573.3.1, today and still have the panic.
Booting the old kernel, 504.30.3, works without issue.

Panic screen capture above if it helps.
(https://bugs.centos.org/file_download.php?file_id=6894&type=bug)
puleglot

2015-08-16 12:44

reporter   ~0023909

This issue is caused by the "nomodeset" option. It seems that the updated i915 driver is incompatible with it.
jeegee

2015-08-16 13:04

reporter   ~0023910

Great. Some info. Now wait for a fix.
c64whiz

2015-08-18 01:48

reporter  

newimg.jpg (845,383 bytes)
c64whiz

2015-08-18 01:50

reporter   ~0023932

Just an update: I just applied the 573.3.1 update and same crash. I uploaded a new image "newimg.jpg" which, for the most part, is the same as the original. However, a couple of values (pointers) are different. I didn't know if they were significant so I wanted to provide a pic (sorry for the blur)!
kstange

2015-08-18 20:42

reporter   ~0023943

Since I didn't see any indication that this issue has been reported to Red Hat yet, I created bug #1254699 in Red Hat's Bugzilla. The bug is private, as is standard for kernel bugs these days.

I'll provide updates back to this issue if I see anything noteworthy.
jeegee

2015-08-21 19:13

reporter   ~0023989

Any news?
kstange

2015-08-21 19:37

reporter   ~0023990

Nothing so far, still waiting.
jeegee

2015-08-30 19:41

reporter   ~0024061

Still nothing?
Giuseppe Ragusa

2015-09-10 19:49

reporter   ~0024316

I can confirm the above reported panic on all the CentOS 6.7 kernels published so far; latest CentOS 6.6 kernel is immune.

I can also confirm that:

*) removing the "nomodeset" option from the kernel commandline makes the boot proceed with no panic (but resulting in a high resolution text console)

*) disabling the i915 module loading on the kernel commandline (by passing any bogus unrecognized option like "i915.crash_on_me=0") makes the boot proceed with no panic even keeping the "nomodeset" option (and resulting in a 80x25 text console)
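
The first workaround amounts to deleting one token from the kernel line in grub.conf. A small Python sketch of that edit, assuming a whitespace-separated kernel line like the one quoted earlier in this thread; `drop_kernel_option` is a hypothetical helper and it only transforms a string, it does not write to grub.conf.

```python
def drop_kernel_option(line, option="nomodeset"):
    """Return the kernel line with every exact occurrence of `option`
    removed, keeping the leading indentation."""
    indent = line[:len(line) - len(line.lstrip())]
    kept = [token for token in line.split() if token != option]
    return indent + " ".join(kept)


kernel_line = (
    "\tkernel /vmlinuz-2.6.32-573.1.1.el6.i686 ro "
    "root=/dev/mapper/vg_mpilo-lv_root nomodeset rhgb quiet"
)
print(drop_kernel_option(kernel_line))
```

Matching whole tokens means an option such as `i915.modeset=0` is left alone. Runs of spaces between options are collapsed to single spaces, which the boot loader does not care about.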
Soruk42

2015-09-24 06:29

reporter   ~0024433

The new 573.7.1 is similarly broken, still need to either remove "nomodeset" or have a dud option like i915.crash_on_boot=0 to allow booting.
toracat

2015-09-28 16:27

manager   ~0024476

Last edited: 2015-09-28 16:34

There was a clone (dup, now closed) of this bug report. Two notes were posted there:

(1) https://bugs.centos.org/view.php?id=9374#c24378 (posted by kabe)

Investigated and came up with a patch:
It seems that
drivers/gpu/drm/i915/intel_ringbuffer.c:intel_cleanup_ring_buffer()
doesn't check NULL pointer and thus panics when "nomodeset" kernel
option is set.

Patch attached:
https://bugs.centos.org/file_download.php?file_id=7744&type=bug

At least this patch will make "nomodeset" boot properly. Xserver works.
I'm not sure whether the intel_cleanup_ring_buffer() itself should or shouldn't
be called when "nomodeset" is in effect.

(2) https://bugs.centos.org/view.php?id=9374#c24386 (posted by kstange)

There has literally been not even so much as an acknowledgement of the bug report so far. I will update the bug with the new information.

----------
NOTE: The patch submitted by kabe has been uploaded in this bug report.
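
The idea behind kabe's patch, as described in note (1), is to run the guard checks first and only then touch the fields. The Python model below is an analogy only, not kernel code. The `Ring` class, its `initialized` flag (standing in for whatever intel_ring_initialized() tests) and both function names are assumptions made for illustration.

```python
class Device:
    def __init__(self, private):
        self.private = private


class Ring:
    def __init__(self, dev=None):
        self.dev = dev
        # Hypothetical stand-in for the driver's "initialized" test.
        self.initialized = dev is not None


def cleanup_fields_first(ring):
    """Reads fields before any check, like initialisers placed at the
    top of a function: fails on a ring that was never set up."""
    private = ring.dev.private
    if not ring.initialized:
        return None
    return private


def cleanup_guards_first(ring):
    """Checks first and reads fields afterwards."""
    if ring is None:
        return None
    if not ring.initialized:
        return None
    return ring.dev.private


never_set_up = Ring()
print(cleanup_guards_first(never_set_up))  # guard returns early
try:
    cleanup_fields_first(never_set_up)
except AttributeError as exc:
    print("fields-first variant failed:", exc)
```

In Python the failure is an exception that can be caught. In the kernel the equivalent mistake is the NULL pointer dereference reported in this ticket.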

toracat

2015-09-28 16:31

manager  

patch-i915-nomodeset-panic.patch (1,222 bytes)
On boot, i915.ko panics when "nomodeset" or "i915.modeset=-1" kernel option is set.
Workaround: add bogus "i915.crash=0" option to not load i915.ko .
Patch below: properly check the NULL pointer dereference.

diff -up linux-2.6.32-573.3.1.el6.emu686.v19.i586/drivers/gpu/drm/i915/intel_ringbuffer.c.v19+ linux-2.6.32-573.3.1.el6.emu686.v19.i586/drivers/gpu/drm/i915/intel_ringbuffer.c
--- linux-2.6.32-573.3.1.el6.emu686.v19.i586/drivers/gpu/drm/i915/intel_ringbuffer.c.v19+	2015-08-10 22:16:43.000000000 +0900
+++ linux-2.6.32-573.3.1.el6.emu686.v19.i586/drivers/gpu/drm/i915/intel_ringbuffer.c	2015-09-18 13:25:30.000000000 +0900
@@ -1821,12 +1821,18 @@ error:
 
 void intel_cleanup_ring_buffer(struct intel_engine_cs *ring)
 {
-	struct drm_i915_private *dev_priv = to_i915(ring->dev);
-	struct intel_ringbuffer *ringbuf = ring->buffer;
+	struct drm_i915_private *dev_priv;
+	struct intel_ringbuffer *ringbuf;
 
+	/* do not dereference NULL! */
+	if (ring == NULL)
+		return;
 	if (!intel_ring_initialized(ring))
 		return;
 
+	dev_priv = to_i915(ring->dev);	/*ring->dev->private_*/
+	ringbuf = ring->buffer;
+
 	intel_stop_ring_buffer(ring);
 	WARN_ON(!IS_GEN2(ring->dev) && (I915_READ_MODE(ring) & MODE_IDLE) == 0);
 
Soruk42

2015-11-30 10:16

reporter   ~0024942

The 573.8.1 kernel is also similarly broken.
kstange

2016-01-15 15:52

reporter   ~0025371

I'm "pleased" to report that after about 5 months, Red Hat has finally responded with the following unhelpful question:

"I think you may want i915.modeset=0 ?"

I don't know how this is supposed to help.
kstange

2016-01-19 16:08

reporter   ~0025415

Red Hat has replied to indicate that nomodeset is a non-default option which is not recommended. TrevorH pointed out to me in #centos-devel that many of us have been adding nomodeset to our command lines for years due to a workaround for some other issue so old I don't even remember what it was.

Red Hat's response so far is to simply not use nomodeset, though I'm pushing them to fix the panic anyway, since it's a regression. I don't expect them to bother since they can more easily brush it off as user error.

nomodeset is supposed to prevent the kernel from loading the i915 driver entirely, which apparently is what's broken in the 6.7 kernel.
robdclark

2016-01-19 21:37

reporter   ~0025426

note: I have changed the rh bz to not be private (I'm not entirely sure why it was private to start with, maybe that was just a default setting?)

To paraphrase my answer on the rh bz (https://bugzilla.redhat.com/show_bug.cgi?id=1254699#c10) my suggestion is to disable CONFIG_DRM_I915_UMS in the kernel config.

(The change is simple enough, although since it is about an "unsupported" configuration I'm not sure that it would get approval to go into z-stream kernel.)
toracat

2016-01-20 07:35

manager   ~0025429

I have built the latest kernel (2.6.32-573.12.1.el6) with CONFIG_DRM_I915_UMS disabled. The packages are available here:

http://people.centos.org/toracat/kernel/6/distro/bug9209/

Please note that they are not signed and are provided for testing purposes only.
T1loc

2016-01-22 08:52

reporter  

IMG_20160122_094150.jpg (1,881,573 bytes)
T1loc

2016-01-22 08:54

reporter   ~0025463

I tried the new kernel without success. I added an attachment with the call trace.
toracat

2016-01-22 13:23

manager   ~0025465

@T1loc

Thanks for testing. I want to ask one thing. With the regular kernel that triggers the crash, do you see the call trace that looks similar to the one you just posted? If so, your cause may not be the same as what is reported here.
robdclark

2016-01-22 13:38

reporter   ~0025467

@T1loc, that looks like a pretty different issue.. probably should be a different bug id. It doesn't look obviously i915 related. But I'd also like to know if that is the same splat you were getting before with the regular kernel, and also if it goes away without 'nomodeset'.
T1loc

2016-01-22 13:41

reporter   ~0025468

OK, so I'll test with nomodeset and the regular kernel.

If I get another kernel panic, I'll update the thread.
toracat

2016-02-21 04:23

manager   ~0025792

The latest centosplus kernel (kernel-2.6.32-573.18.1.el6.centos.plus) now has CONFIG_DRM_I915_UMS disabled.
jeegee

2016-02-26 21:30

reporter   ~0025861

What about: 2.6.32-573.18.1.el6? From base/updates.

Will this fix the issue finally?
dead_paulie

2016-04-11 11:23

reporter   ~0026253

I updated to kernel-2.6.32-573.22.1.el6.i686.

I still encounter the kernel panic if I leave the grub.conf alone with nomodeset in the kernel line.

If I remove nomodeset, the system will boot into runlevel 3 without a problem. I cannot, however, manually start X.

I am not able to completely boot into runlevel 5, either. The boot console sits idle, X never starts, and (obviously) the gdm screen is never shown. I have full remote access, however, via ssh.

The message in /var/log/Xorg.2.log is:
Fatal server error:
[ 45.193] (EE) no screens found(EE)

Is X broken without the nomodeset kernel parameter, or do I need to modify some X configuration?
robdclark

2016-04-11 11:57

reporter   ~0026254

dead_paulie, could you attach the Xorg log? Do you have any existing xorg .conf file(s)? If so, you might need to remove them. At least with a modern xorg, with modesetting enabled on the kernel side (i.e. without nomodeset), if there is no other ddx driver for your hw it should fall back to the generic modesetting ddx driver.
Soruk42

2016-05-26 20:48

reporter   ~0026703

Update: after upgrading to the 2.6.32-642 kernel for CentOS 6.8, the nomodeset bug appears to be fixed.
toracat

2016-05-27 04:05

manager   ~0026704

In the 6.8 GA kernel (2.6.32-642), the DRM_I915_UMS option does not exist in the config. This must have fixed the issue as noted by Soruk42.
toracat

2016-11-28 18:39

manager   ~0028026

I am marking this ticket 'resolved'. (Finally!)
Thanks everyone that was involved.

Issue History

Date Modified Username Field Change
2015-08-10 00:15 c64whiz New Issue
2015-08-10 00:15 c64whiz File Added: IMG_20150809_165440.jpg
2015-08-10 21:02 Soruk42 File Added: oops-2.6.32-573.1.1.jpg
2015-08-10 21:05 Soruk42 Note Added: 0023835
2015-08-10 21:17 Soruk42 Note Added: 0023836
2015-08-10 21:19 aguimont Note Added: 0023837
2015-08-10 21:53 Soruk42 Note Added: 0023838
2015-08-13 03:11 pete@ucwh.net Note Added: 0023861
2015-08-13 13:30 Soruk42 Note Added: 0023867
2015-08-13 18:59 peterqz Note Added: 0023876
2015-08-13 18:59 peterqz File Added: differents.txt
2015-08-13 20:52 Soruk42 Note Added: 0023877
2015-08-14 14:03 dead_paulie Note Added: 0023889
2015-08-15 16:38 divaddrof Note Added: 0023901
2015-08-15 19:29 jeegee Note Added: 0023903
2015-08-15 20:02 bgpc Note Added: 0023904
2015-08-15 20:05 bgpc File Added: IMG_7785.JPG
2015-08-15 20:09 bgpc Note Added: 0023906
2015-08-16 12:44 puleglot Note Added: 0023909
2015-08-16 13:04 jeegee Note Added: 0023910
2015-08-18 01:48 c64whiz File Added: newimg.jpg
2015-08-18 01:50 c64whiz Note Added: 0023932
2015-08-18 20:42 kstange Note Added: 0023943
2015-08-21 19:13 jeegee Note Added: 0023989
2015-08-21 19:37 kstange Note Added: 0023990
2015-08-30 19:41 jeegee Note Added: 0024061
2015-09-01 23:02 Nomii Issue cloned: 0009374
2015-09-10 19:49 Giuseppe Ragusa Note Added: 0024316
2015-09-24 06:29 Soruk42 Note Added: 0024433
2015-09-28 16:08 toracat Relationship added has duplicate 0009374
2015-09-28 16:27 toracat Note Added: 0024476
2015-09-28 16:31 toracat File Added: patch-i915-nomodeset-panic.patch
2015-09-28 16:33 toracat Note Edited: 0024476 View Revisions
2015-09-28 16:34 toracat Note Edited: 0024476 View Revisions
2015-11-30 10:16 Soruk42 Note Added: 0024942
2016-01-15 15:52 kstange Note Added: 0025371
2016-01-19 16:08 kstange Note Added: 0025415
2016-01-19 21:37 robdclark Note Added: 0025426
2016-01-20 07:35 toracat Note Added: 0025429
2016-01-20 07:38 toracat Status new => assigned
2016-01-22 08:52 T1loc File Added: IMG_20160122_094150.jpg
2016-01-22 08:54 T1loc Note Added: 0025463
2016-01-22 13:23 toracat Note Added: 0025465
2016-01-22 13:38 robdclark Note Added: 0025467
2016-01-22 13:41 T1loc Note Added: 0025468
2016-02-21 04:23 toracat Note Added: 0025792
2016-02-26 21:30 jeegee Note Added: 0025861
2016-04-11 11:23 dead_paulie Note Added: 0026253
2016-04-11 11:57 robdclark Note Added: 0026254
2016-05-26 20:48 Soruk42 Note Added: 0026703
2016-05-27 04:05 toracat Note Added: 0026704
2016-11-28 18:39 toracat Status assigned => resolved
2016-11-28 18:39 toracat Resolution open => fixed
2016-11-28 18:39 toracat Fixed in Version => 6.8
2016-11-28 18:39 toracat Note Added: 0028026