View Issue Details

ID: 0017598
Project: CentOS-8
Category: kernel
View Status: public
Last Update: 2020-10-02 17:16
Reporter: ps7776
Priority: normal
Severity: major
Reproducibility: always
Status: new
Resolution: open
Product Version: 8.0.1905
Target Version: (none)
Fixed in Version: (none)
Summary: 0017598: Any kernel beyond 8.0 rescue fails to boot
Description:
Hardware: SuperMicro MBD-H11SSL with AMD Epyc 7502 32-core CPU

Originally installed CentOS 8.0.1905 on this system. / and /boot are on real GPT partitions; data disks are a combination of LVM and software RAID-5.

Regular yum upgrades to the running system since February were all OK, but a recent power outage led to trying to boot from the 8.2 kernel ( 193.6.3 ), which failed. After a lot of trial and error I determined that the only kernel that would boot properly was the original ( 147 ) rescue kernel and initramfs. For any other combination I get a black screen immediately after the Probing EDD .... message and the hardware eventually reboots. Removing quiet and rhgb gives no extra output on the screen; rebuilding the initramfs with either -a rescue or -H does not make any difference; adding rd.shell and/or rd.break to get a dracut shell doesn't do anything; adding init=/bin/bash doesn't do anything either; setting edd=on or edd=off makes no difference. Removing and reinstalling kernels makes no difference.

So, basically, the boot fails at a very, very early stage and I'm not sure how to get more debug output. Any suggestions for other kernel options to try? Like how to select the most basic video? What is the dumbest VGA mode one can select? Any early boot debug parameters? Any other modules that I should try to load into the initramfs?

This is obviously some weird combination of software and hardware ( BIOS ?) that causes booting to fail. Highly unexpected.

The BIOS is set to legacy ( i.e. non-UEFI ) only, but I don't think there is anything wrong with it, or with grub2 for that matter. I can do an "ls" at the grub prompt for all the partitions on all disks and they appear to be correct. Using (hd0,gpt... ) instead of labels or /dev/sda... makes no difference.


Booting from an 8.2 install USB fails too by the way.




peter
Steps To Reproduce: Boot anything beyond the 8.0 rescue kernel.

Tags: No tags attached.

Activities

ps7776

2020-07-18 00:32

reporter   ~0037377

Could be a video driver problem with the built-in ASPEED card and the Linux ast driver, according to this link: https://www.supermicro.com/support/faqs/faq.cfm?faq=31035
MissBlue

2020-10-01 23:52

reporter   ~0037776

Same issue with a Supermicro Model 1114S-WTRT paired with an AMD EPYC 7282.
CentOS 8.1 installs and runs fine, but the second it updates it will no longer boot: it instantly black-screens and performs a hardware reboot, and only rescue mode works.
Doing a clean install with the latest 8.2 Minimal CentOS yields the same results, but it also breaks rescue mode.
MissBlue

2020-10-02 14:39

reporter   ~0037778

A quick update: It does not appear to be the kernel itself (At least in my case).
I can install Kernel 5.8 ML just fine on CentOS 7 and CentOS 8.1 and it runs as expected.
The second you perform a 'yum/dnf' update, the system stops booting, even on the 5.8 ML Kernel, and only rescue mode will work, so it appears to be a package included in the update that causes this behavior.
ps7776

2020-10-02 15:44

reporter   ~0037779

Did the dnf/yum update nuke /boot/grub2/grub.cfg by any chance? In particular the "search" string in the ### BEGIN /etc/grub.d/10_linux ### section, where it tries to guess the root device? At one point mine ended up pointing to the wrong drive. Not sure why.
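
One way to check those "search" lines is sketched below. It assumes the UUID is the last word on each "search ... --fs-uuid" line, which is how grub2-mkconfig normally writes them. grub_search_uuids is a made-up helper name, not something shipped with grub2, and /dev/sda1 in the comments is only an example device.

```shell
# List every filesystem UUID that grub.cfg asks GRUB to search for.
# Assumption: the UUID is the last field of each "search ... --fs-uuid" line.
grub_search_uuids() {
    awk '$1 == "search" && /--fs-uuid/ { print $NF }' "$1" | sort -u
}

# Possible use on the affected system:
#   grub_search_uuids /boot/grub2/grub.cfg
#   blkid -s UUID -o value /dev/sda1    # example device, adjust to your /boot
```

If a UUID printed by the helper does not match any filesystem reported by blkid, the generated config is probably pointing at the wrong drive.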

By the way, did you try booting with rd.debug rd.shell vga=0 single and no "rhgb" and "quiet"? If it is due to a missing driver you can load it at boot time from a separate USB stick with the "dd" option. I had to do that on an old box since the SATA chipset on it is no longer supported in the default kernel ( it still is in the plus kernel ).
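
As a rough illustration of that edit, the sketch below takes an existing kernel command line, drops "rhgb" and "quiet", and appends the options named above. debug_cmdline is a hypothetical helper written for this note; it only rewrites a string and does not touch the boot configuration.

```shell
# Build a debug-friendly kernel command line from an existing one.
# Drops "rhgb" and "quiet", then appends rd.debug rd.shell vga=0 single.
debug_cmdline() {
    out=""
    # Word splitting on $1 is intentional: one kernel parameter per word.
    for word in $1; do
        case "$word" in
            quiet|rhgb) ;;                      # skip the "silent boot" options
            *) out="${out:+$out }$word" ;;
        esac
    done
    for word in rd.debug rd.shell vga=0 single; do
        out="${out:+$out }$word"
    done
    printf '%s\n' "$out"
}

# Possible use: print a candidate line based on the running kernel's arguments
#   debug_cmdline "$(cat /proc/cmdline)"
```

The printed line can then be typed into the "linux" entry after pressing "e" in the grub2 menu.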
MissBlue

2020-10-02 16:24

reporter   ~0037780

I'll have to check that and get back to you (currently not home). I know that changing parameters, such as removing "quiet" and "rhgb", didn't work, because the second it attempts to boot anything that isn't rescue mode, it just hardware reboots.
Not a single line nor a single message; it just instantly hardware reboots, even with those parameters.
ps7776

2020-10-02 17:16

reporter   ~0037781

Have you tried going to the command line in grub2 to find where it is trying to boot from? Something like (hd0,gpt1)/vm followed by a <tab>? A bit of trial and error should tell you where grub2 thinks /boot is. Then compare with what is in grub.cfg. Or try completing the "linux" and "initrd" entries by hand with rd.shell rd.debug rd.break, no quiet and rhgb, and the partition where you found /boot. It should boot into a dracut shell with lots of logging enabled. You can add nomodeset and vga=0 as well to make it believe it has a dumb basic video card. This would bypass the grub.cfg file completely and only require a good copy of vmlinuz and initramfs.
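
A sketch of what could be typed at the grub> prompt is below, written as a small shell helper so the values are easy to swap. The helper name grub_manual_boot, the partition (hd0,gpt1), the kernel version string and the root device in the example call are all placeholders, not values taken from this system. It also assumes /boot is its own partition, so the kernel and initramfs paths start at the top of that partition.

```shell
# Print the GRUB commands for a manual boot with dracut debugging enabled.
# $1 = partition holding /boot, $2 = kernel version, $3 = root device
# (all three are placeholders to be replaced with what you find via <tab>).
grub_manual_boot() {
    part=$1
    kver=$2
    rootdev=$3
    cat <<EOF
set root=$part
linux /vmlinuz-$kver root=$rootdev ro rd.shell rd.debug rd.break nomodeset vga=0
initrd /initramfs-$kver.img
boot
EOF
}

# Example call with made-up values:
#   grub_manual_boot '(hd0,gpt1)' '4.18.0-147.el8.x86_64' '/dev/sda2'
```

Typing the four printed lines at the grub> prompt skips the generated menu entries, so a bad grub.cfg cannot get in the way.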

Issue History

Date Modified Username Field Change
2020-07-17 18:32 ps7776 New Issue
2020-07-18 00:32 ps7776 Note Added: 0037377
2020-10-01 23:52 MissBlue Note Added: 0037776
2020-10-02 14:39 MissBlue Note Added: 0037778
2020-10-02 15:44 ps7776 Note Added: 0037779
2020-10-02 16:24 MissBlue Note Added: 0037780
2020-10-02 17:16 ps7776 Note Added: 0037781