View Issue Details
|ID||Project||Category||View Status||Date Submitted||Last Update|
|0015358||CentOS-7||kernel||public||2018-10-06 15:28||2019-01-08 12:40|
|Target Version||Fixed in Version|
|Summary||0015358: Cannot boot with kernel series 3.10.0-862*, including latest 3.10.0-862.14.4.el7|
|Description||Currently have two Supermicro based servers with X9DRi-F motherboards:|
Both machines had BIOS version 1.0c. After installing any of the 3.10.0-862* series kernels, the machines can no longer boot, and I have to fall back to kernel 3.10.0-693.21.1.el7.x86_64.
On one reboot attempt, one motherboard died (could be a coincidence). We have since replaced the motherboard, which came with BIOS 3.2a, but have not tried a new boot attempt yet.
I understand there are similar bug reports, some even pointing to RHEL, but they claim kernel-3.10.0-862.14.4.el7 fixed those issues. That does not seem to be the case for me, so this may be a different issue.
Both machines run software RAID and LVM on SATA-attached SSDs.
# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sdd2 sdc2
123983804 blocks super 1.1 [2/2] [UU]
bitmap: 1/1 pages [4KB], 65536KB chunk
md0 : active raid1 sdc1 sdd1
1048564 blocks super 1.0 [2/2] [UU]
bitmap: 0/1 pages [0KB], 65536KB chunk
[root@ysmha02 log]# vgdisplay ysmvg01
--- Volume group ---
VG Name ysmvg01
Metadata Areas 1
Metadata Sequence No 5
VG Access read/write
VG Status resizable
MAX LV 0
Cur LV 4
Open LV 3
Max PV 0
Cur PV 1
Act PV 1
VG Size 118.24 GiB
PE Size 4.00 MiB
Total PE 30269
Alloc PE / Size 21760 / 85.00 GiB
Free PE / Size 8509 / 33.24 GiB
VG UUID 8g21sC-F5pf-51Co-DgnA-yLiu-kfUl-AL1yco
[root@ysmha02 log]# mount | grep ysmvg01
/dev/mapper/ysmvg01-slash on / type xfs (rw,relatime,seclabel,attr2,inode64,noquota)
/dev/mapper/ysmvg01-var on /var type xfs (rw,relatime,seclabel,attr2,inode64,noquota)
Could you point me to how to collect crash information so that I can provide it here?
|Steps To Reproduce||Update to the 3.10.0-862.14.4.el7 kernels and reboot.|
Machine will not boot anymore.
|Tags||No tags attached.|
This problem is still present with the latest CentOS release kernel: kernel-3.10.0-957.1.3.el7.x86_64
The last message I see on screen is
[ 2.506448] i8042: No controller found
At this point the machine hangs. The only way to get it to boot is to roll back to the older kernel, kernel-3.10.0-693.21.1.el7.x86_64
|You appear to have some sort of kmod installed that supplies a kernel module called arcmsr.ko for Areca RAID controllers. That copy links to a 7.1 series kernel in weak-updates. It's almost certain that a kernel module built to use weak updates from a 7.1 kernel will not function on the newer versions. Please supply the output from `lshw -short`|
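For readers who want to see which modules a given kernel picks up through weak-updates, the following is a minimal sketch. It assumes the usual layout of a `weak-updates` directory of symlinks under `/lib/modules/<kernel-release>/`. The function name `list_weak_updates` and its optional second argument are hypothetical, introduced only for illustration.

```shell
#!/bin/sh
# Sketch: print every weak-updates module symlink for one kernel release,
# together with the file it points at. A link target that lives under a much
# older kernel tree is a hint that the module was built for that older kernel.
#
# Usage: list_weak_updates <kernel-release> [modules-root]
# The second argument (hypothetical, default /lib/modules) exists only so the
# function can be pointed at a scratch directory.
list_weak_updates() {
    kver="$1"
    root="${2:-/lib/modules}"
    dir="$root/$kver/weak-updates"

    # No weak-updates directory means nothing to report.
    [ -d "$dir" ] || return 0

    find "$dir" -type l -name '*.ko*' | sort | while IFS= read -r link; do
        printf '%s -> %s\n' "$link" "$(readlink "$link")"
    done
}

# Example (not run automatically):
#   list_weak_updates "$(uname -r)"
```

Any entry whose target sits under a kernel tree from an earlier point release deserves a closer look before rebooting into the new kernel.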
The installed Areca kernel module is:
[root@ysmha02 ~]# rpm -qa |grep arcmsr
[root@ysmha02 ~]# yum info kmod-arcmsr
Loaded plugins: fastestmirror, verify
Loading mirror speeds from cached hostfile
* base: centos.vwtonline.net
* extras: centos.vwtonline.net
* updates: centos.vwtonline.net
Name : kmod-arcmsr
Arch : x86_64
Version : 1.30.0X.20_rhel7.1
Release : 1
Size : 593 k
Repo : installed
Summary : arcmsr kernel module(s)
URL : http://www.areca.com.tw
License : GPLv2
Description : This package provides the arcmsr kernel
: modules built for the Linux kernel
: 3.10.0-229.el7.x86_64 for the x86_64
: family of processors.
Not sure why I had to do it that way so long ago.
At some point the module was supposed to be included natively in the kernel; I am not sure whether my install predates that.
FWIW, the OS drives are not connected to the Areca controller, but I understand the module itself may be causing the computer to hang.
@TrevorH It seems like the module is indeed part of the latest kernels, just by looking at:
[root@ysmha02 arcmsr]# pwd
[root@ysmha02 arcmsr]# ls -l
-rw-r--r--. 1 root root 16424 Mar 7 2018 arcmsr.ko.xz
So my guess is that when I first installed these machines, either the module was not yet built into the RHEL 7 / CentOS 7 kernel, or it did not yet support my controller, the ARC-1882, which was too new at the time. That would be why I had to install the Areca-provided rpm.
I will remove the rpm module, install the latest kernel, 3.10.0-957.el7 and see if it works without it. Otherwise, I will just install the newest driver from Areca for 7.6 and report back.
OK, that is not it: the module included in the RHEL kernels is extremely outdated. I extracted that module and ran strings on it, which shows:
Hello! I am ARCMSR
Driver Version 1.20.00.15 2010/08/05
So that is extremely outdated. Downloading and installing:
which contains kmod-arcmsr-1.40.0X.10_rhel7.6-1.x86_64.rpm
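The version check described above can be sketched as a small shell function. This is an illustration, not the exact commands used in the note: the function name `module_version_banner` is hypothetical, and `tr` stands in for `strings` so the sketch does not depend on binutils being installed. It assumes the module is either a plain `.ko` file or an `.xz`-compressed one, as shipped in the CentOS 7 kernel packages.

```shell
#!/bin/sh
# Sketch: print the "Driver Version" banner embedded in a kernel module file.
# Usage: module_version_banner <path-to-module.ko|.ko.xz>
module_version_banner() {
    f="$1"

    # Decompress .xz modules to stdout; pass plain files through unchanged.
    case "$f" in
        *.xz) xz -dc -- "$f" ;;
        *)    cat -- "$f" ;;
    esac |
        # Replace every non-printable byte with a newline so that embedded
        # text ends up on its own line (a rough stand-in for `strings`).
        tr -c '[:print:]' '\n' |
        grep -i 'driver version'
}

# Example (not run automatically):
#   module_version_banner /path/to/arcmsr.ko.xz
```

On a system where the module is loadable, `modinfo -F version arcmsr` is another way to read the version, provided the module declares one.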
Same problem with 3.10.0-957.1.3.el7.x86_64 on CentOS 1810.
After the update, the reboot fails.
The ARCMSR driver has been updated with kmod-arcmsr-1.40.0X.10_rhel7.6-1.x86_64.rpm.
With 3.10.0-957.el7.x86_64 there is no problem.
|Update: no problem with the latest kernel, 4.20.|
I was finally able to reboot the servers and they boot properly with the latest ARECA driver, kmod-arcmsr-1.40.0X.10_rhel7.6-1.x86_64.rpm on both of the following kernels:
You may close this bug, given the issue was the outdated driver I had not updated.
|2018-10-06 15:email@example.com||New Issue|
|2018-12-15 02:firstname.lastname@example.org||Note Added: 0033346|
|2018-12-15 09:45||TrevorH||Note Added: 0033348|
|2018-12-15 14:email@example.com||Note Added: 0033352|
|2018-12-15 15:firstname.lastname@example.org||Note Added: 0033354|
|2018-12-15 16:email@example.com||Note Added: 0033355|
|2018-12-26 20:02||ophos||Note Added: 0033461|
|2018-12-27 09:03||ophos||Note Added: 0033462|
|2019-01-08 12:firstname.lastname@example.org||Note Added: 0033537|