View Issue Details

ID: 0015358
Project: CentOS-7
Category: kernel
View Status: public
Last Update: 2018-12-15 16:07
Reporter: dijuremo@gmail.com
Priority: high
Severity: crash
Reproducibility: always
Status: new
Resolution: open
Product Version: 7.5.1804
Target Version:
Fixed in Version:
Summary: 0015358: Cannot boot with kernel series 3.10.0-862*, including latest 3.10.0-862.14.4.el7
Description: Currently have two Supermicro based servers with X9DRi-F motherboards:

https://www.supermicro.com/products/motherboard/Xeon/C600/X9DRi-F.cfm

Both machines had BIOS version 1.0c. After installing any of the 3.10.0-862* series kernels, the machines are no longer able to boot. I have to fall back on kernel 3.10.0-693.21.1.el7.x86_64.

On one reboot attempt, one motherboard died (could be a coincidence). We have since replaced the motherboard, which came with BIOS 3.2a, but have not tried a new boot attempt yet.

I understand there are similar bug reports, some even pointing to RHEL, but they claim kernel-3.10.0-862.14.4.el7 fixed those issues. That does not seem to be the case for me (maybe a different issue).

Both machines use software RAID and LVM on SATA-attached SSDs.
# cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sdd2[0] sdc2[1]
      123983804 blocks super 1.1 [2/2] [UU]
      bitmap: 1/1 pages [4KB], 65536KB chunk

md0 : active raid1 sdc1[1] sdd1[0]
      1048564 blocks super 1.0 [2/2] [UU]
      bitmap: 0/1 pages [0KB], 65536KB chunk

[root@ysmha02 log]# vgdisplay ysmvg01
  --- Volume group ---
  VG Name ysmvg01
  System ID
  Format lvm2
  Metadata Areas 1
  Metadata Sequence No 5
  VG Access read/write
  VG Status resizable
  MAX LV 0
  Cur LV 4
  Open LV 3
  Max PV 0
  Cur PV 1
  Act PV 1
  VG Size 118.24 GiB
  PE Size 4.00 MiB
  Total PE 30269
  Alloc PE / Size 21760 / 85.00 GiB
  Free PE / Size 8509 / 33.24 GiB
  VG UUID 8g21sC-F5pf-51Co-DgnA-yLiu-kfUl-AL1yco

[root@ysmha02 log]# mount | grep ysmvg01
/dev/mapper/ysmvg01-slash on / type xfs (rw,relatime,seclabel,attr2,inode64,noquota)
/dev/mapper/ysmvg01-var on /var type xfs (rw,relatime,seclabel,attr2,inode64,noquota)

Could you point me to how to collect crash information to be able to provide here?
Steps To Reproduce: Update to the 3.10.0-862.14.4.el7 kernel and reboot.

Machine will not boot anymore.
Tags: No tags attached.
abrt_hash:
URL:

Activities

dijuremo@gmail.com

2018-12-15 02:24

reporter   ~0033346

This problem is still present with the latest CentOS release kernel: kernel-3.10.0-957.1.3.el7.x86_64

The last message I see on screen is
[ 2.506448] i8042: No controller found

At this point the machine hangs. The only way to get it to boot is to roll back to the older kernel, kernel-3.10.0-693.21.1.el7.x86_64
TrevorH

2018-12-15 09:45

manager   ~0033348

You appear to have some sort of kmod installed that supplies a kernel module called arcmsr.ko for Areca RAID controllers. That copy links to a 7.1 series kernel in weak-updates. It's almost certain that a kernel module built to use weak updates from a 7.1 kernel will not function on the newer versions. Please supply the output from `lshw -short`
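
As a rough illustration of the weak-updates check mentioned above, the sketch below lists every weak-updates symlink for a module and prints where it points. The function name `list_weak_updates` and its two arguments are made up for this example. It assumes the CentOS 7 layout in which each installed kernel has a weak-updates directory under /usr/lib/modules/<version>/.

```shell
#!/bin/sh
# Sketch only: list_weak_updates is a hypothetical helper, not part of any package.
# $1 = modules root (assumed default: /usr/lib/modules)
# $2 = module name without extension (assumed default: arcmsr)
list_weak_updates() {
    modules_root=${1:-/usr/lib/modules}
    module=${2:-arcmsr}

    # Find symlinks named <module>.ko or <module>.ko.xz below any weak-updates directory.
    find "$modules_root" -path '*/weak-updates/*' -name "${module}.ko*" -type l 2>/dev/null |
    while IFS= read -r link; do
        # Print the link and the file it points to; the target path shows
        # which kernel version the module was originally built for.
        printf '%s -> %s\n' "$link" "$(readlink "$link")"
    done
}
```

On the affected machine this might be called as `list_weak_updates /usr/lib/modules arcmsr`. A link target under a 3.10.0-229.el7.x86_64 directory would indicate a module built against the 7.1 kernel.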
dijuremo@gmail.com

2018-12-15 14:37

reporter   ~0033352

http://termbin.com/fv6f

The installed Areca kernel module is:

[root@ysmha02 ~]# rpm -qa |grep arcmsr
kmod-arcmsr-1.30.0X.20_rhel7.1-1.x86_64

[root@ysmha02 ~]# yum info kmod-arcmsr
Loaded plugins: fastestmirror, verify
Loading mirror speeds from cached hostfile
 * base: centos.vwtonline.net
 * extras: centos.vwtonline.net
 * updates: centos.vwtonline.net
Installed Packages
Name : kmod-arcmsr
Arch : x86_64
Version : 1.30.0X.20_rhel7.1
Release : 1
Size : 593 k
Repo : installed
Summary : arcmsr kernel module(s)
URL : http://www.areca.com.tw
License : GPLv2
Description : This package provides the arcmsr kernel
            : modules built for the Linux kernel
            : 3.10.0-229.el7.x86_64 for the x86_64
            : family of processors.

Not sure why I had to do it that way so long ago.

At some point the module was supposed to be included natively in the kernel; I am not sure whether my install predates that.

FWIW, the OS drives are not connected to the Areca controller, but I understand the module itself may be causing the computer to hang.
dijuremo@gmail.com

2018-12-15 15:52

reporter   ~0033354

@TrevorH It seems like the module is indeed part of the latest kernels, just by looking at:

[root@ysmha02 arcmsr]# pwd
/usr/lib/modules/3.10.0-693.21.1.el7.x86_64/kernel/drivers/scsi/arcmsr
[root@ysmha02 arcmsr]# ls -l
total 20
-rw-r--r--. 1 root root 16424 Mar 7 2018 arcmsr.ko.xz

So my guess is that when I first installed these machines, either the module was not yet included in the RHEL 7 / CentOS 7 kernel, or it did not yet support my "too new at the time" controller, the ARC-1882, which is why I had to install the Areca-provided rpm.

I will remove the rpm module, install the latest kernel, 3.10.0-957.el7 and see if it works without it. Otherwise, I will just install the newest driver from Areca for 7.6 and report back.
dijuremo@gmail.com

2018-12-15 16:07

reporter   ~0033355

OK, well, nope, that is not it: the module included in the RHEL kernels is extremely outdated. I extracted that module and ran strings on it, which shows:

Hello! I am ARCMSR
Driver Version 1.20.00.15 2010/08/05

So I am now downloading and installing:

http://www.areca.us/support/s_linux/driver/rhel/7_6.zip

which contains kmod-arcmsr-1.40.0X.10_rhel7.6-1.x86_64.rpm
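
The version check described in this note could be scripted roughly as follows. This is only a sketch: the function name `module_driver_version` is invented here, the search pattern is taken from the strings output quoted above, and `grep -a` is used in place of `strings` so that binutils is not required.

```shell
#!/bin/sh
# Sketch only: module_driver_version is a hypothetical helper.
# $1 = path to a kernel module, either plain (.ko) or xz-compressed (.ko.xz)
module_driver_version() {
    module_file=$1

    # Decompress to stdout when the module is xz-compressed, otherwise read it as-is.
    case $module_file in
        *.xz) xz -dc -- "$module_file" ;;
        *)    cat -- "$module_file" ;;
    esac |
    # Treat the binary as text (-a) and print only the matching part (-o):
    # the "Driver Version" banner up to the first non-printable byte.
    LC_ALL=C grep -a -o 'Driver Version [[:print:]]*'
}
```

For the in-tree module shown earlier in this report, the call would be `module_driver_version /usr/lib/modules/3.10.0-693.21.1.el7.x86_64/kernel/drivers/scsi/arcmsr/arcmsr.ko.xz`. The same function can be pointed at the module shipped in the Areca rpm to compare the two versions.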

Issue History

Date Modified     Username            Field                Change
2018-10-06 15:28  dijuremo@gmail.com  New Issue
2018-12-15 02:24  dijuremo@gmail.com  Note Added: 0033346
2018-12-15 09:45  TrevorH             Note Added: 0033348
2018-12-15 14:37  dijuremo@gmail.com  Note Added: 0033352
2018-12-15 15:52  dijuremo@gmail.com  Note Added: 0033354
2018-12-15 16:07  dijuremo@gmail.com  Note Added: 0033355