View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0014779 | CentOS-7 | kernel | public | 2018-05-12 10:08 | 2018-10-30 21:54 |
Reporter | newton | Assigned To | |||
Priority | normal | Severity | crash | Reproducibility | always |
Status | resolved | Resolution | fixed | ||
Platform | Q1900B-ITX | OS | CentOS | OS Version | 7.5.1804 |
Product Version | 7.5.1804 | ||||
Summary | 0014779: BUG: unable to handle kernel NULL pointer dereference at (null) in snd-hdmi-lpe-audio | ||||
Description | The system works without issues up to and including kernel 3.10.0-693.21.1.el7.x86_64; the crash happens with 3.10.0-862.el7.x86_64 and 3.10.0-862.2.3.el7.x86_64. Sorry, this is a root server in a data center and I only had network KVM available (and even that was a goodwill gesture by the provider), so I can only provide a screenshot of the oops. Hope that helps. | ||||
Steps To Reproduce | Boot 3.10.0-862.el7.x86_64 or 3.10.0-862.2.3.el7.x86_64 on affected hardware. | ||||
Tags | kerneloops | ||||
abrt_hash | |||||
URL | |||||
|
|
The kernel update to 3.10.0-862.2.3.el7 killed my HP ProLiant G4. The error indicated that the BIOS had corrupted the boot something or other (I did not get that recorded). I switched to the backup BIOS and booted the previous kernel (all is fine again running kernel 3.10.0-693.21.1.el7). |
|
Exact same issue with kernel-3.10.0-862.3.2.el7.x86_64 (I had to roll back to kernel 3.10.0-693.21.1.el7). |
|
Same here. I was able to work around it and boot by adding:

blacklist snd-soc-hdac-hdmi
blacklist snd-hdmi-lpe-audio
blacklist snd-hda-codec-hdmi

to /etc/modprobe.d/snd.conf (the second line is likely the one making the difference; I didn't have time to narrow it down). After adding the above lines and booting the new kernel, you can trigger the crash with:

modprobe snd-hdmi-lpe-audio

and get the crashdump. The end of vmcore-dmesg.txt has:

[ 27.822264] traps: addconn[2542] trap stack segment ip:7f9de9e7262c sp:7ffc17807c10 error:0 in libc-2.17.so[7f9de9df2000+1c3000]
[ 67.421336] input: Intel HDMI/DP LPE Audio HDMI/DP,pcm=0 as /devices/pci0000:00/0000:00:02.0/hdmi-lpe-audio/sound/card0/input6
[ 67.426297] input: Intel HDMI/DP LPE Audio HDMI/DP,pcm=1 as /devices/pci0000:00/0000:00:02.0/hdmi-lpe-audio/sound/card0/input7
[ 67.434041] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 67.438513] IP: [<ffffffffa03686ab>] __list_add+0x1b/0xc0
[ 67.442870] PGD 800000003f438067 PUD b30f1067 PMD 0
[ 67.447251] Oops: 0000 [#1] SMP
[ 67.451589] Modules linked in: snd_hdmi_lpe_audio snd_hda_codec_hdmi snd_hda_codec snd_hda_core snd_hwdep drbg ansi_cprng rmd160 crypto_null ip_vti af_key ah6 ah4 esp6 esp4 xfrm4_mode_beet xfrm4_tunnel xfrm4_mode_tunnel xfrm4_mode_transport xfrm6_mode_transport xfrm6_mode_ro xfrm6_mode_beet xfrm6_mode_tunnel ipcomp ipcomp6 xfrm6_tunnel tunnel6 xfrm_ipcomp cmac camellia_generic camellia_x86_64 nf_log_ipv4 nf_log_common xt_LOG ip_set_hash_ip cast6_generic cast5_generic cast_common deflate cts gcm ccm serpent_sse2_x86_64 serpent_generic blowfish_generic blowfish_x86_64 blowfish_common twofish_generic twofish_x86_64_3way xts twofish_x86_64 twofish_common xcbc sha512_ssse3 sha512_generic mcryptd des_generic lrw gf128mul glue_helper ablk_helper tun ip_gre gre 8021q garp mrp stp llc bonding nf_conntrack_ipv6
[ 67.467198] nf_defrag_ipv6 sit tunnel4 ip_tunnel ip6table_filter xt_TCPMSS xt_set ip6table_mangle ip6_tables nf_nat_sip nf_conntrack_sip nf_nat_irc nf_conntrack_irc nf_nat_ftp nf_conntrack_ftp xt_REDIRECT nf_nat_redirect xt_nat iptable_nat nf_nat_ipv4 nf_nat xt_comment nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport xt_conntrack nf_conntrack libcrc32c iptable_filter ip_set_hash_netiface ip_set_hash_netport ip_set nfnetlink vfat fat intel_powerclamp coretemp intel_rapl kvm_intel kvm joydev iTCO_wdt ppdev irqbypass iTCO_vendor_support snd_soc_rt5670 crc32_pclmul snd_soc_rt5645 snd_intel_sst_acpi ghash_clmulni_intel snd_intel_sst_core snd_soc_rt5640 cryptd snd_soc_sst_atom_hifi2_platform snd_soc_rl6231 snd_soc_sst_match snd_soc_core sg hid_logitech_dj snd_compress pcspkr snd_pcm lpc_ich shpchp i2c_i801
[ 67.484440] parport_pc snd_timer parport snd soundcore regmap_i2c i2c_designware_platform i2c_designware_core pwm_lpss auth_rpcgss sunrpc ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic i915 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ahci drm libahci e1000e libata crct10dif_pclmul crct10dif_common crc32c_intel serio_raw ptp pps_core sdhci_acpi sdhci mmc_core video i2c_hid i2c_core iosf_mbi dm_mirror dm_region_hash dm_log dm_mod
[ 67.503184] CPU: 2 PID: 59 Comm: kworker/2:1 Kdump: loaded Not tainted 3.10.0-862.3.2.el7.x86_64 #1
[ 67.509635] Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 07/09/2016
[ 67.516138] Workqueue: events had_audio_wq [snd_hdmi_lpe_audio]
[ 67.522634] task: ffff947034920fd0 ti: ffff947034954000 task.ti: ffff947034954000
[ 67.529167] RIP: 0010:[<ffffffffa03686ab>] [<ffffffffa03686ab>] __list_add+0x1b/0xc0
[ 67.535739] RSP: 0018:ffff947034957d48 EFLAGS: 00010246
[ 67.542273] RAX: 00000000ffffffff RBX: ffff947034957d70 RCX: 0000000000000000
[ 67.548831] RDX: ffff9470316fc908 RSI: 0000000000000000 RDI: ffff947034957d70
[ 67.555265] RBP: ffff947034957d60 R08: 0000000000000000 R09: ae3eceb10defc8e0
[ 67.561484] R10: ae3eceb10defc8e0 R11: 0000000000000001 R12: ffff9470316fc908
[ 67.567577] R13: 0000000000000000 R14: 00000000ffffffff R15: ffff9470316fc908
[ 67.573620] FS: 0000000000000000(0000) GS:ffff94703fd00000(0000) knlGS:0000000000000000
[ 67.579746] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 67.585865] CR2: 0000000000000000 CR3: 00000000b304a000 CR4: 00000000001007e0
[ 67.592060] Call Trace:
[ 67.598274] [<ffffffffa0712c36>] __mutex_lock_slowpath+0xa6/0x1d0
[ 67.604598] [<ffffffffa071203f>] mutex_lock+0x1f/0x2f
[ 67.610936] [<ffffffffc0d1736c>] had_audio_wq+0x5c/0x738 [snd_hdmi_lpe_audio]
[ 67.617343] [<ffffffffa00b312f>] process_one_work+0x17f/0x440
[ 67.623791] [<ffffffffa00b3df6>] worker_thread+0x126/0x3c0
[ 67.630276] [<ffffffffa00b3cd0>] ? manage_workers.isra.24+0x2a0/0x2a0
[ 67.636801] [<ffffffffa00bb161>] kthread+0xd1/0xe0
[ 67.643339] [<ffffffffa00bb090>] ? insert_kthread_work+0x40/0x40
[ 67.649904] [<ffffffffa0720677>] ret_from_fork_nospec_begin+0x21/0x21
[ 67.656520] [<ffffffffa00bb090>] ? insert_kthread_work+0x40/0x40
[ 67.663001] Code: ff ff ff 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 55 48 89 e5 41 55 49 89 f5 41 54 49 89 d4 53 4c 8b 42 08 48 89 fb 49 39 f0 75 2a <4d> 8b 45 00 4d 39 c4 75 68 4c 39 e3 74 3e 4c 39 eb 74 39 49 89
[ 67.670173] RIP [<ffffffffa03686ab>] __list_add+0x1b/0xc0
[ 67.676876] RSP <ffff947034957d48>
[ 67.683552] CR2: 0000000000000000

I have the crashdump and can send it for analysis to a specific person (preferably encrypted), but I'd rather not post it for public viewing... |
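The workaround in the note above can be sketched as a small shell helper. This is only an illustration under stated assumptions: `write_snd_blacklist` is a hypothetical function name, and the three module names and the file path are the ones quoted in the note.

```shell
#!/bin/sh
# Sketch only: write_snd_blacklist is a hypothetical helper, not part of any package.
# It writes the three blacklist entries quoted in the note to the file given as $1.
write_snd_blacklist() {
    target=$1
    cat > "$target" <<'EOF'
blacklist snd-soc-hdac-hdmi
blacklist snd-hdmi-lpe-audio
blacklist snd-hda-codec-hdmi
EOF
}
```

Run as root, `write_snd_blacklist /etc/modprobe.d/snd.conf` followed by a reboot would reproduce the workaround. A `blacklist` entry only stops the module from being autoloaded through its aliases; an explicit `modprobe snd-hdmi-lpe-audio` still loads it, which is why the crash can still be triggered by hand.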
|
And... after a recent microcode update, I now have random crashes with 3.10.0-693.21.1.el7 as well. Rolled back to the "emergency kernel" 3.10.0-327 ... |
|
I'm seeing this as well. | |
Can someone do a test-install of ELRepo's kernel-ml [1] ? The current version is kernel-ml-4.17.0-1.el7.elrepo. This is to find out if the issue reported in this tracker has been fixed in the latest mainline kernel. [1] https://elrepo.org/tiki/kernel-ml |
|
I'm currently running 3.10.0-862.3.2.el7.x86_64 and I have several snd modules blacklisted. When I "modprobe snd-hdmi-lpe-audio" with this kernel, the machine crashes. However, the kernel-ml kernel 4.17.0-1 does not have the module snd-hdmi-lpe-audio, and if I boot that kernel without any blacklisted snd modules, no snd modules get autoloaded. So I doubt this kernel has the relevant options enabled, which, of course, would render this test useless. |
|
Hmm, you are right. kernel-ml does not have this module. Will see if it can be enabled. | |
I have rebuilt kernel-ml with CONFIG_HDMI_LPE_AUDIO=m (kernel-ml-4.17.0-1.ay1.el7.x86_64.rpm) and uploaded it to: http://elrepo.org/people/akemi/testing/el7/kernel/ Please note that the packages are not signed. |
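Whether a given kernel build includes the driver can be checked against its config file before testing. The following is a minimal sketch: `has_lpe_audio` is a hypothetical helper name, and it assumes the kernel package installs its configuration as a plain-text file (on CentOS this is normally `/boot/config-<release>`).

```shell
#!/bin/sh
# Sketch only: has_lpe_audio is a hypothetical helper.
# Succeeds if the kernel config file given as $1 enables
# CONFIG_HDMI_LPE_AUDIO either built-in (y) or as a module (m).
has_lpe_audio() {
    config_file=$1
    grep -Eq '^CONFIG_HDMI_LPE_AUDIO=(y|m)$' "$config_file"
}
```

For the running kernel, `has_lpe_audio "/boot/config-$(uname -r)" && echo enabled` would report whether the option is set. A config that only contains `# CONFIG_HDMI_LPE_AUDIO is not set` does not match.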
|
I can confirm that I, too, am having this problem, with 3.10.0-862.3.2.el7.x86_64, following two fresh installs on new hardware yesterday. I'm sorry all the evidence I have is a photo. Going back to the install kernel (3.10.0-693.el7.x86_64) is a workaround. I have installed kernel-ml-4.17.0-1.ay1.el7.x86_64.rpm as requested and can confirm the issue is not present there, at least for me. |
|
@madhatta Thanks for reporting the result with kernel-ml. Good to know that the fix is in the latest mainline kernel. Hopefully we can identify the patch that takes care of the current problem. |
|
The following upstream patch (commit c77a6edb6d4d35204673cad7389c317bfb17492e ) is a likely candidate that fixes the issue: https://patchwork.kernel.org/patch/10246971/ I built a kernel-plus package using the above patch ( kernel-plus-3.10.0-862.3.2.el7.bug14779.centos.plus.x86_64.rpm ). It is available for testing: https://people.centos.org/toracat/kernel/7/plus/bug14779/ |
|
Unfortunately, the decision was taken to put C6 on the hardware I was working on, for delivery-date reasons. I am expecting more of the same hardware shortly, and will attempt to continue testing and provide feedback at that time. | |
[root@tux ~]# uname -a
Linux tux.leun.net 3.10.0-862.3.2.el7.bug14779.centos.plus.x86_64 #1 SMP Tue Jun 12 22:32:06 PDT 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@tux ~]# modprobe snd-hdmi-lpe-audio

System survived that - crashes without the fix. So, yup, can confirm. Thanks. |
|
@newton Thanks for the test result and the confirmation that the patch worked. The next official update to kernel-plus will include the patch. Now, the next step is to file a bug report with Red Hat at http://bugzilla.redhat.com to get this patch into the RHEL kernel. Then the CentOS kernel will inherit it. |
|
kernel-plus-3.10.0-862.3.3.el7.centos.plus has been released. It has the patch from this bug report. | |
Well, this didn't solve it for me (the latest 3.10.0-862.3.3.el7 kernel still crashes my system). I guess I'll have to figure out how to get the crash log and make a new bug report... | |
You need the centosplus kernel not the main distro one. | |
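One quick way to tell the two kernels apart is the release string, since the plus builds quoted in this report all carry `centos.plus` in their names. A minimal sketch, with `is_centosplus_kernel` as a hypothetical helper name:

```shell
#!/bin/sh
# Sketch only: is_centosplus_kernel is a hypothetical helper.
# Succeeds if the kernel release string in $1 contains ".centos.plus.",
# as the plus kernel names quoted in this report do.
is_centosplus_kernel() {
    case $1 in
        *.centos.plus.*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Calling `is_centosplus_kernel "$(uname -r)"` on the booted system shows whether the plus kernel is actually the one running, as opposed to merely installed.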
Found and installed centosplus kernel, but it too crashes on boot. My system must have a slightly different bug than the one fixed... |
|
@mikerotec If the provided plus kernel does not fix the crash, your problem is most likely different. You might want to file a new bug report with all the details. |
|
RHBZ status: Originally opened as #1598592 (private) but closed as a duplicate of #1551742 (private). Currently marked "Verified". |
|
I am planning to build a new test version of kernel-plus that might fix other ALSA-related issues. | |
The following two patches have been added:

commit 1967158fff819b38f4e46763ca8df067b4b69f59 "ALSA: x86: fix error return code in hdmi_lpe_audio_probe()"
commit 7229b12f5da33d5c376ee264f063703844b8092d "ALSA: x86: hdmi: Add single_port option for compatible behavior"

A patched set of kernel-plus is available for testing: https://people.centos.org/toracat/kernel/7/plus/bug14779_2/ (kernel-plus-3.10.0-862.14.4.el7.centos.plus.6)

Feedback appreciated. |
|
RHEL 7.6 is out. The distro kernel (3.10.0-957.el7) now has all three patches in this bug report. Closing as 'resolved'. |
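Based on the closing note, a distro kernel can be checked against the 3.10.0-957.el7 threshold as sketched below. `distro_kernel_has_fixes` is a hypothetical helper name, and the check assumes an unmodified el7 distro kernel whose release string starts with `3.10.0-`; it says nothing about kernel-plus or kernel-ml builds, which carry the patches on a different schedule.

```shell
#!/bin/sh
# Sketch only: distro_kernel_has_fixes is a hypothetical helper.
# Returns 0 if the el7 distro kernel release in $1 is 3.10.0-957 or later,
# 1 if it is an older 3.10.0 build, and 2 if it is not a 3.10.0-* release.
distro_kernel_has_fixes() {
    case $1 in
        3.10.0-[0-9]*) ;;
        *) return 2 ;;
    esac
    rel=${1#3.10.0-}      # e.g. "957.el7.x86_64"
    build=${rel%%.*}      # e.g. "957"
    [ "$build" -ge 957 ]
}
```

For example, `distro_kernel_has_fixes "$(uname -r)"` succeeds on 3.10.0-957.el7.x86_64 and fails on 3.10.0-862.14.4.el7.x86_64.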
|
Date Modified | Username | Field | Change |
---|---|---|---|
2018-05-12 10:08 | newton | New Issue | |
2018-05-12 10:08 | newton | File Added: Screenshot_20180512_110722a.png | |
2018-05-12 10:08 | newton | Tag Attached: kerneloops | |
2018-05-15 23:26 | mikerotec | Note Added: 0031835 | |
2018-05-22 20:54 | mikerotec | Note Added: 0031893 | |
2018-05-24 17:08 | tomkep | Note Added: 0031912 | |
2018-06-04 20:39 | mikerotec | Note Added: 0032006 | |
2018-06-06 16:14 | jsmith | Note Added: 0032023 | |
2018-06-10 15:58 | toracat | Note Added: 0032050 | |
2018-06-11 22:54 | toracat | Status | new => feedback |
2018-06-11 23:28 | newton | Note Added: 0032058 | |
2018-06-11 23:28 | newton | Status | feedback => assigned |
2018-06-12 00:39 | toracat | Note Added: 0032059 | |
2018-06-12 04:59 | toracat | Note Added: 0032060 | |
2018-06-12 07:52 | madhatta | File Added: IMG_4136.JPG | |
2018-06-12 07:52 | madhatta | Note Added: 0032062 | |
2018-06-12 14:40 | toracat | Note Added: 0032067 | |
2018-06-13 06:00 | toracat | Note Added: 0032074 | |
2018-06-13 06:58 | madhatta | Note Added: 0032075 | |
2018-06-13 07:19 | newton | Note Added: 0032076 | |
2018-06-13 16:16 | toracat | Note Added: 0032079 | |
2018-06-16 05:50 | toracat | Note Added: 0032099 | |
2018-06-18 22:59 | mikerotec | Note Added: 0032108 | |
2018-06-18 23:12 | TrevorH | Note Added: 0032109 | |
2018-06-19 00:03 | mikerotec | File Added: 20180618_164727_001.jpg | |
2018-06-19 00:03 | mikerotec | File Added: 01_20180618_164347.jpg | |
2018-06-19 00:03 | mikerotec | File Added: 02_20180618_164450.jpg | |
2018-06-19 00:03 | mikerotec | Note Added: 0032110 | |
2018-07-05 18:35 | toracat | Note Added: 0032183 | |
2018-08-10 17:23 | toracat | Note Added: 0032461 | |
2018-10-10 15:15 | toracat | Note Added: 0032902 | |
2018-10-13 01:03 | toracat | Note Added: 0032918 | |
2018-10-30 21:54 | toracat | Status | assigned => resolved |
2018-10-30 21:54 | toracat | Resolution | open => fixed |
2018-10-30 21:54 | toracat | Note Added: 0033023 |