View Issue Details

ID: 0016367
Project: CentOS-7
Category: kernel
View Status: public
Last Update: 2019-08-29 10:50
Reporter: phloks
Priority: normal
Severity: crash
Reproducibility: random
Status: new
Resolution: open
Platform: X86_64
OS: CentOS
OS Version: 7.6.1810
Product Version: 7.6.1810
Target Version:
Fixed in Version:
Summary: 0016367: Kernel crashes but not clear why
Description: One of our machines has had a couple of kernel panics. We cannot reproduce the error, so all we have are the kernel crash dump logs.
Can anyone tell us why this particular event happens?
I see three processes (dotnet and 2x swapper), but which one is the culprit?
BTW, the kernel version is 3.10.0-957.21.3.
------------------------------------------------------------------------------------------
crash ./usr/lib/debug/lib/modules/3.10.0-957.21.3.el7.x86_64/vmlinux vmcore

crash 7.2.3-8.el7
Copyright (C) 2002-2017 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.

GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...

WARNING: kernel relocated [596MB]: patching 85670 gdb minimal_symbol values

      KERNEL: ./usr/lib/debug/lib/modules/3.10.0-957.21.3.el7.x86_64/vmlinux
    DUMPFILE: vmcore [PARTIAL DUMP]
        CPUS: 3
        DATE: Tue Aug 27 17:28:12 2019
      UPTIME: 39 days, 08:49:22
LOAD AVERAGE: 1.85, 2.62, 2.90
       TASKS: 1728
    NODENAME: ***********************
     RELEASE: 3.10.0-957.21.3.el7.x86_64
     VERSION: #1 SMP Tue Jun 18 16:35:19 UTC 2019
     MACHINE: x86_64 (2599 Mhz)
      MEMORY: 16 GB
       PANIC: "BUG: unable to handle kernel NULL pointer dereference at 0000000000000018"
         PID: 18609
     COMMAND: "dotnet"
        TASK: ffff8d5f2822a080 [THREAD_INFO: ffff8d5d4cb2c000]
         CPU: 0
       STATE: TASK_RUNNING (PANIC)

crash> bt
PID: 18609 TASK: ffff8d5f2822a080 CPU: 0 COMMAND: "dotnet"
 #0 [ffff8d5f6d6039c0] machine_kexec at ffffffffa6463934
 #1 [ffff8d5f6d603a20] __crash_kexec at ffffffffa651d162
 #2 [ffff8d5f6d603af0] crash_kexec at ffffffffa651d250
 #3 [ffff8d5f6d603b08] oops_end at ffffffffa6b6d778
 #4 [ffff8d5f6d603b30] no_context at ffffffffa6b5bdbe
 #5 [ffff8d5f6d603b80] __bad_area_nosemaphore at ffffffffa6b5be55
 #6 [ffff8d5f6d603bd0] bad_area_nosemaphore at ffffffffa6b5bfc6
 #7 [ffff8d5f6d603be0] __do_page_fault at ffffffffa6b706d0
 #8 [ffff8d5f6d603c50] do_page_fault at ffffffffa6b70925
 #9 [ffff8d5f6d603c80] page_fault at ffffffffa6b6c768
    [exception RIP: swiotlb_unmap_sg_attrs+40]
    RIP: ffffffffa67a1188 RSP: ffff8d5f6d603d30 RFLAGS: 00010093
    RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002
    RDX: 0000000000001000 RSI: 00000001a94f7000 RDI: ffff8d5d30f13580
    RBP: ffff8d5f6d603d58 R8: 0000000000000000 R9: ffffffffa67a1160
    R10: ffff8d5f6a343760 R11: 00007fb6f8d5e000 R12: 000000000000000d
    R13: 000000000000000f R14: 0000000000000002 R15: ffff8d5f6f56f098
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000
#10 [ffff8d5f6d603d60] scsi_dma_unmap at ffffffffa68dc051
#11 [ffff8d5f6d603d70] mptscsih_io_done at ffffffffc035944a [mptscsih]
#12 [ffff8d5f6d603de8] mpt_interrupt at ffffffffc02a6ca4 [mptbase]
#13 [ffff8d5f6d603eb0] __handle_irq_event_percpu at ffffffffa654a7f4
#14 [ffff8d5f6d603ef8] handle_irq_event_percpu at ffffffffa654a9a2
#15 [ffff8d5f6d603f28] handle_irq_event at ffffffffa654aa2c
#16 [ffff8d5f6d603f50] handle_fasteoi_irq at ffffffffa654e089
#17 [ffff8d5f6d603f70] handle_irq at ffffffffa642e554
#18 [ffff8d5f6d603fb8] do_IRQ at ffffffffa6b7a5fd
--- <IRQ stack> ---
#19 [ffff8d5d4cb2ff58] ret_from_intr at ffffffffa6b6c362
    RIP: 00007fb989e1f8dd RSP: 00007fb8f7f9e558 RFLAGS: 00000246
    RAX: 0000000000008eac RBX: 000000000001f0be RCX: 00007fb9123c1300
    RDX: 00007fb7fd089720 RSI: 0000000000008eab RDI: 00007fb6fb523b30
    RBP: 00007fb8f7f9e5b0 R8: 00007fb710d60000 R9: 0000000000000003
    R10: 00007fb9126cd1a8 R11: 00007fb6f8d5e000 R12: ffff8d5f6d6061e8
    R13: 00007fb78d894118 R14: 00007fb91331b000 R15: 00007fb91331b0d0
    ORIG_RAX: ffffffffffffffbb CS: 0033 SS: 002b
crash> bt -a
PID: 18609 TASK: ffff8d5f2822a080 CPU: 0 COMMAND: "dotnet"
 #0 [ffff8d5f6d6039c0] machine_kexec at ffffffffa6463934
 #1 [ffff8d5f6d603a20] __crash_kexec at ffffffffa651d162
 #2 [ffff8d5f6d603af0] crash_kexec at ffffffffa651d250
 #3 [ffff8d5f6d603b08] oops_end at ffffffffa6b6d778
 #4 [ffff8d5f6d603b30] no_context at ffffffffa6b5bdbe
 #5 [ffff8d5f6d603b80] __bad_area_nosemaphore at ffffffffa6b5be55
 #6 [ffff8d5f6d603bd0] bad_area_nosemaphore at ffffffffa6b5bfc6
 #7 [ffff8d5f6d603be0] __do_page_fault at ffffffffa6b706d0
 #8 [ffff8d5f6d603c50] do_page_fault at ffffffffa6b70925
 #9 [ffff8d5f6d603c80] page_fault at ffffffffa6b6c768
    [exception RIP: swiotlb_unmap_sg_attrs+40]
    RIP: ffffffffa67a1188 RSP: ffff8d5f6d603d30 RFLAGS: 00010093
    RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000002
    RDX: 0000000000001000 RSI: 00000001a94f7000 RDI: ffff8d5d30f13580
    RBP: ffff8d5f6d603d58 R8: 0000000000000000 R9: ffffffffa67a1160
    R10: ffff8d5f6a343760 R11: 00007fb6f8d5e000 R12: 000000000000000d
    R13: 000000000000000f R14: 0000000000000002 R15: ffff8d5f6f56f098
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000
#10 [ffff8d5f6d603d60] scsi_dma_unmap at ffffffffa68dc051
#11 [ffff8d5f6d603d70] mptscsih_io_done at ffffffffc035944a [mptscsih]
#12 [ffff8d5f6d603de8] mpt_interrupt at ffffffffc02a6ca4 [mptbase]
#13 [ffff8d5f6d603eb0] __handle_irq_event_percpu at ffffffffa654a7f4
#14 [ffff8d5f6d603ef8] handle_irq_event_percpu at ffffffffa654a9a2
#15 [ffff8d5f6d603f28] handle_irq_event at ffffffffa654aa2c
#16 [ffff8d5f6d603f50] handle_fasteoi_irq at ffffffffa654e089
#17 [ffff8d5f6d603f70] handle_irq at ffffffffa642e554
#18 [ffff8d5f6d603fb8] do_IRQ at ffffffffa6b7a5fd
--- <IRQ stack> ---
#19 [ffff8d5d4cb2ff58] ret_from_intr at ffffffffa6b6c362
    RIP: 00007fb989e1f8dd RSP: 00007fb8f7f9e558 RFLAGS: 00000246
    RAX: 0000000000008eac RBX: 000000000001f0be RCX: 00007fb9123c1300
    RDX: 00007fb7fd089720 RSI: 0000000000008eab RDI: 00007fb6fb523b30
    RBP: 00007fb8f7f9e5b0 R8: 00007fb710d60000 R9: 0000000000000003
    R10: 00007fb9126cd1a8 R11: 00007fb6f8d5e000 R12: ffff8d5f6d6061e8
    R13: 00007fb78d894118 R14: 00007fb91331b000 R15: 00007fb91331b0d0
    ORIG_RAX: ffffffffffffffbb CS: 0033 SS: 002b

PID: 0 TASK: ffff8d5cb9ca4100 CPU: 1 COMMAND: "swapper/1"
 #0 [ffff8d5f6d648e48] crash_nmi_callback at ffffffffa6455f97
 #1 [ffff8d5f6d648e58] nmi_handle at ffffffffa6b6d91c
 #2 [ffff8d5f6d648eb0] do_nmi at ffffffffa6b6db3d
 #3 [ffff8d5f6d648ef0] end_repeat_nmi at ffffffffa6b6cd89
    [exception RIP: native_safe_halt+11]
    RIP: ffffffffa6b6af5b RSP: ffff8d5cb9d17ea8 RFLAGS: 00000246
    RAX: ffffffffa6b6ad30 RBX: ffffffffa7158720 RCX: 0100000000000000
    RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000046
    RBP: ffff8d5cb9d17ea8 R8: 0000000000000000 R9: 0000000000000001
    R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000001
    R13: ffff8d5cb9d14000 R14: ffff8d5cb9d14000 R15: ffff8d5cb9d14000
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
--- <NMI exception stack> ---
 #4 [ffff8d5cb9d17ea8] native_safe_halt at ffffffffa6b6af5b
 #5 [ffff8d5cb9d17eb0] default_idle at ffffffffa6b6ad4e
 #6 [ffff8d5cb9d17ed0] arch_cpu_idle at ffffffffa64366f0
 #7 [ffff8d5cb9d17ee0] cpu_startup_entry at ffffffffa64fc6da
 #8 [ffff8d5cb9d17f28] start_secondary at ffffffffa6458047
 #9 [ffff8d5cb9d17f50] start_cpu at ffffffffa64000d5

PID: 0 TASK: ffff8d5cb9ca5140 CPU: 2 COMMAND: "swapper/2"
 #0 [ffff8d5f6d688e48] crash_nmi_callback at ffffffffa6455f97
 #1 [ffff8d5f6d688e58] nmi_handle at ffffffffa6b6d91c
 #2 [ffff8d5f6d688eb0] do_nmi at ffffffffa6b6db3d
 #3 [ffff8d5f6d688ef0] end_repeat_nmi at ffffffffa6b6cd89
    [exception RIP: native_safe_halt+11]
    RIP: ffffffffa6b6af5b RSP: ffff8d5cb9d1bea8 RFLAGS: 00000246
    RAX: ffffffffa6b6ad30 RBX: ffffffffa7158720 RCX: 0100000000000000
    RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000046
    RBP: ffff8d5cb9d1bea8 R8: 0000000000000000 R9: 0000000000000001
    R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
    R13: ffff8d5cb9d18000 R14: ffff8d5cb9d18000 R15: ffff8d5cb9d18000
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
--- <NMI exception stack> ---
 #4 [ffff8d5cb9d1bea8] native_safe_halt at ffffffffa6b6af5b
 #5 [ffff8d5cb9d1beb0] default_idle at ffffffffa6b6ad4e
 #6 [ffff8d5cb9d1bed0] arch_cpu_idle at ffffffffa64366f0
 #7 [ffff8d5cb9d1bee0] cpu_startup_entry at ffffffffa64fc6da
 #8 [ffff8d5cb9d1bf28] start_secondary at ffffffffa6458047
 #9 [ffff8d5cb9d1bf50] start_cpu at ffffffffa64000d5
crash> ps -A
   PID    PPID  CPU       TASK        ST  %MEM     VSZ      RSS    COMM
>     0      0   1  ffff8d5cb9ca4100  RU   0.0        0        0  [swapper/1]
>     0      0   2  ffff8d5cb9ca5140  RU   0.0        0        0  [swapper/2]
> 18609   8033   0  ffff8d5f2822a080  RU  30.4 10729424  5425104  dotnet
crash> ps 18609
   PID    PPID  CPU       TASK        ST  %MEM     VSZ      RSS    COMM
> 18609   8033   0  ffff8d5f2822a080  RU  30.4 10729424  5425104  dotnet
crash> ps -a 18609
PID: 18609 TASK: ffff8d5f2822a080 CPU: 0 COMMAND: "dotnet"
ps: cannot access user stack address: 7fffeb212a28
crash> quit
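
As a reading aid for the backtraces above, here is a minimal Python sketch that pulls three facts out of a `bt` excerpt: the function at the exception RIP, the frames that belong to loadable modules, and the privilege ring held in the low two bits of each saved CS value. The helper name `summarize` and the regular expressions are illustrative assumptions, not part of the crash utility, and the excerpt is trimmed from the CPU 0 trace.

```python
import re

# Excerpt trimmed from the CPU 0 `bt` output in this report (PID 18609).
BT = """\
 #9 [ffff8d5f6d603c80] page_fault at ffffffffa6b6c768
    [exception RIP: swiotlb_unmap_sg_attrs+40]
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000
#10 [ffff8d5f6d603d60] scsi_dma_unmap at ffffffffa68dc051
#11 [ffff8d5f6d603d70] mptscsih_io_done at ffffffffc035944a [mptscsih]
#12 [ffff8d5f6d603de8] mpt_interrupt at ffffffffc02a6ca4 [mptbase]
#18 [ffff8d5f6d603fb8] do_IRQ at ffffffffa6b7a5fd
--- <IRQ stack> ---
#19 [ffff8d5d4cb2ff58] ret_from_intr at ffffffffa6b6c362
    ORIG_RAX: ffffffffffffffbb CS: 0033 SS: 002b
"""

# Hypothetical patterns for the line shapes seen in this report only.
FRAME = re.compile(r"^\s*#(\d+) \[[0-9a-f]+\] (\S+) at [0-9a-f]+(?: \[(\w+)\])?$")
EXC = re.compile(r"\[exception RIP: (\S+)\+(\d+)\]")
CS = re.compile(r"\bCS: ([0-9a-f]{4})\b")


def summarize(bt_text):
    """Return the faulting symbol, module-owned frames, and CS ring levels."""
    faulting = None
    module_frames = []
    rings = []
    for line in bt_text.splitlines():
        exc = EXC.search(line)
        if exc and faulting is None:
            # (symbol, decimal byte offset into that symbol)
            faulting = (exc.group(1), int(exc.group(2)))
        frame = FRAME.match(line)
        if frame and frame.group(3):
            # Frames tagged with [module] live in a loadable kernel module.
            module_frames.append((frame.group(2), frame.group(3)))
        cs = CS.search(line)
        if cs:
            # Low two bits of CS are the privilege level: 0 = kernel, 3 = user.
            rings.append(int(cs.group(1), 16) & 3)
    return {"faulting": faulting, "module_frames": module_frames, "rings": rings}


result = summarize(BT)
print(result)
```

Read this way, the fault is a kernel-mode (ring 0) access in swiotlb_unmap_sg_attrs, reached from the mptscsih/mptbase interrupt handler, while the saved dotnet context is user mode (ring 3). That suggests dotnet was only the task running on CPU 0 when the interrupt arrived, and that the two swapper entries are idle CPUs stopped by the crash NMI. It does not by itself identify the root cause.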

Steps To Reproduce: Not possible
Tags: No tags attached.
abrt_hash:
URL:

Activities

TrevorH (manager), 2019-08-29 10:50, note ~0035031

I would suggest updating to the latest kernel, currently 3.10.0-957.27.2.el7. CentOS 7.7 is in the works with 3.10.0-1062.el7, which includes a rebase of the mpt3sas driver. Only the current version is supported.
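
To check whether a machine is behind the kernels named in this note, a rough Python sketch like the one below can order the release strings. The helper `kernel_key` is hypothetical and only compares the numeric fields before the `.elN` tag; it is not the full RPM version comparison algorithm, so treat it as a quick sanity check for strings of this shape.

```python
import re


def kernel_key(release):
    """Turn e.g. '3.10.0-957.21.3.el7.x86_64' into a tuple of ints.

    Assumption: the release is numeric fields separated by '.' or '-',
    optionally followed by a '.elN' dist tag and architecture suffix.
    """
    core = re.split(r"\.el\d+", release)[0]
    return tuple(int(part) for part in re.split(r"[.-]", core))


running = "3.10.0-957.21.3.el7.x86_64"  # RELEASE line of the crash dump
latest_7_6 = "3.10.0-957.27.2.el7"      # named in the note as current
upcoming_7_7 = "3.10.0-1062.el7"        # named in the note for 7.7

if kernel_key(running) < kernel_key(latest_7_6):
    print("running kernel is older than", latest_7_6)
```

Tuples compare element by element, so 957.21.3 sorts before 957.27.2, and both sort before 1062.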

Issue History

Date Modified      Username   Field                  Change
2019-08-29 10:42   phloks     New Issue
2019-08-29 10:50   TrevorH    Note Added: 0035031