View Issue Details

ID: 0014265
Project: CentOS-7
Category: kernel
View Status: public
Last Update: 2018-04-16 12:55
Reporter: kenh
Priority: normal
Severity: minor
Reproducibility: always
Status: new
Resolution: open
Platform: x86_64
Product Version: 7.4.1708
Summary: 0014265: Crash when console serial port disconnected and after certain amount of runtime
Description: I have an APU2 from PC Engines. It only has a serial port available to use as a console. When I have another machine connected to the serial port, it stays up fine, for weeks. However, when I disconnect the serial port, after a while (the time is indeterminate; might be 15 minutes, might be a few days), without fail, the machine will crash. The message is:

[ 1300.768780] BUG: unable to handle kernel NULL pointer dereference at 0000000000000150
[ 1300.776424] IP: [<ffffffff81410dc3>] uart_write_room+0x13/0x50
[ 1300.782318] PGD 0
[ 1300.784372] Oops: 0000 [#1] SMP
[ 1300.787667] Modules linked in: edac_mce_amd edac_core kvm_amd kvm irqbypass rc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helpe cryptd sg pcspkr i2c_piix4 fam15h_power k10temp ccp shpchp acpi_cpufreq ip_tabes xfs libcrc32c sd_mod crc_t10dif crct10dif_generic ahci igb libahci sdhci_pcilibata sdhci mmc_core ptp pps_core i2c_algo_bit i2c_core crct10dif_pclmul crct1dif_common crc32c_intel dca
[ 1300.826037] CPU: 3 PID: 2041 Comm: kworker/3:0 Not tainted 3.10.0-693.5.2.el.x86_64 #1
[ 1300.834045] Hardware name: PC Engines apu2/apu2, BIOS 88a4f96 03/07/2016
[ 1300.840763] Workqueue: events flush_to_ldisc
[ 1300.845074] task: ffff880118469fa0 ti: ffff880118b6c000 task.ti: ffff880118bc000
[ 1300.852559] RIP: 0010:[<ffffffff81410dc3>] [<ffffffff81410dc3>] uart_write_room+0x13/0x50
[ 1300.860865] RSP: 0018:ffff880118b6fc60 EFLAGS: 00010282
[ 1300.866186] RAX: 0000000000000800 RBX: 00000000000000ff RCX: ffff880118b6ffd
[ 1300.873345] RDX: ffffffff81410db0 RSI: ffff88011552bc00 RDI: ffff88011552940
[ 1300.880493] RBP: ffff880118b6fc70 R08: ffff88011806b400 R09: dffb32de1548c00
[ 1300.887644] R10: dffb32de1548c000 R11: 0000000000000001 R12: 000000000000000
[ 1300.894786] R13: ffff880115529400 R14: ffff88011552bc00 R15: 000000000000000
[ 1300.901942] FS: 00007faff69e6940(0000) GS:ffff88011ed80000(0000) knlGS:000000000000000
[ 1300.910032] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1300.915788] CR2: 0000000000000150 CR3: 00000000c1516000 CR4: 00000000000407e
[ 1300.922929] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 000000000000000
[ 1300.930071] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 000000000000040
[ 1300.937211] Stack:
[ 1300.939232] 00000000000000ff ffff88011552bc00 ffff880118b6fc80 ffffffff813fead
[ 1300.946738] ffff880118b6fcd8 ffffffff813f5e0e ffff88011552bec8 ffff88011552ea0
[ 1300.954236] ffff88011552bc00 ffff880118b6fcd8 00000000000000ff ffff88011552c00
[ 1300.961748] Call Trace:
[ 1300.964227] [<ffffffff813f8ead>] tty_write_room+0x1d/0x20
[ 1300.969742] [<ffffffff813f5e0e>] process_echoes+0x6e/0x2e0
[ 1300.975340] [<ffffffff813f7d0f>] n_tty_receive_char+0x18f/0xe30
[ 1300.981371] [<ffffffff810ca2ae>] ? account_entity_dequeue+0xae/0xd0
[ 1300.987743] [<ffffffff813f8b54>] n_tty_receive_buf+0x1a4/0x470
[ 1300.993690] [<ffffffff810ce55e>] ? dequeue_task_fair+0x41e/0x660
[ 1300.999799] [<ffffffff810cb63c>] ? set_next_entity+0x3c/0xe0
[ 1301.005573] [<ffffffff81029557>] ? __switch_to+0xd7/0x510
[ 1301.011074] [<ffffffff813fb849>] flush_to_ldisc+0x109/0x160
[ 1301.016754] [<ffffffff810a882a>] process_one_work+0x17a/0x440
[ 1301.022605] [<ffffffff810a94f6>] worker_thread+0x126/0x3c0
[ 1301.028193] [<ffffffff810a93d0>] ? manage_workers.isra.24+0x2a0/0x2a0
[ 1301.034754] [<ffffffff810b099f>] kthread+0xcf/0xe0
[ 1301.039651] [<ffffffff810b08d0>] ? insert_kthread_work+0x40/0x40
[ 1301.045755] [<ffffffff816b4fd8>] ret_from_fork+0x58/0x90
[ 1301.051168] [<ffffffff810b08d0>] ? insert_kthread_work+0x40/0x40
[ 1301.057276] Code: 01 00 00 81 e3 ff 0f 00 00 e8 4a ab 29 00 89 d8 5b 41 5c 5 c3 0f 1f 00 66 66 66 66 90 55 48 89 e5 41 54 53 4c 8b a7 18 02 00 00 <49> 8b b 24 50 01 00 00 e8 90 ae 29 00 41 8b 9c 24 48 01 00 00
[ 1301.077815] RIP [<ffffffff81410dc3>] uart_write_room+0x13/0x50
[ 1301.083779] RSP <ffff880118b6fc60>
[ 1301.087278] CR2: 0000000000000150
Steps To Reproduce: Boot machine with serial port attached; disconnect and let run for a while.
Additional Information: So, I tried to dig into this some more. The address of the crash is uart_write_room+0x13, which is this routine (line 550, specifically):

545 {
546     struct uart_state *state = tty->driver_data;
547     unsigned long flags;
548     int ret;
549
550     spin_lock_irqsave(&state->uart_port->lock, flags);
551     ret = uart_circ_chars_free(&state->xmit);
552     spin_unlock_irqrestore(&state->uart_port->lock, flags);
553     return ret;
554 }

Disassembling that gives:

0xffffffff81410db0 <uart_write_room>: data32 data32 data32 xchg %ax,%ax [FTRACE NOP]
0xffffffff81410db5 <uart_write_room+5>: push %rbp
0xffffffff81410db6 <uart_write_room+6>: mov %rsp,%rbp
0xffffffff81410db9 <uart_write_room+9>: push %r12
0xffffffff81410dbb <uart_write_room+11>: push %rbx
/usr/src/debug/kernel-3.10.0-693.5.2.el7/linux-3.10.0-693.5.2.el7.x86_64/drivers/tty/serial/serial_core.c: 546
0xffffffff81410dbc <uart_write_room+12>: mov 0x218(%rdi),%r12
/usr/src/debug/kernel-3.10.0-693.5.2.el7/linux-3.10.0-693.5.2.el7.x86_64/drivers/tty/serial/serial_core.c: 550
0xffffffff81410dc3 <uart_write_room+19>: mov 0x150(%r12),%rdi

Alright, so since the crash reported a NULL pointer dereference at 0000000000000150, and the register dump shows %r12 as 0, the faulting instruction makes sense: it reads from 0x150 past a NULL "state" pointer. But ... I can't figure out why %r12 is 0 in the first place.

The single argument to that function, a struct tty_struct, is passed in RDI, which is ffff880115529400. "driver_data" is located at offset 0x218, so THAT corresponds to mov 0x218(%rdi),%r12. But ...

crash> struct tty_struct.driver_data ffff880115529400
  driver_data = 0xffff88011548c000

So R12 should be loaded with 0xffff88011548c000. But it is not; it's clearly set to NULL according to the register dump and the panic. So I'm not sure what is causing this.
Tags: No tags attached.

Activities

NetForces

2018-04-16 12:55

reporter   ~0031620

Just got this same issue last night on a server running 7.4.1708 (kernel 3.10.0-693.11.6.el7.x86_64).

The system has a switch attached to its serial port, but no service or code is attached to it. Looking at the switch logs, the switch did not reboot or anything.

Here is the crash report if it helps:

reason: BUG: unable to handle kernel NULL pointer dereference at 0000000000000150
component: kernel
count: 1
analyzer: vmcore
architecture: x86_64
event_log:
kernel: 3.10.0-693.11.6.el7.x86_64
kernel_tainted_short: GOE
last_occurrence: 1523840255
not-reportable: A kernel problem occurred, but your kernel has been tainted (flags:GOE). Kernel maintainers are unable to diagnose tainted reports.
os_release: CentOS Linux release 7.4.1708 (Core)
runlevel: unknown
time: Sun 15 Apr 2018 08:57:35 PM EDT
type: vmcore
uid: 0
username: root
uuid: d6f26972c7aada7491cc27b46c1091a4bc3a280b

backtrace:
:BUG: unable to handle kernel NULL pointer dereference at 0000000000000150
:IP: [<ffffffff81412e93>] uart_write_room+0x13/0x50
:PGD 0
:Oops: 0000 [#1] SMP
:Modules linked in: udp_diag tcp_diag inet_diag xt_comment sch_ingress act_mirred cls_u32 sch_hfsc ifb ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_REDIRECT nf_nat_redirect ipt_REJECT nf_reject_ipv4 xt_connlimit xt_conntrack nf_log_ipv4 nf_log_common xt_LOG xt_nat xt_multiport xt_DSCP xt_mark xt_set(OE) iptable_mangle 8021q garp mrp bridge stp llc nls_utf8 isofs fuse btrfs raid6_pq xor vfat msdos fat xfs iptable_nat nf_nat_ipv4 ip_set_hash_ip(OE) ip_set_hash_mac(OE) ip_set(OE) nfnetlink nf_nat_pptp nf_nat_proto_gre nf_conntrack_pptp nf_conntrack_proto_gre nf_nat_h323 nf_conntrack_h323 nf_nat iptable_filter nf_conntrack_ipv4 nf_defrag_ipv4 nf_conntrack libcrc32c dm_mirror dm_region_hash dm_log dm_mod sr_mod cdrom intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul
: ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd joydev pcspkr sg i2c_i801 shpchp ipmi_si ipmi_devintf ipmi_msghandler video acpi_power_meter acpi_pad ip_tables ext4 mbcache jbd2 uas usb_storage sd_mod crc_t10dif crct10dif_generic drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm crct10dif_pclmul ahci crct10dif_common libahci crc32c_intel igb libata ptp pps_core dca i2c_algo_bit i2c_hid i2c_core
:CPU: 2 PID: 12731 Comm: kworker/2:0 Tainted: G OE ------------ 3.10.0-693.11.6.el7.x86_64 #1
:Hardware name: Seneca PRO591771/X11SSH-LN4F, BIOS 2.0 12/16/2016
:Workqueue: events flush_to_ldisc
:task: ffff880267b03f40 ti: ffff880261a0c000 task.ti: ffff880261a0c000
:RIP: 0010:[<ffffffff81412e93>] [<ffffffff81412e93>] uart_write_room+0x13/0x50
:RSP: 0018:ffff880261a0fc60 EFLAGS: 00010282
:RAX: 0000000000000800 RBX: 000000000000000a RCX: ffff880261a0ffd8
:RDX: ffffffff81412e80 RSI: ffff8800363d2800 RDI: ffff8800363d1c00
:RBP: ffff880261a0fc70 R08: ffff880266b84000 R09: dff5d5d46c8b8000
:R10: dff5d5d46c8b8000 R11: 0000000000000001 R12: 0000000000000000
:R13: ffff8800363d1c00 R14: ffff8800363d2800 R15: ffff880266b840a4
:FS: 0000000000000000(0000) GS:ffff880277900000(0000) knlGS:0000000000000000
:CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
:CR2: 0000000000000150 CR3: 00000002774a6000 CR4: 00000000003607e0
:DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
:DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
:Call Trace:
: [<ffffffff813faf7d>] tty_write_room+0x1d/0x20
: [<ffffffff813f7ede>] process_echoes+0x6e/0x2e0
: [<ffffffff816a9832>] ? mutex_lock+0x12/0x2f
: [<ffffffff813fa99c>] n_tty_receive_char+0xd4c/0xe30
: [<ffffffff810cf98c>] ? dequeue_entity+0x11c/0x5d0
: [<ffffffff813fac24>] n_tty_receive_buf+0x1a4/0x470
: [<ffffffff810d025e>] ? dequeue_task_fair+0x41e/0x660
: [<ffffffff8102954d>] ? __switch_to+0xcd/0x500
: [<ffffffff813fd919>] flush_to_ldisc+0x109/0x160
: [<ffffffff810aa3ba>] process_one_work+0x17a/0x440
: [<ffffffff810ab086>] worker_thread+0x126/0x3c0
: [<ffffffff810aaf60>] ? manage_workers.isra.24+0x2a0/0x2a0
: [<ffffffff810b252f>] kthread+0xcf/0xe0
: [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
: [<ffffffff816b8798>] ret_from_fork+0x58/0x90
: [<ffffffff810b2460>] ? insert_kthread_work+0x40/0x40
:Code: 01 00 00 81 e3 ff 0f 00 00 e8 8a ac 29 00 89 d8 5b 41 5c 5d c3 0f 1f 00 0f 1f 44 00 00 55 48 89 e5 41 54 53 4c 8b a7 18 02 00 00 <49> 8b bc 24 50 01 00 00 e8 d0 af 29 00 41 8b 9c 24 48 01 00 00
:RIP [<ffffffff81412e93>] uart_write_room+0x13/0x50
: RSP <ffff880261a0fc60>
kernel_tainted_long:
:Proprietary module has not been loaded.
:Out-of-tree module has been loaded.
machineid:
:systemd=42f0e5daceb243b8a8e01b8e71c10828
:sosreport_uploader-dmidecode=9e111f66b1bf92391d162b84051aa2c0c57b95b1f98b3d908e6accd1babc39f1
os_info:
:NAME="CentOS Linux"
:VERSION="7 (Core)"
:ID="centos"
:ID_LIKE="rhel fedora"
:VERSION_ID="7"
:PRETTY_NAME="CentOS Linux 7 (Core)"
:ANSI_COLOR="0;31"
:CPE_NAME="cpe:/o:centos:centos:7"
:HOME_URL="https://www.centos.org/"
:BUG_REPORT_URL="https://bugs.centos.org/"
:
:CENTOS_MANTISBT_PROJECT="CentOS-7"
:CENTOS_MANTISBT_PROJECT_VERSION="7"
:REDHAT_SUPPORT_PRODUCT="centos"
:REDHAT_SUPPORT_PRODUCT_VERSION="7"

Issue History

Date Modified Username Field Change
2017-12-12 05:06 kenh New Issue
2018-04-16 12:55 NetForces Note Added: 0031620