View Issue Details

ID: 0016954
Project: CentOS-7
Category: kernel
View Status: public
Last Update: 2020-01-22 02:13
Reporter: chengwei
Priority: normal
Severity: crash
Reproducibility: sometimes
Status: new
Resolution: open
Product Version: 7.6.1810
Target Version:
Fixed in Version:
Summary: 0016954: BUG: unable to handle kernel paging request at 0000006e6f697403 with RIP update_blocked_averages+0x8f/0x700
Description: We run CentOS 7.6 with kernel 3.10.0-957.el7.x86_64 (yes, this is not the latest 957 kernel in CentOS 7.6). The system crashes sometimes. I captured a kernel dump and found that the crash occurs at the following line of code:

kernel/sched/fair.c:5402

```
    struct sched_entity *se = tg->se[cpu];
```

The related dmesg output is below:

```
[ 2303.244233] BUG: unable to handle kernel paging request at 0000006e6f697403
[ 2303.244267] IP: [<ffffffff8b6dd56f>] update_blocked_averages+0x8f/0x700
[ 2303.244296] PGD 0
[ 2303.244314] Oops: 0000 [#1] SMP
[ 2303.244337] Modules linked in: fuse ipt_MASQUERADE nf_nat_masquerade_ipv4 nf_conntrack_netlink nfnetlink iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter xt_conntrack nf_nat nf_conntrack br_netfilter bridge dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio loop dm_mod 8021q garp mrp stp llc bonding sunrpc sb_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper cryptd iTCO_wdt iTCO_vendor_support mxm_wmi dcdbas joydev pcspkr lpc_ich mei_me mei ipmi_si sg ipmi_devintf ipmi_msghandler wmi acpi_power_meter ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm crct10dif_pclmul crct10dif_common
[ 2303.244900] crc32c_intel drm ahci ixgbe libahci igb drm_panel_orientation_quirks libata megaraid_sas mdio i2c_algo_bit ptp pps_core dca
[ 2303.244994] CPU: 18 PID: 0 Comm: swapper/18 Kdump: loaded Not tainted 3.10.0-957.el7.x86_64 #1
[ 2303.245017] Hardware name: Dell Inc. PowerEdge R730xd/0WCJNT, BIOS 2.5.5 08/16/2017
[ 2303.245297] task: ffff914fed5e2080 ti: ffff914fed000000 task.ti: ffff914fed000000
[ 2303.245860] RIP: 0010:[<ffffffff8b6dd56f>] [<ffffffff8b6dd56f>] update_blocked_averages+0x8f/0x700
[ 2303.246399] RSP: 0018:ffff915ace843de0 EFLAGS: 00010002
[ 2303.246723] RAX: 0000000000000012 RBX: ffff915abfeb20c0 RCX: 0000006e6f697373
[ 2303.247011] RDX: 0000001e00730102 RSI: 000000000000000c RDI: 0000000000000000
[ 2303.247280] RBP: ffff915ace843e48 R08: ffff915aca373e00 R09: 0000000000000000
[ 2303.247568] R10: 0000000000000000 R11: 000000000000bb91 R12: ffff91596c2fe600
[ 2303.247805] R13: ffff915ac7b8ac00 R14: ffff915ace85ab80 R15: ffff915ace85b3f0
[ 2303.248070] FS: 0000000000000000(0000) GS:ffff915ace840000(0000) knlGS:0000000000000000
[ 2303.248614] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2303.248848] CR2: 0000006e6f697403 CR3: 0000001190410000 CR4: 00000000003607e0
[ 2303.249101] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2303.249371] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2303.249628] Call Trace:
[ 2303.249855] <IRQ>
[ 2303.249865] [<ffffffff8b6e4e3d>] rebalance_domains+0x4d/0x2b0
[ 2303.250339] [<ffffffff8b6e51c2>] run_rebalance_domains+0x122/0x1e0
[ 2303.250631] [<ffffffff8b6a0f05>] __do_softirq+0xf5/0x280
[ 2303.250882] [<ffffffff8bd7832c>] call_softirq+0x1c/0x30
[ 2303.251151] [<ffffffff8b62e675>] do_softirq+0x65/0xa0
[ 2303.251419] [<ffffffff8b6a1285>] irq_exit+0x105/0x110
[ 2303.251691] [<ffffffff8bd796c8>] smp_apic_timer_interrupt+0x48/0x60
[ 2303.251959] [<ffffffff8bd75df2>] apic_timer_interrupt+0x162/0x170
[ 2303.252194] <EOI>
[ 2303.252224] [<ffffffff8bbadfb7>] ? cpuidle_enter_state+0x57/0xd0
[ 2303.252719] [<ffffffff8bbae10e>] cpuidle_idle_call+0xde/0x230
[ 2303.252956] [<ffffffff8b6366de>] arch_cpu_idle+0xe/0xc0
[ 2303.253193] [<ffffffff8b6fc3ba>] cpu_startup_entry+0x14a/0x1e0
[ 2303.253449] [<ffffffff8b657db7>] start_secondary+0x1f7/0x270
[ 2303.253736] [<ffffffff8b6000d5>] start_cpu+0x5/0x14
[ 2303.253983] Code: 48 39 c7 4c 8d a0 50 ff ff ff 0f 84 ab 01 00 00 0f 1f 40 00 49 8b 94 24 c0 00 00 00 49 63 86 30 09 00 00 48 8b 4a 40 48 8b 52 48 <48> 8b 1c c1 4c 8b 2c c2 e9 54 01 00 00 be 01 00 00 00 4c 89 ef
[ 2303.255117] RIP [<ffffffff8b6dd56f>] update_blocked_averages+0x8f/0x700
[ 2303.255374] RSP <ffff915ace843de0>
[ 2303.255637] CR2: 0000006e6f697403
```
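To compare register values across several dumps, the oops text can be parsed mechanically. The following is a minimal sketch, not part of any kernel tooling: the helper name `parse_oops_registers` and the regular expression are assumptions made for illustration, and the pattern only covers the plain `NAME: <16 hex digits>` fields for general-purpose and control registers seen in the log above.

```python
import re

# Matches fields such as "RAX: 0000000000000012" or "CR2: 0000006e6f697403".
# Segment-prefixed fields (e.g. "RSP: 0018:ffff...") are deliberately skipped.
_REG_RE = re.compile(r"\b(R[A-Z0-9]{2}|CR[0-9]):\s+([0-9a-f]{16})\b")


def parse_oops_registers(text):
    """Return a dict of register name -> integer value (hypothetical helper)."""
    return {name: int(value, 16) for name, value in _REG_RE.findall(text)}


if __name__ == "__main__":
    # Two lines copied from the dmesg output in this report.
    sample = """\
[ 2303.246723] RAX: 0000000000000012 RBX: ffff915abfeb20c0 RCX: 0000006e6f697373
[ 2303.248848] CR2: 0000006e6f697403 CR3: 0000001190410000 CR4: 00000000003607e0
"""
    for name, value in parse_oops_registers(sample).items():
        print(f"{name} = {value:#x}")
```

The resulting dictionary makes it easy to check values such as `RAX`, `RCX` and `CR2` against each other without copying them by hand.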

I used the crash utility to disassemble *update_blocked_averages+0x8f/0x700*, as shown below:

```
/usr/src/debug/kernel-3.10.0-957.el7/linux-3.10.0-957.el7.x86_64/kernel/sched/fair.c: 5402
0xffffffff8b6dd56f <update_blocked_averages+0x8f>: mov (%rcx,%rax,8),%rbx
```

With the registers `RAX: 0000000000000012 RBX: ffff915abfeb20c0 RCX: 0000006e6f697373`, the operand `(%rcx,%rax,8)` resolves to exactly the faulting address reported in CR2: `0000006e6f697373 + 8 * 0x12 = 0000006e6f697403`
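The address arithmetic above can be reproduced with a short script. This is only a sketch: the helper names `effective_address` and `ascii_view` are made up for illustration, and the reading of RCX as the value loaded for `tg->se` is an assumption based on the disassembled source line.

```python
def effective_address(base, index, scale=8):
    """Address accessed by an x86-64 operand of the form (base,index,scale)."""
    return (base + index * scale) & 0xFFFFFFFFFFFFFFFF


def ascii_view(value, width=8):
    """Render a value as little-endian bytes, non-printable bytes shown as '.'."""
    raw = value.to_bytes(width, "little")
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


rcx = 0x0000006E6F697373  # RCX from the oops (assumed: value loaded for tg->se)
rax = 0x12                # RAX from the oops (18, same as "CPU: 18" in the log)
cr2 = 0x0000006E6F697403  # faulting address reported in CR2

addr = effective_address(rcx, rax)
print(hex(addr), addr == cr2)   # computed address and whether it equals CR2
print(ascii_view(rcx))          # RCX bytes viewed as text
```

The computed address equals CR2, which confirms the calculation. Viewed as little-endian bytes, RCX reads as the ASCII text `ssion`, which may hint that the pointer was overwritten with string data. That is an observation about the value only, not a confirmed diagnosis.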

But I don't know why this happened or how to fix it.
Tags: No tags attached.

Activities

TrevorH

2020-01-21 11:48

manager   ~0036082

Only the latest version is supported. Please yum update to 7.7 and kernel 3.10.0-1062.9.1.el7 and retest.
chengwei

2020-01-22 02:13

reporter   ~0036087

OK, we'll test on the latest CentOS 7.7 kernel.

Issue History

Date Modified Username Field Change
2020-01-21 09:24 chengwei New Issue
2020-01-21 11:48 TrevorH Note Added: 0036082
2020-01-22 02:13 chengwei Note Added: 0036087