View Issue Details

ID: 0012590
Project: CentOS-7
Category: kernel
View Status: public
Last Update: 2017-01-11 08:55
Reporter: jekader
Priority: normal
Severity: crash
Reproducibility: sometimes
Status: new
Resolution: open
Platform: ppc64le
OS: CentOS
OS Version: 7.3
Product Version: 7.3.1611
Target Version:
Fixed in Version:
Summary: 0012590: kernel crash due to KSM run on 7.3
Description: We witnessed a crash of an oVirt hypervisor running on the ppc64le platform after upgrading it to CentOS 7.3:


[ 2368.545615] Unable to handle kernel paging request for data at address 0x00000000
[ 2368.545622] Faulting instruction address: 0xc0000000002dcc10
[ 2368.545626] Oops: Kernel access of bad area, sig: 11 [#1]
[ 2368.545628] SMP NR_CPUS=2048 NUMA PowerNV
[ 2368.545632] Modules linked in: vhost_net vhost macvtap macvlan ebt_arp ebtable_nat tun ebtable_filter ebtables ip6table_filter ip6_tables scsi_transport_iscsi xt_physdev br_netfilter ipt_REJECT nf_reject_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 xt_multiport kvm_hv kvm xt_conntrack nf_conntrack iptable_filter softdog ext4 mbcache jbd2 ses enclosure scsi_transport_sas sg shpchp rtc_opal powernv_rng nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c dm_service_time sd_mod sr_mod cdrom lpfc dm_multipath ipr libata crc_t10dif tg3 crct10dif_generic scsi_transport_fc ptp scsi_tgt pps_core crct10dif_common dm_mirror dm_region_hash dm_log dm_mod 8021q garp mrp bridge stp llc bonding
[ 2368.545673] CPU: 56 PID: 423 Comm: ksmd Not tainted 3.10.0-514.2.2.el7.ppc64le #1
[ 2368.545676] task: c000000fe580d3e0 ti: c000000fe58a8000 task.ti: c000000fe58a8000
[ 2368.545679] NIP: c0000000002dcc10 LR: c0000000002dcbf8 CTR: 0000000000000000
[ 2368.545682] REGS: c000000fe58ab920 TRAP: 0300 Not tainted (3.10.0-514.2.2.el7.ppc64le)
[ 2368.545684] MSR: 9000000100009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 28002042 XER: 20000000
[ 2368.545692] CFAR: c000000000009368 DAR: 0000000000000000 DSISR: 42000000 SOFTE: 1
GPR00: c0000000002dcbf8 c000000fe58abba0 c0000000011a7c00 f000000002e5e908
GPR04: f000000002e5e908 0000000000000000 0000000000000000 0000000000000000
GPR08: c000000004553190 0000000000000000 0000000000000000 0000000000000000
GPR12: 0000000000002200 c000000007b5f800 c000000d88c51300 f00000000612ba98
GPR16: c000001cce95ef40 c000000fd9f0b008 6db6db6db6db6db7 c000000fe58a8000
GPR20: c000000d882ec280 1000000000000000 c000001bc3550000 c000000fe58a8000
GPR24: 0000000000000000 c000000d88efb488 fffffffffffff000 c00000000155b2f0
GPR28: c0000000010da360 f000000000000000 0000000000000001 0000000000000000
[ 2368.545726] NIP [c0000000002dcc10] ksm_do_scan+0xfb0/0x1c80
[ 2368.545729] LR [c0000000002dcbf8] ksm_do_scan+0xf98/0x1c80
[ 2368.545731] Call Trace:
[ 2368.545734] [c000000fe58abba0] [c0000000002dcbf8] ksm_do_scan+0xf98/0x1c80 (unreliable)
[ 2368.545738] [c000000fe58abce0] [c0000000002dda10] ksm_scan_thread+0x130/0x330
[ 2368.545741] [c000000fe58abd80] [c0000000001146ec] kthread+0xec/0x100
[ 2368.545745] [c000000fe58abe30] [c00000000000a47c] ret_from_kernel_thread+0x5c/0xe0
[ 2368.545748] Instruction dump:
[ 2368.545749] 60000000 4bfffa80 79240764 4bfffcec e8610020 4bf8c6f5 60000000 7faea040
[ 2368.545754] 419e012c e9340008 e9540010 2fa90000 <f92a0000> 419e0008 f9490008 e93b0000
[ 2368.545762] ---[ end trace 6f3f1790cfe9a4ea ]---
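
As a reading aid for the instruction dump above: on powerpc the word in angle brackets is the instruction at NIP, i.e. the one that faulted. The following is a minimal sketch, not a full disassembler, that decodes only the DS-form 64-bit load/store words (`ld`, `std` and their variants) that appear around the fault. The function name `decode_ds_form` is made up for this illustration. Under that decoding, `<f92a0000>` reads as a store through r10, which matches DAR and GPR10 both being zero in the register dump; what that means for the KSM code is left for someone with the matching debuginfo to confirm.

```python
# Hypothetical helper: decode PowerPC DS-form doubleword loads/stores only.
# Primary opcode is the top 6 bits; the low 2 bits select the variant.
DS_FORM_NAMES = {
    (58, 0): "ld",
    (58, 1): "ldu",
    (58, 2): "lwa",
    (62, 0): "std",
    (62, 1): "stdu",
}


def decode_ds_form(word):
    """Return 'mnemonic rT,disp(rA)' for a DS-form word, or None otherwise."""
    opcode = (word >> 26) & 0x3F
    xo = word & 0x3
    name = DS_FORM_NAMES.get((opcode, xo))
    if name is None:
        return None  # not an instruction this sketch understands
    rt = (word >> 21) & 0x1F          # source/target register
    ra = (word >> 16) & 0x1F          # base address register
    disp = word & 0xFFFC              # displacement, low 2 bits are the XO
    if disp & 0x8000:                 # sign-extend the 16-bit displacement
        disp -= 0x10000
    return f"{name} r{rt},{disp}(r{ra})"


if __name__ == "__main__":
    # Words copied from the second line of the instruction dump.
    dump = "419e012c e9340008 e9540010 2fa90000 <f92a0000> 419e0008 f9490008 e93b0000"
    for token in dump.split():
        faulting = token.startswith("<")
        word = int(token.strip("<>"), 16)
        text = decode_ds_form(word) or "(not decoded by this sketch)"
        print(f"{word:08x}  {text}{'   <-- NIP' if faulting else ''}")
```

Words with other primary opcodes (branches, compares, `nop`) are deliberately reported as not decoded rather than guessed at.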
Steps To Reproduce: In our case the setup is as follows:
 IBM S812L (8247-21L)
 kernel-3.10.0-514.2.2.el7.ppc64le
 vdsm-4.18.999-1020.git1ff41b1.el7.centos.ppc64le

14 VMs were running with 8 GB RAM each; I was upgrading them from CentOS 7.2 to 7.3 when this happened.

For now we've disabled KSM on the oVirt side to make the environment more stable.
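
For anyone wanting to confirm on the host that KSM is really off, the kernel exposes its state in `/sys/kernel/mm/ksm/run` (0 = stopped, 1 = running, 2 = stopped and unmerging all shared pages). The snippet below is a small sketch that only reads that file; the function name `ksm_state` and its `sysfs_dir` parameter are made up for this example, and it says nothing about how oVirt itself applies the setting.

```python
from pathlib import Path

# Meaning of the values in /sys/kernel/mm/ksm/run per the kernel KSM docs.
KSM_RUN_STATES = {
    0: "stopped",
    1: "running",
    2: "stopped, unmerging all pages",
}


def ksm_state(sysfs_dir="/sys/kernel/mm/ksm"):
    """Read the KSM 'run' control file and describe its value.

    sysfs_dir is a parameter only so the function can be pointed at a
    test directory; on a real host the default path is the one to use.
    """
    raw = Path(sysfs_dir, "run").read_text().strip()
    return KSM_RUN_STATES.get(int(raw), f"unknown ({raw})")


if __name__ == "__main__":
    print("KSM is", ksm_state())
```

The companion file `pages_shared` in the same directory shows how many pages are currently merged, which should drop to 0 once unmerging has finished.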
Tags: kerneloops
abrt_hash:
URL:

Activities

jekader

2017-01-11 08:55

reporter   ~0028324

I wanted to look at the vmcore more closely, but I could not find debuginfo packages for ppc64le. Is there an official way of getting them? The repo documented in [1] doesn't have them, even though it does have aarch64 packages [2].
I couldn't find the package on CentOS Koji either.

Regards,
Evgheni Dereveanchin

[1] https://wiki.centos.org/AdditionalResources/Repositories/DebugInfo
[2] http://debuginfo.centos.org/7/

Issue History

Date Modified Username Field Change
2017-01-06 15:08 jekader New Issue
2017-01-06 15:08 jekader Tag Attached: kerneloops
2017-01-11 08:55 jekader Note Added: 0028324