View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0015468 | CentOS-6 | kernel | public | 2018-11-15 16:23 | 2019-02-20 22:02 |
Reporter | Oscar Santos
Priority | high | Severity | crash | Reproducibility | always
Status | new | Resolution | open
Platform | Dell R930 | OS | | OS Version |
Product Version | 6.9
Target Version | | Fixed in Version |
Summary | 0015468: Kernel panic - srp_long_wq/2 after multipath failure and ib_srp reconnect
Description |

# cat /etc/centos-release
CentOS release 6.9 (Final)
# uname -r
2.6.32-696.23.1.el6.x86_64

The server incurs a kernel panic after an SRP abort that is followed by an ib_srp reconnect. The SRP events are typically attributed to loss of InfiniBand connectivity to the NetApp disk storage, which is configured with multipathing. Excuse the varying formats of the events; excerpts from several logs are used to paint a picture of the problem.

Path to disk storage compromised and 'hard LOCKUP' reported:

<4>[161223.525288] device-mapper: multipath: Failing path 133:112.
<4>[161223.525294] device-mapper: multipath: Failing path 133:176.
<3>[161257.723670] scsi host34: ib_srp: failed receive status 5 for iu ffff88604562aec0
<3>[161267.731688] scsi host34: SRP abort called
<6>[161267.773814] scsi host34: ib_srp: reconnect succeeded
<0>[161317.053857] Watchdog detected hard LOCKUP on cpu 2
<4>[161317.059126] Modules linked in: iptable_filter ip_tables VSMqfs(P)(U) ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_sa ib_mad mlx5_ib ib_core ib_addr dm_round_robin scsi_dh_rdac dm_multipath ipmi_devintf iTCO_wdt iTCO_vendor_support wmi power_meter acpi_ipmi ipmi_si ipmi_msghandler ch osst st mpt3sas sb_edac edac_core i2c_i801 sr_mod cdrom lpc_ich mfd_core shpchp qla2xxx scsi_transport_fc scsi_tgt nfs lockd fscache auth_rpcgss nfs_acl sunrpc hpilo udf crc_itu_t vfat fat bridge ipv6 stp llc xfs exportfs isci libsas qla3xxx hpsa e1000e ixgbe igb dca i2c_algo_bit i2c_core aacraid aic79xx ata_piix mpt2sas raid_class mptspi scsi_transport_spi mptsas mptscsih mptbase scsi_transport_sas arcmsr sata_nv sata_svw bnx2x libcrc32c mdio bnx2 ext4 jbd2 mbcache sata_sil e1000 tun sg sd_mod crc_t10dif mlx5_core ahci megaraid_sas tg3 ptp pps_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
<4>[161317.059415] Pid: 9504, comm: srp_long_wq/2 Tainted: P -- ------------ 2.6.32-696.23.1.el6.x86_64 #1
<4>[161317.059417] Call Trace:
<4>[161317.059419] <NMI> [<ffffffff810f5131>] ? watchdog_overflow_callback+0xf1/0x110
<4>[161317.059452] [<ffffffff8112e31a>] ? __perf_event_overflow+0xaa/0x240
<4>[161317.059457] [<ffffffff8101d188>] ? x86_perf_event_set_period+0xf8/0x180

Another instance of the kernel panic, with a more complete call trace:

2018-05-09 10:24:28 [573163.155722] Pid: 9454, comm: srp_long_wq/1 Not tainted 2.6.32-696.23.1.el6.x86_64 #1
2018-05-09 10:24:28 [573163.164488] Call Trace:
2018-05-09 10:24:28 [573163.167326] <NMI> [<ffffffff81552643>] ? panic+0xa7/0x18b
2018-05-09 10:24:28 [573163.173682] [<ffffffff8100fac5>] ? show_trace+0x15/0x20
2018-05-09 10:24:28 [573163.179725] [<ffffffff810f5150>] ? watchdog_timer_fn+0x0/0x230
2018-05-09 10:24:28 [573163.186450] [<ffffffff8112e31a>] ? __perf_event_overflow+0xaa/0x240
2018-05-09 10:24:28 [573163.193658] [<ffffffff8101d188>] ? x86_perf_event_set_period+0xf8/0x180
2018-05-09 10:24:28 [573163.201257] [<ffffffff8112e974>] ? perf_event_overflow+0x14/0x20
2018-05-09 10:24:28 [573163.208175] [<ffffffff810246ec>] ? intel_pmu_handle_irq+0x21c/0x4a0
2018-05-09 10:24:28 [573163.215388] [<ffffffff815570f7>] ? save_paranoid+0xf7/0x110
2018-05-09 10:24:28 [573163.221822] [<ffffffff8155883f>] ? perf_event_nmi_handler+0x3f/0xb0
2018-05-09 10:24:28 [573163.229030] [<ffffffff8155a350>] ? notifier_call_chain+0x50/0x80
2018-05-09 10:24:28 [573163.235949] [<ffffffff815570eb>] ? save_paranoid+0xeb/0x110
2018-05-09 10:24:28 [573163.242383] [<ffffffff8155a3ba>] ? atomic_notifier_call_chain+0x1a/0x20
2018-05-09 10:24:28 [573163.249983] [<ffffffff810aeede>] ? notify_die+0x2e/0x30
2018-05-09 10:24:28 [573163.256027] [<ffffffff81557ea9>] ? do_nmi+0xd9/0x360
2018-05-09 10:24:28 [573163.261779] [<ffffffff815577a1>] ? nmi+0x131/0x190
2018-05-09 10:24:28 [573163.267336] [<ffffffff815562df>] ? _spin_lock_irqsave+0x2f/0x40
2018-05-09 10:24:28 [573163.274409] <<EOE>> [<ffffffff8139faa0>] ? __scsi_iterate_devices+0x30/0xa0
Steps To Reproduce |

Tied to the 5/9 call trace: port 16 (the storage port) cycles between down and up states:

May 9 10:20:16 IBSW2 portd[5115]: Interface IB 1/16 changed state to DOWN
May 9 10:21:04 IBSW2 portd[5115]: Interface IB 1/16 changed state to UP
May 9 10:21:06 IBSW2 portd[5115]: Interface IB 1/16 changed state to DOWN
May 9 10:21:56 IBSW2 portd[5115]: Interface IB 1/16 changed state to UP

Server multipath ports go down after the panic:

May 9 10:25:44 IBSW2 portd[5115]: Interface IB 1/4 changed state to DOWN
May 9 10:25:44 IBSW2 portd[5115]: Interface IB 1/5 changed state to DOWN

Server multipath ports go up (recover) post panic:

May 9 10:32:52 IBSW2 portd[5115]: Interface IB 1/4 changed state to UP
May 9 10:32:53 IBSW2 portd[5115]: Interface IB 1/5 changed state to UP
Additional Information |

Initial assessment by internal R&D staff:

There are two running processes. One is the kthread "srp_long_wq/2", pid 9504:

#11 [ffff8840f0c09f50] nmi at ffffffff815577a1
    [exception RIP: _spin_lock_irqsave+47]
    RIP: ffffffff815562df  RSP: ffff88203bb6fca0  RFLAGS: 00000097
    RAX: 000000000000858b  RBX: ffff886031f11c00  RCX: 000000000000858a
    RDX: 0000000000000246  RSI: 0000000000000000  RDI: ffff886031843050
    RBP: ffff88203bb6fca0  R8:  dead000000200200  R9:  0000000000000000
    R10: 0000000000000005  R11: 000000000000008c  R12: ffff886031843000
    R13: 0000000000000000  R14: ffff886031843000  R15: ffff886031843180
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
--- <NMI exception stack> ---
#12 [ffff88203bb6fca0] _spin_lock_irqsave at ffffffff815562df
#13 [ffff88203bb6fca8] __scsi_iterate_devices at ffffffff8139faa0
#14 [ffff88203bb6fce8] starget_for_each_device at ffffffff8139fb5e
#15 [ffff88203bb6fd28] target_unblock at ffffffff813a7b98
#16 [ffff88203bb6fd48] device_for_each_child at ffffffff8138559e
#17 [ffff88203bb6fd88] __scsi_target_unblock at ffffffff813a7c60
#18 [ffff88203bb6fdb8] srp_reconnect_rport at ffffffffa0cab3dc [scsi_transport_srp]
#19 [ffff88203bb6fe08] srp_reconnect_work at ffffffffa0cab50c [scsi_transport_srp]
#20 [ffff88203bb6fe38] worker_thread at ffffffff810a1d32

The other is an sfind task, pid 181252, which seems unrelated to the problem.

The kthread is spinning on a scsi_host->host_lock in %rdi, ffff886031843050:

crash_x86_64> spinlock_t -x ffff886031843050
struct spinlock_t {
  raw_lock = {
    slock = 0x858c858a

The spinlock value indicates that something holds the lock and something else is trying to acquire it. The current holder has "ticket" 0x858a, the task trying to acquire the lock is waiting for ticket 0x858b, and the next acquirer will get ticket 0x858c. The srp kthread's %rax holds the ticket it's waiting for, 0x858b. %rcx holds the thread's view of the current ticket of the lock, 0x858a, which matches the contents of memory. (A small decoder sketch after this field walks through the same arithmetic.)

A spinlock should only be held by a running task. The other running task, sfind pid 181252, does not appear by its stack (shown below) to be holding the spinlock. The three most likely explanations are:

1. Some task has incorrectly called schedule() while holding the spinlock.
2. Some task has incorrectly returned while holding the spinlock, when it should have released the lock before returning. It could be a task that's now running, sleeping, or exited.
3. The srp kthread already holds the spinlock when it tries to acquire it here.

For the first case (schedule()), the culprit would most likely be a task operating on the Scsi_Host object whose lock is at issue. Judging by the stack traces, these are the ones I'd look at:

PID     PPID   CPU  TASK              ST  %MEM  VSZ      RSS    COMM
12114   2      6    ffff88402e829520  UN  0.0   0        0      [scsi_eh_34]
185783  28626  1    ffff887a4fa51520  UN  0.0   2669040  69068  cmd
10861   10849  6    ffff88804e5c0040  UN  0.0   122304   1564   srp_daemon
10864   10849  6    ffff88804e595520  UN  0.0   122304   1564   srp_daemon

There are also these "sam"/"VSMqfs" related tasks. If the VSMqfs code operates directly on Scsi_Host objects, I'd be suspicious of it. If it doesn't, then these are likely unrelated.

PID     PPID   CPU  TASK              ST  %MEM  VSZ      RSS    COMM
29688   1      7    ffff88204ccaaab0  IN  0.0   291288   3112   sam-fsd
30508   29688  8    ffff88604688e040  IN  0.0   24208    972    sam-stagealld
30516   30506  50   ffff888016ab9520  IN  0.0   552176   91656  sam-arfind
30564   2      1    ffff886047ce2040  UN  0.0   0        0      [sam_statfs_thre]
30565   30512  13   ffff88203ba7cab0  IN  0.0   381876   4676   sam-arfind
30576   29688  22   ffff88804ec34040  IN  0.0   450344   4712   sam-amld
30690   29688  6    ffff888047b73520  IN  0.0   597880   15860  sam-stagerd
47588   29688  2    ffff8860323c2ab0  IN  0.0   502452   6604   sam-sharefsd
47631   29688  12   ffff88604548eab0  IN  0.0   502452   6604   sam-sharefsd

What is VSMqfs? I can't find any information on it.

(3rd-party partner's response regarding "What is VSMqfs": We don't operate on the SCSI host object directly in the file system, only through the block device interfaces. The sam_statfs_thread looks like it's doing something because of the sfind that is running. Likely this was a path failure on the metadata device which blocked our statfs thread and possibly blocked the sfind.)

For the third case (srp kthread already holds the lock), one would need to investigate the functions on the stack of the srp kthread. For the second case (task returned while holding the lock), the only way to figure it out would be code inspection of all paths in the kernel and in the loaded kernel modules that acquire ->host_lock. For any of these cases, the exact source code for the kernel and kernel modules would be needed.

The other running task was sfind, pid 181252. I think it's unrelated, as the code on its stack doesn't appear to come anywhere near a Scsi_Host object:

[exception RIP: inode_has_perm+68]
    RIP: ffffffff81247cf4  RSP: ffff8856acb2b968  RFLAGS: 00000282
    RAX: 0000000000000007  RBX: ffff8856acb2bc18  RCX: ffff8856acb2ba08
    RDX: 0000000000200000  RSI: ffff881ffaaddac8  RDI: ffff886046a8b780
    RBP: ffff8856acb2b9f8  R8:  ffff8856acb2ba08  R9:  0000000000000001
    R10: ffff880a18a05228  R11: ffff8856acb2ba08  R12: ffff886a405dc035
    R13: ffff881ffaaddac8  R14: 0000000000000000  R15: ffff8860329a9520
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
--- <NMI exception stack> ---
#6  [ffff8856acb2b968] inode_has_perm at ffffffff81247cf4
#7  [ffff8856acb2ba00] selinux_inode_permission at ffffffff8124850a
#8  [ffff8856acb2baa0] security_inode_permission at ffffffff8123f7f5
#9  [ffff8856acb2bab0] __link_path_walk at ffffffff811ad1ed
#10 [ffff8856acb2bb90] path_walk at ffffffff811ae3ea
#11 [ffff8856acb2bbd0] filename_lookup at ffffffff811ae5fb
#12 [ffff8856acb2bc10] user_path_at at ffffffff811af727
#13 [ffff8856acb2bce0] stat_file_common at ffffffffa0cdc51f [VSMqfs]
#14 [ffff8856acb2bd40] sam_syscall at ffffffffa0cdc7b8 [VSMqfs]
#15 [ffff8856acb2bdf0] samsys_ioctl at ffffffffa0cf153b [VSMqfs]
#16 [ffff8856acb2be30] vfs_ioctl at ffffffff811b2779
#17 [ffff8856acb2be70] do_vfs_ioctl at ffffffff811b2934
#18 [ffff8856acb2bf00] sys_ioctl at ffffffff811b2ec1
#19 [ffff8856acb2bf50] system_call_fastpath at ffffffff8155e351
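Decoder sketch for the ticket arithmetic above: a minimal, stand-alone user-space program (not kernel code) that decodes the slock value taken from the vmcore. It assumes the 2.6.32-era x86 ticket-spinlock layout in which the 32-bit slock word packs two 16-bit tickets, the low half being the ticket currently served (the holder) and the high half being the next ticket to be handed out.

/* Illustrative decode of the host_lock value captured in the vmcore.
 * Assumption: 2.6.32-era x86 ticket spinlock layout, low 16 bits = ticket
 * currently being served, high 16 bits = next ticket to hand out.
 */
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint32_t slock = 0x858c858a;              /* raw_lock.slock from crash */
    uint16_t owner = slock & 0xffff;          /* ticket now being served   */
    uint16_t next  = slock >> 16;             /* next ticket to hand out   */

    printf("holder's ticket : 0x%04x\n", owner);                    /* 0x858a */
    printf("next free ticket: 0x%04x\n", next);                     /* 0x858c */
    printf("tickets waiting : %u\n", (uint16_t)(next - owner - 1)); /* 1      */
    return 0;
}

The single outstanding ticket, 0x858b, matches the value in the srp kthread's %rax, consistent with exactly one waiter (the srp_long_wq kthread) spinning on a lock whose holder cannot be found among the running tasks.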
Tags | 2.6.32-696.23.1.el6.x86_64 | ||||
Note: the same call trace is exhibited in https://bugs.openfabrics.org/show_bug.cgi?id=2602. As in this CentOS case, multipath is compromised when the panic is incurred. The difference is that the OpenFabrics bug was reported with OFED on RHEL 6.6, while this CentOS bug occurs with MOFED.
Date Modified | Username | Field | Change |
---|---|---|---|
2018-11-15 16:23 | Oscar Santos | New Issue | |
2018-11-15 16:23 | Oscar Santos | Tag Attached: 2.6.32-696.23.1.el6.x86_64 | |
2019-02-20 22:02 | Oscar Santos | Note Added: 0033870 |