View Issue Details

ID: 0015468
Project: CentOS-6
Category: kernel
View Status: public
Last Update: 2019-02-20 22:02
Reporter: Oscar Santos
Priority: high
Severity: crash
Reproducibility: always
Status: new
Resolution: open
Platform: Dell R930
OS:
OS Version:
Product Version: 6.9
Target Version:
Fixed in Version:
Summary: 0015468: Kernel panic - srp_long_wq/2 after multipath failure and ib_srp reconnect
Description

# cat /etc/centos-release
CentOS release 6.9 (Final)
# uname -r
2.6.32-696.23.1.el6.x86_64


The server incurs a kernel panic after an SRP abort that is followed by an ib_srp reconnect. The SRP events are typically attributed to loss of InfiniBand connectivity to the NetApp disk storage, which is set up with multipathing.

Excuse the varying formats of the events below; excerpts from several logs are used to paint a picture of the problem:

Path to disk storage compromised and 'hard LOCKUP' reported:
<4>[161223.525288] device-mapper: multipath: Failing path 133:112.
 <4>[161223.525294] device-mapper: multipath: Failing path 133:176.
 <3>[161257.723670] scsi host34: ib_srp: failed receive status 5 for iu ffff88604562aec0
 <3>[161267.731688] scsi host34: SRP abort called
 <6>[161267.773814] scsi host34: ib_srp: reconnect succeeded
 <0>[161317.053857] Watchdog detected hard LOCKUP on cpu 2
<4>[161317.059126] Modules linked in: iptable_filter ip_tables VSMqfs(P)(U) ib_srp scsi_transport_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm ib_sa ib_mad mlx5_ib ib_core ib_addr dm_round_robin scsi_dh_rdac dm_multipath ipmi_devintf iTCO_wdt iTCO_vendor_support wmi power_meter acpi_ipmi ipmi_si ipmi_msghandler ch osst st mpt3sas sb_edac edac_core i2c_i801 sr_mod cdrom lpc_ich mfd_core shpchp qla2xxx scsi_transport_fc scsi_tgt nfs lockd fscache auth_rpcgss nfs_acl sunrpc hpilo udf crc_itu_t vfat fat bridge ipv6 stp llc xfs exportfs isci libsas qla3xxx hpsa e1000e ixgbe igb dca i2c_algo_bit i2c_core aacraid aic79xx ata_piix mpt2sas raid_class mptspi scsi_transport_spi mptsas mptscsih mptbase scsi_transport_sas arcmsr sata_nv sata_svw bnx2x libcrc32c mdio bnx2 ext4 jbd2 mbcache sata_sil e1000 tun sg sd_mod crc_t10dif mlx5_core ahci megaraid_sas tg3 ptp pps_core dm_mirror dm_region_hash dm_log dm_mod [last unloaded: scsi_wait_scan]
 <4>[161317.059415] Pid: 9504, comm: srp_long_wq/2 Tainted: P – ------------ 2.6.32-696.23.1.el6.x86_64 #1
 <4>[161317.059417] Call Trace:
 <4>[161317.059419] <NMI> [<ffffffff810f5131>] ? watchdog_overflow_callback+0xf1/0x110
 <4>[161317.059452] [<ffffffff8112e31a>] ? __perf_event_overflow+0xaa/0x240
 <4>[161317.059457] [<ffffffff8101d188>] ? x86_perf_event_set_period+0xf8/0x180

Another instance of the kernel panic, with a more complete call trace:
2018-05-09 10:24:28 [573163.155722] Pid: 9454, comm: srp_long_wq/1 Not tainted 2.6.32-696.23.1.el6.x86_64 #1
2018-05-09 10:24:28 [573163.164488] Call Trace:
2018-05-09 10:24:28 [573163.167326] <NMI> [<ffffffff81552643>] ? panic+0xa7/0x18b
2018-05-09 10:24:28 [573163.173682] [<ffffffff8100fac5>] ? show_trace+0x15/0x20
2018-05-09 10:24:28 [573163.179725] [<ffffffff810f5150>] ? watchdog_timer_fn+0x0/0x230
2018-05-09 10:24:28 [573163.186450] [<ffffffff8112e31a>] ? __perf_event_overflow+0xaa/0x240
2018-05-09 10:24:28 [573163.193658] [<ffffffff8101d188>] ? x86_perf_event_set_period+0xf8/0x180
2018-05-09 10:24:28 [573163.201257] [<ffffffff8112e974>] ? perf_event_overflow+0x14/0x20
2018-05-09 10:24:28 [573163.208175] [<ffffffff810246ec>] ? intel_pmu_handle_irq+0x21c/0x4a0
2018-05-09 10:24:28 [573163.215388] [<ffffffff815570f7>] ? save_paranoid+0xf7/0x110
2018-05-09 10:24:28 [573163.221822] [<ffffffff8155883f>] ? perf_event_nmi_handler+0x3f/0xb0
2018-05-09 10:24:28 [573163.229030] [<ffffffff8155a350>] ? notifier_call_chain+0x50/0x80
2018-05-09 10:24:28 [573163.235949] [<ffffffff815570eb>] ? save_paranoid+0xeb/0x110
2018-05-09 10:24:28 [573163.242383] [<ffffffff8155a3ba>] ? atomic_notifier_call_chain+0x1a/0x20
2018-05-09 10:24:28 [573163.249983] [<ffffffff810aeede>] ? notify_die+0x2e/0x30
2018-05-09 10:24:28 [573163.256027] [<ffffffff81557ea9>] ? do_nmi+0xd9/0x360
2018-05-09 10:24:28 [573163.261779] [<ffffffff815577a1>] ? nmi+0x131/0x190
2018-05-09 10:24:28 [573163.267336] [<ffffffff815562df>] ? _spin_lock_irqsave+0x2f/0x40
2018-05-09 10:24:28 [573163.274409] <<EOE>> [<ffffffff8139faa0>] ? __scsi_iterate_devices+0x30/0xa0
Steps To Reproduce

Tied to the 5/9 call trace: port 16 (the storage port) cycles between down and up states:
May 9 10:20:16 IBSW2 portd[5115]: Interface IB 1/16 changed state to DOWN
May 9 10:21:04 IBSW2 portd[5115]: Interface IB 1/16 changed state to UP
May 9 10:21:06 IBSW2 portd[5115]: Interface IB 1/16 changed state to DOWN
May 9 10:21:56 IBSW2 portd[5115]: Interface IB 1/16 changed state to UP

Server multi-path ports go down after panic:
May 9 10:25:44 IBSW2 portd[5115]: Interface IB 1/4 changed state to DOWN
May 9 10:25:44 IBSW2 portd[5115]: Interface IB 1/5 changed state to DOWN

Server multi-path ports go up (recover) post panic:
May 9 10:32:52 IBSW2 portd[5115]: Interface IB 1/4 changed state to UP
May 9 10:32:53 IBSW2 portd[5115]: Interface IB 1/5 changed state to UP
Additional Information

Initial assessment by internal R&D staff:

There are two running processes.

One is the kthread "srp_long_wq/2", pid 9504

#11 [ffff8840f0c09f50] nmi at ffffffff815577a1
    [exception RIP: _spin_lock_irqsave+47]
    RIP: ffffffff815562df RSP: ffff88203bb6fca0 RFLAGS: 00000097
    RAX: 000000000000858b RBX: ffff886031f11c00 RCX: 000000000000858a
    RDX: 0000000000000246 RSI: 0000000000000000 RDI: ffff886031843050
    RBP: ffff88203bb6fca0 R8: dead000000200200 R9: 0000000000000000
    R10: 0000000000000005 R11: 000000000000008c R12: ffff886031843000
    R13: 0000000000000000 R14: ffff886031843000 R15: ffff886031843180
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
 
--- <NMI exception stack> ---
 
#12 [ffff88203bb6fca0] _spin_lock_irqsave at ffffffff815562df
#13 [ffff88203bb6fca8] __scsi_iterate_devices at ffffffff8139faa0
#14 [ffff88203bb6fce8] starget_for_each_device at ffffffff8139fb5e
#15 [ffff88203bb6fd28] target_unblock at ffffffff813a7b98
#16 [ffff88203bb6fd48] device_for_each_child at ffffffff8138559e
#17 [ffff88203bb6fd88] __scsi_target_unblock at ffffffff813a7c60
#18 [ffff88203bb6fdb8] srp_reconnect_rport at ffffffffa0cab3dc [scsi_transport_srp]
#19 [ffff88203bb6fe08] srp_reconnect_work at ffffffffa0cab50c [scsi_transport_srp]
#20 [ffff88203bb6fe38] worker_thread at ffffffff810a1d32

The other is an sfind task, pid 181252, which seems unrelated to the problem.

The kthread is spinning on a scsi_host->host_lock whose address is in %rdi, ffff886031843050.
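
Frames #12 and #13 show the lock being taken from __scsi_iterate_devices. For
reference, a simplified sketch of that helper, based on mainline 2.6.32-era
code rather than the exact RHEL/CentOS 6 source, shows where host_lock is
acquired on this path:

/* Simplified sketch of __scsi_iterate_devices() (mainline 2.6.32-era code,
 * not the exact distro source): host_lock protects the host's device list
 * while it is walked. */
struct scsi_device *__scsi_iterate_devices(struct Scsi_Host *shost,
                                           struct scsi_device *prev)
{
        struct list_head *list = (prev ? &prev->siblings : &shost->__devices);
        struct scsi_device *next = NULL;
        unsigned long flags;

        spin_lock_irqsave(shost->host_lock, flags);
        while (list->next != &shost->__devices) {
                next = list_entry(list->next, struct scsi_device, siblings);
                /* skip devices we cannot take a reference on */
                if (!scsi_device_get(next))
                        break;
                next = NULL;
                list = list->next;
        }
        spin_unlock_irqrestore(shost->host_lock, flags);

        if (prev)
                scsi_device_put(prev);
        return next;
}

The lock is released before this helper returns, so the callback driven by
starget_for_each_device() runs outside it; the spin seen here can only hang if
the current holder never releases the lock.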

crash_x86_64> spinlock_t -x ffff886031843050
struct spinlock_t {
  raw_lock = {
    slock = 0x858c858a
  }
}

The spinlock value indicates that something holds it, and something is
 trying to acquire it. The current holder has "ticket" 0x858a, the
 task trying to acquire it is waiting for ticket 0x858b, and the next
 acquirer will get ticket 0x858c.

The srp kthread's %rax holds the ticket it's waiting for, 0x858b.
 %rcx holds the thread's view of the current ticket of the lock,
 0x858a, which matches the contents of memory.
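
To make the ticket arithmetic concrete, a small user-space decoder for that
slock word (assuming the 32-bit layout with 16-bit tickets, "next" in the high
half and "owner" in the low half, which is the layout this kernel family uses
when built for more than 256 CPUs, as the CentOS 6 x86_64 kernel is) would be:

#include <stdio.h>
#include <stdint.h>

/* Sketch only: decode the ticket-spinlock word read from the vmcore.
 * Assumes 16-bit tickets: "next" in the high half, "owner" in the low half. */
int main(void)
{
        uint32_t slock = 0x858c858a;      /* from crash: spinlock_t -x */
        uint16_t owner = slock & 0xffff;  /* ticket currently being served */
        uint16_t next  = slock >> 16;     /* ticket the next acquirer gets */
        unsigned int outstanding = (uint16_t)(next - owner); /* holder + waiters */

        printf("owner ticket : 0x%04x\n", (unsigned)owner);  /* 0x858a */
        printf("next ticket  : 0x%04x\n", (unsigned)next);   /* 0x858c */
        printf("waiters      : %u\n", outstanding - 1);      /* 1, i.e. ticket 0x858b */
        return 0;
}

Running it prints owner 0x858a, next 0x858c, and one waiter (ticket 0x858b),
matching the reading above.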

A spinlock should only be held by a running task.

The other running task, sfind (pid 181252), does not appear, judging by its stack (shown further below), to be holding the spinlock.

The three most likely explanations are:

1. Some task has incorrectly called schedule() while holding the spinlock.

2. Some task has incorrectly returned while holding the spinlock, when
 it should have released the lock before returning. It could be a task
 that's now running, sleeping, or exited.

3. The srp kthread already holds the spinlock when it tries to acquire
 it here.

For the first case (schedule()), it'd most likely be a task that's operating on the Scsi_Host object whose lock is at issue. Judging by the stack traces, these are the ones I'd look at:


   PID PPID CPU TASK ST %MEM VSZ RSS COMM
  12114 2 6 ffff88402e829520 UN 0.0 0 0 [scsi_eh_34]
 185783 28626 1 ffff887a4fa51520 UN 0.0 2669040 69068 cmd
  10861 10849 6 ffff88804e5c0040 UN 0.0 122304 1564 srp_daemon
  10864 10849 6 ffff88804e595520 UN 0.0 122304 1564 srp_daemon

There are also these "sam"/"VSMqfs" related tasks. If the VSMqfs code
 operates directly on Scsi_Host objects, I'd be suspicious of it. If
 it doesn't, then these are likely unrelated.

   PID PPID CPU TASK ST %MEM VSZ RSS COMM
  29688 1 7 ffff88204ccaaab0 IN 0.0 291288 3112 sam-fsd
  30508 29688 8 ffff88604688e040 IN 0.0 24208 972 sam-stagealld
  30516 30506 50 ffff888016ab9520 IN 0.0 552176 91656 sam-arfind
  30564 2 1 ffff886047ce2040 UN 0.0 0 0 [sam_statfs_thre]
  30565 30512 13 ffff88203ba7cab0 IN 0.0 381876 4676 sam-arfind
  30576 29688 22 ffff88804ec34040 IN 0.0 450344 4712 sam-amld
  30690 29688 6 ffff888047b73520 IN 0.0 597880 15860 sam-stagerd
  47588 29688 2 ffff8860323c2ab0 IN 0.0 502452 6604 sam-sharefsd
  47631 29688 12 ffff88604548eab0 IN 0.0 502452 6604 sam-sharefsd

What is VSMqfs? I can't find any information on it.
(3rd-party partner's response pertaining to "What is VSMqfs?": We don't operate on the SCSI host object directly in the file system, only through the block device interfaces. The sam_statfs_thread looks like it's doing something because of the sfind that is running. Likely this was a path failure on the metadata device, which blocked our statfs thread and possibly blocked the sfind.)

For the third case (srp kthread already holds the lock), one would
 need to investigate the functions on the stack of the srp kthread.
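
A purely hypothetical sketch of what case 3 would amount to (this is not
claimed to be the actual reconnect code) may help show why it manifests as a
hard LOCKUP:

#include <linux/spinlock.h>
#include <scsi/scsi_host.h>

/* Hypothetical only: ticket spinlocks in this kernel are not recursive, so a
 * second acquisition of the same host_lock on the same CPU waits for a ticket
 * that can never be served. */
static void host_lock_self_deadlock(struct Scsi_Host *shost)
{
        unsigned long flags, flags2;

        spin_lock_irqsave(shost->host_lock, flags);     /* first acquisition */
        /* ... path that eventually iterates the host's devices ... */
        spin_lock_irqsave(shost->host_lock, flags2);    /* spins forever */

        /* never reached */
        spin_unlock_irqrestore(shost->host_lock, flags2);
        spin_unlock_irqrestore(shost->host_lock, flags);
}

Because the first acquisition already disabled interrupts on that CPU, the
second one spins with IRQs off, which is exactly the condition the NMI
watchdog reports as a hard LOCKUP.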

For the second case (task returned while holding the lock), the only way to figure it out would be by code inspection of all paths, in the kernel and in the loaded kernel modules, that acquire the ->host_lock.

For any of these cases, the exact source code for the kernel and
 kernel modules would be needed.

The other running task was sfind pid 181252. I think it's unrelated,
 as the code on the stack doesn't seem close to interacting with a
 Scsi_Host object:

    [exception RIP: inode_has_perm+68]
    RIP: ffffffff81247cf4 RSP: ffff8856acb2b968 RFLAGS: 00000282
    RAX: 0000000000000007 RBX: ffff8856acb2bc18 RCX: ffff8856acb2ba08
    RDX: 0000000000200000 RSI: ffff881ffaaddac8 RDI: ffff886046a8b780
    RBP: ffff8856acb2b9f8 R8: ffff8856acb2ba08 R9: 0000000000000001
    R10: ffff880a18a05228 R11: ffff8856acb2ba08 R12: ffff886a405dc035
    R13: ffff881ffaaddac8 R14: 0000000000000000 R15: ffff8860329a9520
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
 
--- <NMI exception stack> ---
 
 #6 [ffff8856acb2b968] inode_has_perm at ffffffff81247cf4
 #7 [ffff8856acb2ba00] selinux_inode_permission at ffffffff8124850a
 #8 [ffff8856acb2baa0] security_inode_permission at ffffffff8123f7f5
 #9 [ffff8856acb2bab0] __link_path_walk at ffffffff811ad1ed
#10 [ffff8856acb2bb90] path_walk at ffffffff811ae3ea
#11 [ffff8856acb2bbd0] filename_lookup at ffffffff811ae5fb
#12 [ffff8856acb2bc10] user_path_at at ffffffff811af727
#13 [ffff8856acb2bce0] stat_file_common at ffffffffa0cdc51f [VSMqfs]
#14 [ffff8856acb2bd40] sam_syscall at ffffffffa0cdc7b8 [VSMqfs]
#15 [ffff8856acb2bdf0] samsys_ioctl at ffffffffa0cf153b [VSMqfs]
#16 [ffff8856acb2be30] vfs_ioctl at ffffffff811b2779
#17 [ffff8856acb2be70] do_vfs_ioctl at ffffffff811b2934
#18 [ffff8856acb2bf00] sys_ioctl at ffffffff811b2ec1
#19 [ffff8856acb2bf50] system_call_fastpath at ffffffff8155e351
Tags: 2.6.32-696.23.1.el6.x86_64

Activities

Oscar Santos    2019-02-20 22:02    reporter   ~0033870

Note: the same call trace is exhibited in https://bugs.openfabrics.org/show_bug.cgi?id=2602. As in this CentOS case, multipath is compromised when the panic is incurred. The difference is that the OpenFabrics bug involves OFED on RHEL 6.6, while this CentOS bug occurs with MOFED.

Issue History

Date Modified Username Field Change
2018-11-15 16:23 Oscar Santos New Issue
2018-11-15 16:23 Oscar Santos Tag Attached: 2.6.32-696.23.1.el6.x86_64
2019-02-20 22:02 Oscar Santos Note Added: 0033870