View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0017610 | CentOS-7 | iscsi-initiator-utils | public | 2020-07-21 13:57 | 2020-07-21 17:43 |
Reporter | anandgbhat | Assigned To | |||
Priority | urgent | Severity | crash | Reproducibility | always |
Status | new | Resolution | open | ||
Product Version | 7.8-2003 | ||||
Summary | 0017610: iscsid stuck in D state causing all the iscsi traffic to stall | ||||
Description | iscsid gets stuck in D state when an iSCSI target flips. A single server is configured with multiple iSCSI targets, with multiple LUNs (>100) exposed to the host. We see occasional iSCSI issues wherein iscsid gets stuck in D state when the iSCSI target toggles. This is typically seen when multiple iSCSI targets toggle at the same time.

Here is the state of iscsid:

    [root@system-test-01-bqkp70202642-node-2 cohesity]# ps afx | grep iscsid
     4326 pts/0 S+ 0:00 \_ grep --color=auto iscsid
    19795 ? D<Ls 0:37 /sbin/iscsid -f
    24288 ? D< 0:00 \_ /sbin/iscsid -f
    [root@system-test-01-bqkp70202642-node-2 cohesity]# ps afx | less
    [root@system-test-01-bqkp70202642-node-2 cohesity]# ps aux | grep iscsid
    root 19795 0.0 0.0 61044 9932 ? D<Ls Jul10 0:37 /sbin/iscsid -f
    root 20846 0.0 0.0 112812 968 pts/0 S+ 03:54 0:00 grep --color=auto iscsid
    root 24288 0.0 0.0 61044 3436 ? D< Jul12 0:00 /sbin/iscsid -f

Kernel stack of the child iscsid process (PID 24288), blocked in a SCSI target scan:

    [root@system-test-01-bqkp70202642-node-2 cohesity]# cat /proc/24288/stack
    [<ffffffff8395aa0b>] blk_execute_rq+0xab/0x150
    [<ffffffff83ae89d3>] scsi_execute+0xd3/0x170
    [<ffffffff83aea8ae>] scsi_execute_req_flags+0x8e/0x100
    [<ffffffff83aee1b3>] scsi_probe_and_add_lun+0x243/0xe50
    [<ffffffff83aef172>] scsi_report_lun_scan+0x3b2/0x540
    [<ffffffff83aef731>] __scsi_scan_target+0x121/0x260
    [<ffffffff83aef988>] scsi_scan_target+0x118/0x130
    [<ffffffffc089132b>] iscsi_user_scan_session.part.13+0xdb/0x110 [scsi_transport_iscsi]
    [<ffffffffc0891381>] iscsi_user_scan_session+0x21/0x30 [scsi_transport_iscsi]
    [<ffffffff83ab4c45>] device_for_each_child+0x55/0x90
    [<ffffffffc088efb3>] iscsi_user_scan+0x43/0x60 [scsi_transport_iscsi]
    [<ffffffff83af1918>] store_scan+0xa8/0x100
    [<ffffffff83ab413b>] dev_attr_store+0x1b/0x30
    [<ffffffff838da472>] sysfs_kf_write+0x42/0x50
    [<ffffffff838d9a5b>] kernfs_fop_write+0xeb/0x160
    [<ffffffff8384d1b0>] vfs_write+0xc0/0x1f0
    [<ffffffff8384df7f>] SyS_write+0x7f/0xf0
    [<ffffffff83d92ed2>] system_call_fastpath+0x25/0x2a
    [<ffffffffffffffff>] 0xffffffffffffffff

Kernel stack of the parent iscsid process (PID 19795), blocked in flush_workqueue during session removal:

    [root@system-test-01-bqkp70202642-node-2 cohesity]# cat /proc/19795/stack
    [<ffffffff836bd6cd>] flush_workqueue+0x13d/0x5e0
    [<ffffffff83ae35cd>] scsi_flush_work+0x1d/0x50
    [<ffffffffc0890705>] iscsi_remove_session+0xd5/0x1c0 [scsi_transport_iscsi]
    [<ffffffffc0890a32>] iscsi_destroy_session+0x12/0x50 [scsi_transport_iscsi]
    [<ffffffffc0d2e6f8>] iscsi_session_teardown+0xd8/0x100 [libiscsi]
    [<ffffffffc09bffa0>] iscsi_sw_tcp_session_destroy+0x50/0x70 [iscsi_tcp]
    [<ffffffffc0892301>] iscsi_if_recv_msg+0xc81/0x14f0 [scsi_transport_iscsi]
    [<ffffffffc0892c3b>] iscsi_if_rx+0xcb/0x230 [scsi_transport_iscsi]
    [<ffffffff83c90ce0>] netlink_unicast+0x170/0x210
    [<ffffffff83c91088>] netlink_sendmsg+0x308/0x420
    [<ffffffff83c333a6>] sock_sendmsg+0xb6/0xf0
    [<ffffffff83c34269>] ___sys_sendmsg+0x3e9/0x400
    [<ffffffff83c35921>] __sys_sendmsg+0x51/0x90
    [<ffffffff83c35972>] SyS_sendmsg+0x12/0x20
    [<ffffffff83d92ed2>] system_call_fastpath+0x25/0x2a
    [<ffffffffffffffff>] 0xffffffffffffffff

Here are the iscsid and kernel versions:

    $ uname -a
    Linux system-test-01-bqkp80201813-node-4 3.10.0-1127.13.1.el7.x86_64 #1 SMP Tue Jun 23 15:46:38 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
    $ iscsid --version
    iscsid version 6.2.0.874-17

We crashed the kernel using sysrq to collect the crash dump and logs. The vmcore dmesg is attached. | ||||
Steps To Reproduce | Map multiple LUNs from multiple iSCSI targets. Continuously run I/O on each of the LUNs while flipping the iSCSI targets. iscsid eventually gets stuck in D state. | ||||
Tags | 3.10.0.-1127 | ||||
abrt_hash | |||||
URL | |||||
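The manual diagnosis in the description (listing processes in D state with `ps aux`, then reading `/proc/<pid>/stack` for each) could be scripted roughly as follows. This is only a sketch and not part of the original report: the function names `find_d_state`, `kernel_stack`, and `report` are hypothetical, it assumes the standard 11-column `ps aux` layout, and reading `/proc/<pid>/stack` typically requires root.

```python
import subprocess
from pathlib import Path


def find_d_state(ps_aux_lines):
    """Return (pid, stat, command) for each process whose STAT starts with 'D'.

    Assumes the usual `ps aux` columns:
    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    """
    hits = []
    for line in ps_aux_lines:
        # COMMAND is the 11th column and may itself contain spaces.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # skip the header row and malformed lines
        pid, stat, command = int(fields[1]), fields[7], fields[10]
        if stat.startswith("D"):  # D = uninterruptible sleep
            hits.append((pid, stat, command))
    return hits


def kernel_stack(pid):
    """Read the kernel stack of a process; returns a placeholder on failure."""
    try:
        return Path(f"/proc/{pid}/stack").read_text()
    except OSError as exc:
        return f"<unavailable: {exc}>"


def report():
    """Print every D-state process on this host with its kernel stack."""
    out = subprocess.run(
        ["ps", "aux"], capture_output=True, text=True, check=True
    ).stdout
    for pid, stat, command in find_d_state(out.splitlines()):
        print(f"{pid} {stat} {command}")
        print(kernel_stack(pid))
```

Calling `report()` as root on the affected host would print each D-state process followed by its kernel stack, which may be convenient to attach when cross-filing the bug.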
CentOS is a rebuild of the sources used to create RHEL and aims to reproduce RHEL bug for bug and feature for feature. Please submit your request to Red Hat via bugzilla.redhat.com; if and when Red Hat accepts it, incorporates it into RHEL, and releases a patched version, CentOS will pick it up automatically. For easier tracking, please crosslink this bug with the one opened at bugzilla.redhat.com. |
|
Date Modified | Username | Field | Change |
---|---|---|---|
2020-07-21 13:57 | anandgbhat | New Issue | |
2020-07-21 13:57 | anandgbhat | File Added: vmcore-dmesg.webarchive | |
2020-07-21 13:57 | anandgbhat | Tag Attached: 3.10.0.-1127 | |
2020-07-21 17:43 | ManuelWolfshant | Note Added: 0037387 |