Summary: 0017610: iscsid stuck in D state causing all the iscsi traffic to stall
Description: iscsid gets stuck in D state when an iSCSI target flips. A single server is configured with multiple iSCSI targets, with multiple LUNs (>100) exposed to the host. We occasionally see iscsid get stuck in D state when an iSCSI target toggles. This typically happens when multiple iSCSI targets toggle at the same time.

Here is the state of iscsid:

[root@system-test-01-bqkp70202642-node-2 cohesity]# ps afx | grep iscsid
 4326 pts/0 S+ 0:00 \_ grep --color=auto iscsid
19795 ? D<Ls 0:37 /sbin/iscsid -f
24288 ? D< 0:00 \_ /sbin/iscsid -f
[root@system-test-01-bqkp70202642-node-2 cohesity]# ps afx | less
[root@system-test-01-bqkp70202642-node-2 cohesity]# ps aux | grep iscsid
root 19795 0.0 0.0 61044 9932 ? D<Ls Jul10 0:37 /sbin/iscsid -f
root 20846 0.0 0.0 112812 968 pts/0 S+ 03:54 0:00 grep --color=auto iscsid
root 24288 0.0 0.0 61044 3436 ? D< Jul12 0:00 /sbin/iscsid -f
[root@system-test-01-bqkp70202642-node-2 cohesity]# cat /proc/24288/stack
[<ffffffff8395aa0b>] blk_execute_rq+0xab/0x150
[<ffffffff83ae89d3>] scsi_execute+0xd3/0x170
[<ffffffff83aea8ae>] scsi_execute_req_flags+0x8e/0x100
[<ffffffff83aee1b3>] scsi_probe_and_add_lun+0x243/0xe50
[<ffffffff83aef172>] scsi_report_lun_scan+0x3b2/0x540
[<ffffffff83aef731>] __scsi_scan_target+0x121/0x260
[<ffffffff83aef988>] scsi_scan_target+0x118/0x130
[<ffffffffc089132b>] iscsi_user_scan_session.part.13+0xdb/0x110 [scsi_transport_iscsi]
[<ffffffffc0891381>] iscsi_user_scan_session+0x21/0x30 [scsi_transport_iscsi]
[<ffffffff83ab4c45>] device_for_each_child+0x55/0x90
[<ffffffffc088efb3>] iscsi_user_scan+0x43/0x60 [scsi_transport_iscsi]
[<ffffffff83af1918>] store_scan+0xa8/0x100
[<ffffffff83ab413b>] dev_attr_store+0x1b/0x30
[<ffffffff838da472>] sysfs_kf_write+0x42/0x50
[<ffffffff838d9a5b>] kernfs_fop_write+0xeb/0x160
[<ffffffff8384d1b0>] vfs_write+0xc0/0x1f0
[<ffffffff8384df7f>] SyS_write+0x7f/0xf0
[<ffffffff83d92ed2>] system_call_fastpath+0x25/0x2a
[<ffffffffffffffff>] 0xffffffffffffffff
[root@system-test-01-bqkp70202642-node-2 cohesity]# cat /proc/19795/stack
[<ffffffff836bd6cd>] flush_workqueue+0x13d/0x5e0
[<ffffffff83ae35cd>] scsi_flush_work+0x1d/0x50
[<ffffffffc0890705>] iscsi_remove_session+0xd5/0x1c0 [scsi_transport_iscsi]
[<ffffffffc0890a32>] iscsi_destroy_session+0x12/0x50 [scsi_transport_iscsi]
[<ffffffffc0d2e6f8>] iscsi_session_teardown+0xd8/0x100 [libiscsi]
[<ffffffffc09bffa0>] iscsi_sw_tcp_session_destroy+0x50/0x70 [iscsi_tcp]
[<ffffffffc0892301>] iscsi_if_recv_msg+0xc81/0x14f0 [scsi_transport_iscsi]
[<ffffffffc0892c3b>] iscsi_if_rx+0xcb/0x230 [scsi_transport_iscsi]
[<ffffffff83c90ce0>] netlink_unicast+0x170/0x210
[<ffffffff83c91088>] netlink_sendmsg+0x308/0x420
[<ffffffff83c333a6>] sock_sendmsg+0xb6/0xf0
[<ffffffff83c34269>] ___sys_sendmsg+0x3e9/0x400
[<ffffffff83c35921>] __sys_sendmsg+0x51/0x90
[<ffffffff83c35972>] SyS_sendmsg+0x12/0x20
[<ffffffff83d92ed2>] system_call_fastpath+0x25/0x2a
[<ffffffffffffffff>] 0xffffffffffffffff
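
In the two stacks above, PID 24288 is blocked in blk_execute_rq under a LUN scan (scsi_scan_target), and PID 19795 is blocked in flush_workqueue under iscsi_remove_session. When comparing many such dumps, a small parser can pull out the innermost frame of each. This is only a rough sketch that assumes the `[<addr>] symbol+off/size [module]` layout shown above; the names `parse_frames` and `blocking_frame` are illustrative and not part of any tool.

```python
import re

# One frame as printed by /proc/<pid>/stack on this kernel, e.g.
#   [<ffffffffc0890705>] iscsi_remove_session+0xd5/0x1c0 [scsi_transport_iscsi]
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<symbol>[^\s+]+)"
    r"(?:\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+))?"
    r"(?:\s+\[(?P<module>[^\]]+)\])?"
)

# The trailing "[<ffffffffffffffff>] 0xffffffffffffffff" line is an end marker.
END_MARKER = "f" * 16


def parse_frames(text):
    """Return a list of (symbol, module) tuples, innermost frame first."""
    frames = []
    for line in text.splitlines():
        m = FRAME_RE.search(line)
        if not m or m.group("addr") == END_MARKER:
            continue
        frames.append((m.group("symbol"), m.group("module")))
    return frames


def blocking_frame(text):
    """Return the innermost (symbol, module) frame, or None if none parsed."""
    frames = parse_frames(text)
    return frames[0] if frames else None
```

Feeding it the contents of /proc/<pid>/stack (readable as root) gives the function the task is currently sleeping in; module is None for symbols built into the kernel image.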

Here are the iscsid and kernel versions:

$ uname -a
Linux system-test-01-bqkp80201813-node-4 3.10.0-1127.13.1.el7.x86_64 #1 SMP Tue Jun 23 15:46:38 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
$ iscsid --version
iscsid version

We crashed the kernel using sysrq to collect the crash dump and logs. The vmcore dmesg is attached.

Steps To Reproduce:
Map multiple LUNs from multiple iSCSI targets.
Continuously run I/O on each of the LUNs while flipping the iSCSI targets.
iscsid gets into D state.
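
Since the hang is intermittent, it may help to poll for it while the reproduction runs instead of checking ps by hand. The following is a minimal sketch, assuming the standard `Name:` and `State:` lines of /proc/<pid>/status; `parse_status` and `find_d_state` are hypothetical helper names, and the `proc_root` parameter exists only so the scan can be pointed at a test directory.

```python
import os


def parse_status(text):
    """Return (name, state_letter) from the text of /proc/<pid>/status."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key] = value.strip()
    # State looks like "D (disk sleep)"; keep only the leading letter.
    return fields.get("Name", ""), fields.get("State", "")[:1]


def find_d_state(name="iscsid", proc_root="/proc"):
    """Return sorted PIDs whose command name matches and whose state is D."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "status")) as f:
                comm, state = parse_status(f.read())
        except OSError:
            continue  # the process exited between listdir() and open()
        if comm == name and state == "D":
            pids.append(int(entry))
    return sorted(pids)
```

Calling find_d_state() in a loop with a short sleep and logging the first non-empty result would give an approximate time at which iscsid entered D state, which can then be matched against the target flip events. A task can pass through D state briefly during normal I/O, so a single hit is not proof of a hang; several consecutive hits for the same PID are a stronger signal.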




manager   ~0037387

CentOS is a rebuild of the sources used to create RHEL and aims to reproduce RHEL bug for bug and feature for feature. Please submit your request to Red Hat via and if/when Red Hat accepts it, incorporates it into RHEL, and releases a patched version, then CentOS will pick it up automatically.
For easier tracking, please crosslink this bug with the one opened at

