View Issue Details

ID: 0015162    Project: CentOS-7    Category: dlm    View Status: public    Last Update: 2018-09-20 15:25
Reporter: vnick
Priority: normal    Severity: crash    Reproducibility: always
Status: new    Resolution: open
Platform: VMware x86-64    OS: CentOS-7    OS Version: 7.5.1804
Product Version: 7.5.1804
Target Version:    Fixed in Version:
Summary: 0015162: Soft Lockup with DLM over SCTP on Multihomed Host
Description: I'm running a cluster configuration on a pair of multi-homed virtual machines. I've used corosync, pacemaker, and pcs to set this up. Each of the VMs has three network interfaces - two of them are connected directly to each other through separate physical 10GbE links, and the third is a management interface.

When setting up the cluster with pcs, I'm using the option to specify multiple network paths for each node. I then create the two resources - dlm and clvmd. I'm able to successfully start dlm_controld, but when I try to start clvmd, I get the "CPU1 Bug Soft Lockup" message and the console or SSH session becomes unresponsive.
Steps To Reproduce:
1. Install all of the cluster prereqs and configure networking
2. pcs cluster setup --start --name cluster node1-int1,node1-int2 node2-int1,node2-int2
3. pcs resource create dlm ocf:pacemaker:controld op monitor interval=30s on-fail=fence clone interleave=true ordered=true
4. pcs resource create clvmd ocf:heartbeat:clvm op monitor interval=30s on-fail=fence clone interleave=true ordered=true
5. pcs constraint order start dlm-clone then clvmd-clone
6. pcs constraint colocation add clvmd-clone with dlm-clone
7. (node1) pcs resource debug-start dlm
8. (node2) pcs resource debug-start dlm
9. (node1) pcs resource debug-start clvmd
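
For reference, passing two comma-separated addresses per node in step 2 should make pcs generate a corosync.conf with two rings per node and rrp_mode enabled - roughly like the sketch below (illustrative only, not my actual generated file; quorum and logging sections omitted). As I understand it, it's that second ring that makes dlm switch from TCP to SCTP.

totem {
    version: 2
    cluster_name: cluster
    secauth: off
    transport: udpu
    rrp_mode: passive    # assumed: pcs adds this when each node gets two ring addresses
}

nodelist {
    node {
        ring0_addr: node1-int1
        ring1_addr: node1-int2
        nodeid: 1
    }

    node {
        ring0_addr: node2-int1
        ring1_addr: node2-int2
        nodeid: 2
    }
}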
Additional Information: dmesg output:

[106764.704962] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [clvmd:46431]
[106764.705889] Modules linked in: sctp drbd_transport_tcp(OE) drbd(OE) dlm nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter vmw_vsock_vmci_transport vsock sunrpc sb_edac iosf_mbi crc32_pclmul ghash_clmulni_intel aesni_intel ppdev lrw gf128mul glue_helper ablk_helper cryptd vmw_balloon pcspkr joydev i2c_piix4 sg vmw_vmci shpchp parport_pc parport ip_tables xfs libcrc32c sr_mod cdrom
[106764.705930] sd_mod crc_t10dif crct10dif_generic ata_generic pata_acpi vmwgfx drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm crct10dif_pclmul crct10dif_common crc32c_intel drm mptspi scsi_transport_spi mptscsih ata_piix libata serio_raw mptbase vmxnet3 i2c_core dm_mirror dm_region_hash dm_log dm_mod
[106764.705946] CPU: 0 PID: 46431 Comm: clvmd Tainted: G OEL ------------ 3.10.0-862.9.1.el7.x86_64 #1
[106764.705947] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 09/21/2015
[106764.705949] task: ffff9660e6f9cf10 ti: ffff96603e448000 task.ti: ffff96603e448000
[106764.705950] RIP: 0010:[<ffffffff8655b1e9>] [<ffffffff8655b1e9>] __write_lock_failed+0x9/0x20
[106764.705957] RSP: 0018:ffff96603e44bd88 EFLAGS: 00000297
[106764.705958] RAX: ffff96603e44bfd8 RBX: ffff96603e44bd30 RCX: 0000000000000000
[106764.705959] RDX: ffff96603e548000 RSI: 0000000000000200 RDI: ffff96603e5481ac
[106764.705960] RBP: ffff96603e44bd88 R08: 0000000000000004 R09: 0000000000000000
[106764.705961] R10: ffff96603eef4180 R11: d2057d5ef9a07b65 R12: ffff96603e44bd58
[106764.705963] R13: ffff96603e44bdcc R14: 0000000000000004 R15: 0000000000000000
[106764.705964] FS: 00007f80a4d17880(0000) GS:ffff9660f9600000(0000) knlGS:0000000000000000
[106764.705966] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[106764.705984] CR2: 00007f80a2e0efe0 CR3: 00000000acf84000 CR4: 00000000003607f0
[106764.705989] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[106764.705990] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[106764.705991] Call Trace:
[106764.705997] [<ffffffff869166ba>] _raw_write_lock_bh+0x2a/0x30
[106764.706004] [<ffffffffc06a381e>] save_listen_callbacks.isra.5+0x1e/0x70 [dlm]
[106764.706008] [<ffffffffc06a4ee2>] dlm_lowcomms_start+0x462/0x580 [dlm]
[106764.706012] [<ffffffffc06a033b>] dlm_new_lockspace+0x10b/0x170 [dlm]
[106764.706016] [<ffffffffc06a9f1f>] device_write+0x38f/0x770 [dlm]
[106764.706020] [<ffffffff8641b490>] vfs_write+0xc0/0x1f0
[106764.706022] [<ffffffff8641c2bf>] SyS_write+0x7f/0xf0
[106764.706026] [<ffffffff86920795>] system_call_fastpath+0x1c/0x21
[106764.706027] Code: 00 00 e9 03 00 00 00 41 ff e7 e8 07 00 00 00 f3 90 0f ae e8 eb f9 4c 89 3c 24 c3 90 90 90 90 90 90 90 55 48 89 e5 f0 ff 07 f3 90 <83> 3f 01 75 f9 f0 ff 0f 75 f1 5d c3 90 66 2e 0f 1f 84 00 00 00
Tags: 3.10.0-862.9.1.el7, 7.5, cluster, clvmd

Activities

dragle    2018-09-13 17:08    reporter    ~0032717

I'm thinking this is the same as an issue I'm having; at the very least there are many similarities.

In my case, I'm not running VMs; just two physical servers. But like you I'm running DRBD + dlm.

In my case I start with a working cluster. The exact problem you describe starts as soon as I try to add redundant ring protocol (RRP) to my corosync configuration. That is, prior to my change my corosync.conf looks like this:

totem {
    version: 2
    cluster_name: MyCluster
    secauth: off
    transport: udpu
}

nodelist {
    node {
        ring0_addr: node1.mydomain.com
        nodeid: 1
    }

    node {
        ring0_addr: node2.mydomain.com
        nodeid: 2
    }
}

quorum {
    provider: corosync_votequorum
    two_node: 1
}

logging {
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_syslog: yes
}

That configuration works fine and gives me no trouble. But when I change it to look like this:

totem {
    version: 2
    cluster_name: MyCluster
    secauth: off
    transport: udpu
    rrp_mode: passive
}

nodelist {
    node {
        ring0_addr: node1.mydomain.com
        ring1_addr: node1lan.mydomain.com
        nodeid: 1
    }

    node {
        ring0_addr: node2.mydomain.com
        ring1_addr: node2lan.mydomain.com
        nodeid: 2
    }
}

quorum {
    provider: corosync_votequorum
    two_node: 1
}

logging {
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_syslog: yes
}

node1/node2 and node1lan/node2lan resolve via /etc/hosts to two different private networks over two different NICs on each machine. Both networks connect and communicate ok, at least until the cluster starts.
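
For illustration, the /etc/hosts layout is along these lines (the addresses here are made up; the point is just that each name pair lives on its own subnet and NIC):

# ring0 network (first NIC)
192.168.10.1    node1.mydomain.com
192.168.10.2    node2.mydomain.com
# ring1 network (second NIC)
192.168.20.1    node1lan.mydomain.com
192.168.20.2    node2lan.mydomain.com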

After restarting my cluster I start getting the error you describe as soon as my filesystem mounts start (they never complete). DRBD and dlm start and appear to be OK. According to the message the CPU was stuck in mount, and the mounts time out. Over time I can see the load averages on both machines consistently climbing, with at least one CPU continuously spinning at 100%. As soon as it happens I can no longer SSH to the other node, and eventually the console on the node I am on becomes completely unresponsive.

The FS is GFS2. So far the only thing I've been able to do when it happens is hard power off the machines. Then, when I remove the RRP stuff and restart the cluster, all is well again.

It's worth noting that several weeks ago I tried setting up a dlm/lvmlockd configuration (with actual clustered volumes) and had the same problem. At the time I thought it was more related to lvmlockd, since my errors were reported as coming from within it (same CPU#n stuck message). Since I didn't technically need the lvmlockd layer (in my situation I'm fine with just basing the DRBD volumes on an existing LVM volume), I reverted all of that, including the RRP which I had tried at the same time. But now I'm wondering if it's something to do with RRP/dlm/GFS2 interactions.
dragle    2018-09-20 15:25    reporter    ~0032765

Not an answer, but FYI, per https://access.redhat.com/articles/3068921 (support contract required):

"dlm with RRP / SCTP: Red Hat does not support the usage of dlm or dlm-using components when dlm is configured to use SCTP communications, also known as "multi-homing". dlm automatically enables SCTP communications if the cluster is configured to use redundant rings (RRP) - meaning DLM is not supported in RRP clusters."

Issue History

Date Modified Username Field Change
2018-08-12 02:01 vnick New Issue
2018-08-12 02:01 vnick Tag Attached: 3.10.0-862.9.1.el7
2018-08-12 02:01 vnick Tag Attached: 7.5
2018-08-12 02:01 vnick Tag Attached: clvmd
2018-08-12 02:01 vnick Tag Attached: cluster
2018-09-13 17:08 dragle Note Added: 0032717
2018-09-20 15:25 dragle Note Added: 0032765