View Issue Details
|ID||Project||Category||View Status||Date Submitted||Last Update|
|0015695||CentOS-7||systemd||public||2019-01-11 06:33||2019-03-28 08:04|
|Platform||alibaba cloud||OS||centos||OS Version||7.4|
|Target Version||Fixed in Version|
|Summary||0015695: runc hang on systemd dbus invoke when systemd cgroup driver|
|Description||Yesterday one node of my Kubernetes cluster became NotReady. ps -ef showed that some docker-runc processes had been running for many days:
root 26579 1303 0 2018 ? 00:00:00 docker-runc --systemd-cgroup=true events --stats c29996ea9566f16616505e7118315635582714308564ba0d9a70f8fb8cf73f0a
root 27841 2913 0 2018 ? 00:00:00 docker-runc --systemd-cgroup=true kill --all 8561b78c9cb19c0d883e30eafc8ff41ddf3007043985271386ffdbafc24d4376 SIGKILL
root 28293 1303 0 2018 ? 00:00:00 docker-runc --systemd-cgroup=true delete 25660e4c1f66593ec33ae57823def641a4c4a9ae1a7c6840afd081961b66e66e
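A minimal sketch for spotting such long-lived docker-runc processes automatically, instead of reading ps -ef by eye. It assumes a procps-ng ps that supports the etimes (elapsed seconds) field; the function name filter_stale and the one-day threshold are hypothetical and not part of the original report.

```shell
# filter_stale LIMIT_SECONDS
# Reads "PID ELAPSED_SECONDS COMMAND [ARGS...]" lines on stdin and prints the
# ones whose command is docker-runc and whose age exceeds LIMIT_SECONDS.
# (Hypothetical helper name, for illustration only.)
filter_stale() {
    awk -v limit="$1" '$3 ~ /(^|\/)docker-runc$/ && $2 + 0 > limit + 0 { print }'
}

# Example use (assumes procps-ng; 86400 seconds = one day, an arbitrary choice):
#   ps -e -o pid= -o etimes= -o args= | filter_stale 86400
```

Any line printed by the example is a docker-runc process older than the threshold and worth inspecting.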
After some investigation, I found that docker-runc hung when calling systemd.UseSystemd (the stack traces are in the runc issue linked below).
In fact, no D-Bus method call sent to org.freedesktop.systemd1 received a response. For example, the command below would wait forever:
dbus-send --system --dest=org.freedesktop.systemd1 --type=method_call --print-reply /org/freedesktop/systemd1 org.freedesktop.DBus.Introspectable.Introspect
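Since the command above blocks indefinitely on an affected node, a bounded probe may be more practical for monitoring. The sketch below wraps any command in coreutils timeout and reports whether it finished, failed, or hit the deadline; the function name probe_with_timeout and the 5-second deadline are assumptions for illustration.

```shell
# probe_with_timeout SECONDS CMD [ARGS...]
# Runs CMD under coreutils `timeout` and prints one of:
#   ok          CMD exited 0 within the deadline
#   hung        CMD was still running at the deadline (timeout exits with 124)
#   failed(N)   CMD exited with non-zero status N
# (Hypothetical helper name, for illustration only.)
probe_with_timeout() {
    probe_secs="$1"
    shift
    timeout "$probe_secs" "$@" >/dev/null 2>&1
    probe_rc=$?
    case "$probe_rc" in
        0)   echo "ok" ;;
        124) echo "hung" ;;
        *)   echo "failed($probe_rc)" ;;
    esac
}

# Example use with the introspection call from this report (5 s is arbitrary):
#   probe_with_timeout 5 dbus-send --system --dest=org.freedesktop.systemd1 \
#       --type=method_call --print-reply /org/freedesktop/systemd1 \
#       org.freedesktop.DBus.Introspectable.Introspect
```

A "hung" result from the example would match the symptom described here: systemd is not answering on the system bus.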
Also there were many systemd errors in /var/log/messages:
Jan 4 11:56:31 host-k8s-node001 systemd: Failed to propagate agent release message: Operation not supported
busctl tree reported: Failed to introspect object / of service org.freedesktop.systemd1: Connection timed out
The problem was resolved by re-executing systemd: systemctl daemon-reexec
For more stack info, see: https://github.com/opencontainers/runc/issues/1959
|Steps To Reproduce||I could not reproduce it by running many runc operations, but I have hit this issue several times in my production environment.|
|Tags||No tags attached.|
This issue was fixed in upstream systemd by https://github.com/systemd/systemd/pull/11818.
Will the systemd shipped with CentOS cherry-pick this fix? If so, which version will resolve this?
*CentOS* doesn't cherry-pick it at all. Red Hat will need to do that for RHEL and, once they have done so and released the patched version, CentOS will rebuild and release it. I would suggest raising a ticket on bugzilla.redhat.com to see if they will backport it to the RHEL systemd.
Also, 7.4 is two point releases and nearly two years out of date. Run yum update.
|@TrevorH Thanks, I have submitted an issue to the RHEL bug tracker.|