View Issue Details

| ID | Project | Category | View Status | Date Submitted | Last Update |
|---|---|---|---|---|---|
| 0015393 | CentOS-7 | kernel | public | 2018-10-19 22:34 | 2019-09-23 17:35 |

| Field | Value |
|---|---|
| Reporter | tylerni7 |
| Priority | normal |
| Severity | minor |
| Reproducibility | always |
| Status | acknowledged |
| Resolution | open |
| Product Version | 7.5.1804 |
| Target Version | |
| Fixed in Version | |
| Summary | 0015393: Unshare of network namespaces slows down |
Description

Running unshare with CLONE_NEWNET does not cause issues when run only occasionally. However, running it frequently causes subsequent calls to take very long, up to several seconds or even a minute. As far as I can tell this was fixed in more recent kernels, but I don't know which commit or version introduced the fix. The slowdown can be especially bad on systems with many cores, as there seems to be a single lock held while freeing the allocated namespaces/interfaces, which blocks creation of subsequent namespaces.
Steps To Reproduce

An easy way to observe this problem is to call unshare -n in a tight loop. While a single unshare -n might take 2-50 ms to complete in normal operation, when called repeatedly in a single thread this quickly jumps up to about 500 ms per call. A simple script such as

    time sudo bash -c 'for i in $(seq 100); do unshare -n echo -n ; done'

takes about 45 seconds to complete on my machine.

To verify the slowdown is related to the kernel unshare syscall, one can run strace on unshare while the above is running:

    $ sudo strace -Te trace=unshare unshare -n echo -n
    unshare(CLONE_NEWNET)                   = 0 <1.214871>
    +++ exited with 0 +++
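Not part of the original report: for anyone who wants to hit the syscall directly instead of forking unshare(1) in a loop, below is a minimal C sketch that times each unshare(CLONE_NEWNET) call from a single process. Every call moves the process into a fresh network namespace and leaves the previous one to be torn down asynchronously, so it exercises the same cleanup path described above. It assumes root (CAP_SYS_ADMIN) and the glibc unshare() wrapper; the file and program names are made up for illustration.

```c
/*
 * Minimal sketch (not the reporter's tool): time unshare(CLONE_NEWNET)
 * from a single process.
 * Build: gcc -O2 -o nsbench nsbench.c
 * Run as root.
 */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <time.h>

int main(void)
{
    for (int i = 0; i < 100; i++) {
        struct timespec start, end;

        clock_gettime(CLOCK_MONOTONIC, &start);
        if (unshare(CLONE_NEWNET) != 0) {   /* create and switch to a new netns */
            perror("unshare(CLONE_NEWNET)");
            return 1;
        }
        clock_gettime(CLOCK_MONOTONIC, &end);

        /* per-call latency in milliseconds */
        double ms = (end.tv_sec - start.tv_sec) * 1e3 +
                    (end.tv_nsec - start.tv_nsec) / 1e6;
        printf("iteration %3d: %8.3f ms\n", i, ms);
    }
    return 0;
}
```

On an affected kernel the per-call times should climb the same way the strace measurement above does, since each iteration also queues the previous namespace for teardown.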
Additional Information

This is an issue that was fixed in more recent kernels, as best I can tell. Another machine running Debian on identical hardware, but with kernel 4.9.0, takes about 4 seconds rather than 45 to run the above shell script. On machines with many cores each running unshare, the average time jumps from a few hundred milliseconds to several seconds; when network namespaces are under heavy load, it climbs to tens of seconds and beyond.
| Field | Value |
|---|---|
| Tags | No tags attached. |
| abrt_hash | |
| URL | |

Notes
toracat (note 0032947, 2018-10-19 22:43):

@tylerni7 You can test more recent kernels on CentOS by using ELRepo's kernel-lt (4.4) or kernel-ml (4.18): http://elrepo.org/linux/kernel/el7/x86_64/RPMS/

Mind giving them a try?
TrevorH (note 0032948, 2018-10-19 22:56):

And if you want to get it fixed in the CentOS/RHEL kernels then you'll also need to report this on bugzilla.redhat.com. CentOS only rebuilds what is used to build RHEL, so fixes have to come from upstream.
tylerni7 (note 0032949, 2018-10-19 23:09):

Sounds good. I'll try to test with those kernels and get back to you. Is it worth keeping this report open and also filing a report on bugzilla.redhat.com? Since I only have CentOS to test on, I wasn't sure whether I'd be asked about things there that I wouldn't be able to test.
toracat (note 0032950, 2018-10-19 23:16):

@tylerni7 If we can identify the patch(es)/commits that fix the issue, and if they can be backported to the current kernel, CentOS will offer the centosplus kernel that includes the patch(es). Then you can provide the info upstream (RH) for them to consider.
tylerni7 (note 0032951, 2018-10-20 00:13):

Using the ELRepo kernel-lt (specifically 4.4.161-1.el7.elrepo.x86_64), the issue seems to be gone:

    $ time sudo bash -c 'for i in $(seq 100); do unshare -n echo -n ; done'
    real 0m5.456s

Not sure what the best way to track down the patch(es)/commit(s) is, but I guess that's a start.
pgreco (note 0032952, 2018-10-20 14:09):

@tylerni7, I've been trying to follow the patches, but there are too many changes related to netfilter. If you could test with older versions of 4.4, it would be a good way to narrow down the range of changes. You can get older ELRepo 4.4.x kernels from http://mirrors.coreix.net/elrepo-archive-archive/kernel/el7/x86_64/RPMS/ so you don't have to rebuild. Thanks!
pgreco (note 0032963, 2018-10-22 10:56):

Also, I've been trying to replicate this bug locally and I couldn't; 3.10.0-862.14.4 is even faster than 4.4.161-1.el7.elrepo.x86_64 here. Please tell us more about your system to help us replicate the problem. Thanks.
tylerni7 (note 0032964, 2018-10-22 18:40):

Hm, gross. I'll see if I can find a good way to reproduce the issue in a standard environment. I can verify that what I was doing does not reproduce on a fresh machine; I don't know whether it only manifests after the machine gets into a particular state, or something else. The machines I was testing on are in use, which may be what triggers the issue.

While researching this I came across a number of previous(?) kernel issues that might be relevant. One is a race condition with clone(CLONE_NEWNET) which, once triggered, prevents clone(CLONE_NEWNET) from succeeding until the machine reboots: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1021471 (I know that's the Ubuntu issue tracker, but it seems to be just a kernel issue). The other, http://lkml.iu.edu/hypermail/linux/kernel/1704.2/04491.html, is likely the same issue and has some material about reproducing it, so I'll look further into that and report back.
tylerni7 (note 0033008, 2018-10-26 20:19):

Just a quick update: I'm still having some trouble reproducing this nicely. When we hit the issue locally, the machine is running Kubernetes, which sets up a number of things that might affect performance. On newer kernels (under slightly different circumstances; I still need to figure out what "newer" means, but at least 4.15) unshare calls can still take several seconds to complete, but with RCU_EXPEDITED=1 set they go back to taking milliseconds. I will try to come up with a better way to reproduce the issue and post an update when that happens.
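The note above does not say exactly how RCU_EXPEDITED=1 was applied; one common way to force expedited RCU grace periods at runtime is the /sys/kernel/rcu_expedited knob (the boot-time equivalent is rcupdate.rcu_expedited=1). The sketch below assumes that is the mechanism meant here; it is not confirmed by the report.

```c
/*
 * Sketch only: force expedited RCU grace periods system-wide by writing
 * "1" to /sys/kernel/rcu_expedited. Assumes this is the knob the note
 * refers to (not confirmed by the report). Requires root.
 */
#include <stdio.h>

int main(void)
{
    FILE *f = fopen("/sys/kernel/rcu_expedited", "w");

    if (f == NULL) {
        perror("fopen /sys/kernel/rcu_expedited");
        return 1;
    }
    if (fputs("1\n", f) == EOF || fclose(f) == EOF) {
        perror("write /sys/kernel/rcu_expedited");
        return 1;
    }
    return 0;
}
```

Expedited grace periods trade extra IPI traffic for much shorter RCU waits, which would be consistent with network-namespace teardown (and therefore subsequent unshare calls) speeding up here.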
pgreco (note 0035192, 2019-09-22 15:24):

@tylerni7 I'm doing some cleanup here. Are you still seeing this, even without a specific way to replicate it?
tylerni7 (note 0035206, 2019-09-23 17:35):

We were still seeing this. Eventually we moved to a kernel >= 4.17 from ELRepo, which includes https://github.com/torvalds/linux/commit/1a57feb847c56d6193f67d0e892c24e71f9e3ab1 and seems to help quite a bit. It appears difficult to reproduce easily because something needs to start causing contention before everything slows down, so a fresh system may not hit it. Anyway, since we've worked around this, feel free to do whatever you think is appropriate with the ticket. Cheers!

Issue History

Date Modified | Username | Field | Change |
---|---|---|---|
2018-10-19 22:34 | tylerni7 | New Issue | |
2018-10-19 22:43 | toracat | Note Added: 0032947 | |
2018-10-19 22:56 | TrevorH | Note Added: 0032948 | |
2018-10-19 23:09 | tylerni7 | Note Added: 0032949 | |
2018-10-19 23:16 | toracat | Note Added: 0032950 | |
2018-10-20 00:13 | tylerni7 | Note Added: 0032951 | |
2018-10-20 13:58 | toracat | Status | new => acknowledged |
2018-10-20 14:09 | pgreco | Note Added: 0032952 | |
2018-10-22 10:56 | pgreco | Note Added: 0032963 | |
2018-10-22 18:40 | tylerni7 | Note Added: 0032964 | |
2018-10-26 20:19 | tylerni7 | Note Added: 0033008 | |
2019-09-22 15:24 | pgreco | Note Added: 0035192 | |
2019-09-23 17:35 | tylerni7 | Note Added: 0035206 |