2017-08-18 01:13 UTC

View Issue Details Jump to Notes ]
IDProjectCategoryView StatusLast Update
0013146CentOS-7opensshpublic2017-05-04 08:38
Reportermartini5468 
PriorityhighSeverityminorReproducibilityalways
StatusnewResolutionopen 
Product Version7.3.1611 
Target VersionFixed in Version 
Summary0013146: sshd.service fails start during boot if ListenAddress directive is used in /etc/ssh/sshd_config
DescriptionAfter installing updates[1] on CentOS7 and rebooting the server SSHD service failed to start with the following errors:

systemd[1]: Starting OpenSSH server daemon...
sshd[691]: error: Bind to port 22 on 192.168.16.55 failed: Cannot assign requested address.
sshd[691]: fatal: Cannot bind any address.
systemd[1]: sshd.service: main process exited, code=exited, status=255/n/a
systemd[1]: Failed to start OpenSSH server daemon.
systemd[1]: Unit sshd.service entered failed state.
systemd[1]: sshd.service failed.

After that logging through the console and restarting the SSH service fixes the issue.

Steps To Reproduce1. Insert `ListenAddress` directive in /etc/ssh/sshd_config
2. Restart machine
3. sshd.service is not started
Additional Information[1] List of packets updated that caused the issue:
Packages Altered:
    Updated NetworkManager-1:1.4.0-17.el7_3.x86_64 @updates
    Update 1:1.4.0-19.el7_3.x86_64 @updates
    Updated NetworkManager-libnm-1:1.4.0-17.el7_3.x86_64 @updates
    Update 1:1.4.0-19.el7_3.x86_64 @updates
    Updated NetworkManager-tui-1:1.4.0-17.el7_3.x86_64 @updates
    Update 1:1.4.0-19.el7_3.x86_64 @updates
    Updated NetworkManager-wifi-1:1.4.0-17.el7_3.x86_64 @updates
    Update 1:1.4.0-19.el7_3.x86_64 @updates
    Updated ca-certificates-2015.2.6-73.el7.noarch @base
    Update 2017.2.11-70.1.el7_3.noarch @updates
    Updated dmidecode-1:3.0-2.el7.x86_64 @base
    Update 1:3.0-2.1.el7_3.x86_64 @updates
    Updated grubby-8.28-18.el7.x86_64 @base
    Update 8.28-21.el7_3.x86_64 @updates
    Updated httpd-2.4.6-45.el7.centos.x86_64 @base
    Update 2.4.6-45.el7.centos.4.x86_64 @updates
    Updated httpd-tools-2.4.6-45.el7.centos.x86_64 @base
    Update 2.4.6-45.el7.centos.4.x86_64 @updates
    Updated initscripts-9.49.37-1.el7.x86_64 @base
    Update 9.49.37-1.el7_3.1.x86_64 @updates
    Updated irqbalance-3:1.0.7-6.el7.x86_64 @base
    Update 3:1.0.7-6.el7_3.1.x86_64 @updates
    Updated jenkins-2.54-1.1.noarch @jenkins
    Update 2.55-1.1.noarch @jenkins
    Erase kernel-3.10.0-327.el7.x86_64 @anaconda
    Install kernel-3.10.0-514.16.1.el7.x86_64 @updates
    Updated kernel-headers-3.10.0-514.10.2.el7.x86_64 @updates
    Update 3.10.0-514.16.1.el7.x86_64 @updates
    Updated kernel-tools-3.10.0-514.10.2.el7.x86_64 @updates
    Update 3.10.0-514.16.1.el7.x86_64 @updates
    Updated kernel-tools-libs-3.10.0-514.10.2.el7.x86_64 @updates
    Update 3.10.0-514.16.1.el7.x86_64 @updates
    Updated libblkid-2.23.2-33.el7.x86_64 @base
    Update 2.23.2-33.el7_3.2.x86_64 @updates
    Updated libgudev1-219-30.el7_3.7.x86_64 @updates
    Update 219-30.el7_3.8.x86_64 @updates
    Updated libmount-2.23.2-33.el7.x86_64 @base
    Update 2.23.2-33.el7_3.2.x86_64 @updates
    Updated libuuid-2.23.2-33.el7.x86_64 @base
    Update 2.23.2-33.el7_3.2.x86_64 @updates
    Updated mod_ssl-1:2.4.6-45.el7.centos.x86_64 @base
    Update 1:2.4.6-45.el7.centos.4.x86_64 @updates
    Updated openssh-6.6.1p1-33.el7_3.x86_64 @updates
    Update 6.6.1p1-35.el7_3.x86_64 @updates
    Updated openssh-clients-6.6.1p1-33.el7_3.x86_64 @updates
    Update 6.6.1p1-35.el7_3.x86_64 @updates
    Updated openssh-server-6.6.1p1-33.el7_3.x86_64 @updates
    Update 6.6.1p1-35.el7_3.x86_64 @updates
    Updated python-perf-3.10.0-514.10.2.el7.x86_64 @updates
    Update 3.10.0-514.16.1.el7.x86_64 @updates
    Updated systemd-219-30.el7_3.7.x86_64 @updates
    Update 219-30.el7_3.8.x86_64 @updates
    Updated systemd-libs-219-30.el7_3.7.x86_64 @updates
    Update 219-30.el7_3.8.x86_64 @updates
    Updated systemd-sysv-219-30.el7_3.7.x86_64 @updates
    Update 219-30.el7_3.8.x86_64 @updates
    Updated util-linux-2.23.2-33.el7.x86_64 @base
    Update 2.23.2-33.el7_3.2.x86_64 @updates
Scriptlet output:
   1 warning: /etc/sysconfig/jenkins created as /etc/sysconfig/jenkins.rpmnew
Tagssystemd
abrt_hash
URL
Attached Files

-Relationships
+Relationships

-Notes

~0029099

kabe (reporter)

I cannot reproduce this, unless setting ListenAddress to some bogus address.

- Do you have "Reached target Network is Online" line above the excerpt of
  the journalctl log?
- Is that 192.168.16.55 address statically or dynamically assigned?
  By NetworkManager? or from custom script?
- Do you have setup multiple sshd instances?
- After failing to start, what does "sudo ss -l -p --tcp | grep ssh" say?

~0029100

TrevorH (developer)

Latest version of openssh is openssh-6.6.1p1-35.el7_3.x86_64 and fixes a race condition in openssh startup via systemd. https://rhn.redhat.com/errata/RHBA-2017-0915.html

~0029101

martini5468 (reporter)

@kabe here are the answers:

- I do have "Reached target Network is Online". However looking at the full log I have a few more lines that might be helpful. First there is an error about NetworkManager:
systemd[1]: Dependency failed for Network Manager Wait Online.
systemd[1]: Job NetworkManager-wait-online.service/start failed with result 'dependency'.

Next I see the attempt to start OpenSSH server and the error:
systemd[1]: Starting OpenSSH server daemon...
...
systemd[1]: Failed to start OpenSSH server daemon

Further down I see the Network is Online target and the interface starting:

network[718]: Bringing up interface ens160: [ OK ]
...
systemd[1]: Reached target Network is Online.

Looks like systemd tries to bring up ssh server before the network interface is brought up.

- 192.168.16.55 is statically assigned in script /etc/sysconfig/network-scripts/ifcfg-ens160. NetworkManager is disabled.

- Single instance sshd

- After failing to start there is no sshd listening. The command shows no output.

~0029102

martini5468 (reporter)

@TrevorH thanks for that. I reviewed the link however the problem we have seems to be a bit different. ssh fails to start as it cannot bind the interface which is not brought up until later on in the boot sequence. See my previous note for more details.

~0029103

kabe (reporter)

I still could not reproduce the @martini5468 's failure mode.
(perhaps number of CPUs matter?)

The log "Dependency failed for Network Manager Wait Online." is normal when
NetworkManager is disabled, because it isn't.

Looking at the systemd unit sshd.service, it specifies dependency as
    After=network.target sshd-keygen.service

but by default sshd service comes up in order of

Apr 20 10:11:40 UEFI73.six.ten systemd[1]: Reached target Network.
Apr 20 10:11:40 UEFI73.six.ten systemd[1]: Starting Network.
...
Apr 20 10:11:40 UEFI73.six.ten systemd[1]: Starting OpenSSH server daemon...
Apr 20 10:11:40 UEFI73.six.ten systemd[1]: Reached target Network is Online.
Apr 20 10:11:40 UEFI73.six.ten systemd[1]: Starting Network is Online.

sshd is started before network-online.target,
which could be problematic on concurrent startup which systemd is good about.


So, how about adding After=network-online.target to dependency?
Add a new file /etc/systemd/system/sshd.service.d/online.conf with
====
[Unit]
After=network-online.target
=====

After this, sshd will come up after "Network is Online" target:

Apr 20 10:29:59 UEFI73.six.ten systemd[1]: Reached target Network.
Apr 20 10:29:59 UEFI73.six.ten systemd[1]: Starting Network.
...
Apr 20 10:29:59 UEFI73.six.ten systemd[1]: Reached target Network is Online.
Apr 20 10:29:59 UEFI73.six.ten systemd[1]: Starting Network is Online.
Apr 20 10:29:59 UEFI73.six.ten systemd[1]: Starting OpenSSH server daemon...

I agree that the default network.target dependency should work though.
(Maybe the NetworkManager-wait-online.service is useless without NetworkManager)

~0029143

Tjololo (reporter)

I've experienced this issue as well after an update today. Versions updated:

Updated openssh-6.6.1p1-33.el7_3.x86_64 @updates
Update 6.6.1p1-35.el7_3.x86_64 @updates
Updated openssh-clients-6.6.1p1-33.el7_3.x86_64 @updates
Update 6.6.1p1-35.el7_3.x86_64 @updates
Updated openssh-server-6.6.1p1-33.el7_3.x86_64 @updates
Update 6.6.1p1-35.el7_3.x86_64 @updates

Current version:
# ssh -V
OpenSSH_6.6.1p1, OpenSSL 1.0.1e-fips 11 Feb 2013

Looking at my boot.log, I can see that openSSH does indeed attempt to start before network:

[^[[32m OK ^[[0m] Started Login Service.
[^[[1;31mFAILED^[[0m] Failed to start OpenSSH server daemon.
See 'systemctl status sshd.service' for details.
[^[[32m OK ^[[0m] Started Network Time Service.
         Starting Terminate Plymouth Boot Screen...
[^[[32m OK ^[[0m] Started Command Scheduler.
         Starting Command Scheduler...
         Starting Wait for Plymouth Boot Screen to Quit...
[^[[32m OK ^[[0m] Started IPv4 firewall with iptables.
         Starting LSB: Bring up/down networking...

looking at journalctl, I can see this failure is because the address is not assignable:

Apr 24 13:10:26 <hostname removed> sshd[1315]: error: Bind to port 22 on <IP removed> failed: Cannot assign requested address.

even though the sshd service is configured to start after network.target:

[Unit]
Description=OpenSSH server daemon
Documentation=man:sshd(8) man:sshd_config(5)
After=network.target sshd-keygen.service
Wants=sshd-keygen.service

can confirm that commenting out the ListenAddress directive in /etc/ssh/sshd_config resolves the issue, however it does so by not requiring binding to a port, NOT by making SSH start after the network:

[^[[32m OK ^[[0m] Started OpenSSH server daemon.
[^[[32m OK ^[[0m] Started System Logging Service.
<snip>
[^[[32m OK ^[[0m] Started Dynamic System Tuning Daemon.
[^[[32m OK ^[[0m] Started Login Service.
         Starting LSB: Bring up/down networking...

If I can provide any other information to help diagnose, please let me know.

~0029144

Tjololo (reporter)

I've experienced this issue as well after an update today. Versions updated:

Updated openssh-6.6.1p1-33.el7_3.x86_64 @updates
Update 6.6.1p1-35.el7_3.x86_64 @updates
Updated openssh-clients-6.6.1p1-33.el7_3.x86_64 @updates
Update 6.6.1p1-35.el7_3.x86_64 @updates
Updated openssh-server-6.6.1p1-33.el7_3.x86_64 @updates
Update 6.6.1p1-35.el7_3.x86_64 @updates

Current version:
# ssh -V
OpenSSH_6.6.1p1, OpenSSL 1.0.1e-fips 11 Feb 2013

Looking at my boot.log, I can see that openSSH does indeed attempt to start before network:

[^[[32m OK ^[[0m] Started Login Service.
[^[[1;31mFAILED^[[0m] Failed to start OpenSSH server daemon.
See 'systemctl status sshd.service' for details.
[^[[32m OK ^[[0m] Started Network Time Service.
         Starting Terminate Plymouth Boot Screen...
[^[[32m OK ^[[0m] Started Command Scheduler.
         Starting Command Scheduler...
         Starting Wait for Plymouth Boot Screen to Quit...
[^[[32m OK ^[[0m] Started IPv4 firewall with iptables.
         Starting LSB: Bring up/down networking...

looking at journalctl, I can see this failure is because the address is not assignable:

Apr 24 13:10:26 <hostname removed> sshd[1315]: error: Bind to port 22 on <IP removed> failed: Cannot assign requested address.

even though the sshd service is configured to start after network.target:

[Unit]
Description=OpenSSH server daemon
Documentation=man:sshd(8) man:sshd_config(5)
After=network.target sshd-keygen.service
Wants=sshd-keygen.service

can confirm that commenting out the ListenAddress directive in /etc/ssh/sshd_config resolves the issue, however it does so by not requiring binding to a port, NOT by making SSH start after the network:

[^[[32m OK ^[[0m] Started OpenSSH server daemon.
[^[[32m OK ^[[0m] Started System Logging Service.
<snip>
[^[[32m OK ^[[0m] Started Dynamic System Tuning Daemon.
[^[[32m OK ^[[0m] Started Login Service.
         Starting LSB: Bring up/down networking...

If I can provide any other information to help diagnose, please let me know.

~0029145

Tjololo (reporter)

I apologize for the double comment, I don't know how to delete it.

~0029163

martini5468 (reporter)

@Tjololo - if you wish to keep your ListenAddress directive you can use the systemd unit override as mentioned by @kabe in his last note.

@kabe - do you reckon this is a legitimate problem with systemd and unit ordering or is it down to misconfiguration in sshd_conf? Our sshd conf sets a lot of options that may or may not be needed. I can post a clean version here if it will help replicating the problem?

M

~0029168

Tjololo (reporter)

@martini5468 - I've managed to solve the problem temporarily by disabling the ListenAddress directive and using firewall rules to only allow incoming SSH to the proper interface. This was more to attempt to help with the bug fix; I'm not sure where the bug lies, but something definitely changed between the last update and the current one. If that is the solution, perhaps the changed systemd unit file should come with the update?

~0029170

TrevorH (developer)

I think it more likely that you need to add LINKDELAY= or NETWORKDELAY= into the relevant ifcfg file so that it waits for the link to become available before it tries to bring it up.

~0029173

kabe (reporter)

@martini5468 - I don't think sshd_config is the problem. The systemd ordering is the problem.
It's just that systemd units aren't considering setup without NetwotkManager.
That is, if you disable NetworkManager, now it's on your own to troubleshoot.
I guess there's a lot more systemd units wich will malfunction
without NetworkManager, looking at the bootup sequence of a slow machine.

~0029224

orodbhen (reporter)

Just updated yesterday, and seeing the same issue. There's a suggested workaround here:

https://bugzilla.redhat.com/show_bug.cgi?id=1438749

Supposedly there's an upstream fix as well.

~0029226

martini5468 (reporter)

@orodbhen - thanks for the link. The suggested workaround from there does the trick.
+Notes

-Issue History
Date Modified Username Field Change
2017-04-18 12:42 martini5468 New Issue
2017-04-18 12:42 martini5468 Tag Attached: systemd
2017-04-19 05:21 kabe Note Added: 0029099
2017-04-19 08:13 TrevorH Note Added: 0029100
2017-04-19 08:58 martini5468 Note Added: 0029101
2017-04-19 09:08 martini5468 Note Added: 0029102
2017-04-20 01:36 kabe Note Added: 0029103
2017-04-24 19:25 Tjololo Note Added: 0029143
2017-04-24 19:26 Tjololo Note Added: 0029144
2017-04-24 19:37 Tjololo Note Added: 0029145
2017-04-27 08:30 martini5468 Note Added: 0029163
2017-04-27 12:55 Tjololo Note Added: 0029168
2017-04-27 13:58 TrevorH Note Added: 0029170
2017-04-28 00:21 kabe Note Added: 0029173
2017-05-03 16:18 orodbhen Note Added: 0029224
2017-05-04 08:38 martini5468 Note Added: 0029226
+Issue History