2017-12-16 20:39 UTC

ID: 0013993
Project: CentOS-7
Category: Cloud-Images
View Status: public
Last Update: 2017-11-07 11:07
Reporter: Stuebi
Priority: normal
Severity: block
Reproducibility: random
Status: new
Resolution: open
Platform: VM
OS: CentOS
OS Version: 7.4
Product Version: 7.4.1708
Target Version:
Fixed in Version:
Summary: 0013993: cloud-init wait for waagent on Azure CentOS 7.4 - no sshd start
Description:
Hello,
after updating a CentOS 7.3 VM on Azure to CentOS 7.4, you cannot connect via ssh: cloud-init tries to start waagent, the boot process hangs, and sshd is never started.

We also installed a fresh CentOS 7.4 VM in the Azure cloud to provide a base image template for our company, and the same problem occurs in that VM.

#######
# yum info cloud-init:
Name : cloud-init
Arch : x86_64
Version : 0.7.9
Release : 9.el7.centos.2
Size : 2.1 M
Repo : installed
From repo : base

In CentOS 7.3 the cloud-init version is 0.7.5-10.el7.centos.1
waagent is package version 2.2.14-1.el7 in both CentOS versions, which waagent itself then updates internally to 2.2.17.

#######
I do not know why the system hangs.
Could you please look into this?
Steps To Reproduce:
To debug the failure I had to install rlogin before the update:

yum remove firewalld -y
yum install rsh-server -y
systemctl enable rsh.socket
systemctl enable rlogin.socket
systemctl enable rexec.socket
echo "root:123" | chpasswd
echo "+ root" > ~/.rlogin
cat << EOF >> /etc/securetty
rsh
rexec
rlogin
EOF

reboot

yum update -y
reboot

#######
To unblock the boot process, I connected via rlogin and killed the waagent start:

# ps -ef | grep "waagent\|cloud"
root 993 1 0 14:52 ? 00:00:02 /usr/bin/python /usr/bin/cloud-init init
root 1134 993 0 14:52 ? 00:00:00 /bin/systemctl start waagent.service
root 1337 1222 0 15:56 pts/2 00:00:00 grep --color=auto waagent\|cloud

# kill 1134

Then cloud-init carries on with its run, and on the next reboot sshd starts without any trouble.
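
The manual ps/kill step above can be sketched as a small helper. This is only an illustration, not something from the original report: the function name find_hung_waagent_start is hypothetical, and it assumes the hung process is always a "systemctl start waagent.service" child of a cloud-init process, as in the ps output shown.

```shell
#!/bin/sh
# Hypothetical helper (not part of cloud-init or WALinuxAgent).
# Reads "PID PPID ARGS..." lines on stdin and prints the PID of every
# "systemctl start waagent.service" process whose parent is cloud-init.
find_hung_waagent_start() {
    awk '
        {
            pid = $1; ppid = $2
            $1 = ""; $2 = ""          # leave only the command line in $0
            args[pid] = $0
            parent[pid] = ppid
        }
        END {
            for (p in parent)
                if (args[p] ~ /systemctl start waagent\.service/ &&
                    (parent[p] in args) && args[parent[p]] ~ /cloud-init/)
                    print p
        }
    '
}

# Usage on the affected VM (as root), assuming a procps-style ps:
#   ps -eo pid=,ppid=,args= | find_hung_waagent_start | xargs -r kill
```

Matching on the parent process avoids killing an unrelated "systemctl start waagent.service" that an administrator may have started by hand.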

#######
To make the VM fail again, clear the configuration and reboot:
yum remove cloud-init WALinuxAgent -y
rm -f /etc/waagent.con*
rm -fr /etc/cloud/
rm -fr /var/lib/cloud/
rm -fr /var/lib/waagent/
rm -fr /var/log/waagent.lo*
rm -fr /var/log/cloud-init*
yum install cloud-init WALinuxAgent -y

cp -a /etc/waagent.conf /etc/waagent.conf.rpmsave
sed -i -e "s/Provisioning.Enabled.*/Provisioning.Enabled=n/g" /etc/waagent.conf
sed -i -e "s/Provisioning.UseCloudInit.*/Provisioning.UseCloudInit=y/g" /etc/waagent.conf
sed -i -e "s/Logs.Verbose.*/Logs.Verbose=y/g" /etc/waagent.conf

cp -a /etc/cloud/cloud.cfg /etc/cloud/cloud.cfg.rpmsave
cat << EOF >> /etc/cloud/cloud.cfg

# From cloud-init docs
datasource:
  Azure:
    agent_command: [service, waagent, start]

debug:
  verbose: True

EOF

diff /etc/waagent.conf.rpmsave /etc/waagent.conf
diff /etc/cloud/cloud.cfg.rpmsave /etc/cloud/cloud.cfg

reboot
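
One caveat with the sed commands above: they only rewrite a key that already exists in /etc/waagent.conf, and silently do nothing if the key is missing. A minimal sketch of a replace-or-append variant follows. The helper name set_conf_key is hypothetical, and it assumes plain "Key=value" lines without surrounding whitespace and GNU sed (as shipped with CentOS).

```shell
#!/bin/sh
# Hypothetical helper: set KEY=VALUE in FILE, replacing an existing
# line for KEY or appending one if KEY is not present.
# Note: dots in KEY are treated as regex wildcards, which is harmless
# for keys such as Provisioning.Enabled but is not an exact match.
set_conf_key() {
    file=$1
    key=$2
    value=$3
    if grep -q "^${key}=" "$file"; then
        sed -i -e "s|^${key}=.*|${key}=${value}|" "$file"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Usage, mirroring the settings from the report:
#   set_conf_key /etc/waagent.conf Provisioning.Enabled n
#   set_conf_key /etc/waagent.conf Provisioning.UseCloudInit y
#   set_conf_key /etc/waagent.conf Logs.Verbose y
```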
Tags: No tags attached.
Notes

~0030336

Stuebi (reporter)

Cloud-Init-Bug: https://bugs.launchpad.net/cloud-init/+bug/1720160
WAAgent-Bug: https://github.com/Azure/WALinuxAgent/issues/902

~0030337

toracat (manager)

Maybe related: https://access.redhat.com/solutions/3018621

~0030338

Stuebi (reporter)

Sorry, but

grep -i ordering /var/log/messages

did not print any output.

~0030531

Stuebi (reporter)

Hello,
after trying WALinuxAgent 2.2.18 with the following config, it works for me. cloud-init finds Azure as the datasource, and WALinuxAgent starts and reports the VM to Azure as running. So the ticket can be closed.
/etc/waagent.conf
    Provisioning.Enabled=n

/etc/cloud/cloud.cfg
    -> unchanged

systemctl enable waagent
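
To confirm that a VM matches the working setup described in this note, a check along these lines could be used. It is a sketch only: the function name provisioning_disabled is hypothetical, and it assumes the setting appears as an uncommented "Provisioning.Enabled=n" line.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the given waagent.conf
# (default /etc/waagent.conf) contains an active Provisioning.Enabled=n line.
provisioning_disabled() {
    grep -q '^Provisioning\.Enabled=n[[:space:]]*$' "${1:-/etc/waagent.conf}"
}

# Usage on the VM:
#   provisioning_disabled && systemctl is-enabled waagent
```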

Issue History
Date Modified Username Field Change
2017-10-09 08:53 Stuebi New Issue
2017-10-09 08:53 Stuebi File Added: cloud-init_afterkillwaagentstart_andreboot.tar
2017-10-09 08:53 Stuebi File Added: cloud-init_hangs.tar
2017-10-09 08:53 Stuebi File Added: cloud-init_afterkillwaagentstart.tar
2017-10-09 08:59 Stuebi Note Added: 0030336
2017-10-09 11:49 toracat Note Added: 0030337
2017-10-09 13:12 Stuebi Note Added: 0030338
2017-11-07 11:07 Stuebi Note Added: 0030531