2018-01-23 17:48 UTC

ID: 0013836
Project: CentOS-7
Category: Cloud-Images
View Status: public
Last Update: 2017-12-27 11:02
Reporter: fmbiete
Priority: high
Severity: major
Reproducibility: always
Status: new
Resolution: open
Platform: Amazon Web Services
OS: CentOS
OS Version: 7.4
Summary: 0013836: RHBA-2017:2283 - AWS AMI rebuild with ENA support

Description: We need to rebuild the CentOS 7 AWS AMI to include the latest fixes and improvements from upstream:

https://access.redhat.com/errata/RHBA-2017:2283
https://bugzilla.redhat.com/show_bug.cgi?id=1410047

The changes from 7.3 are not included either.

Notes

~0030373

fmbiete (reporter)

This only requires adding two flags to the AMI build process, at the step that registers the image with AWS:

--sriov-net-support simple --ena-support

No change to the kickstart part is required.

Example:

aws ec2 register-image \
  --name 'CentOS-7 HVM with updates' \
  --description 'fixed networking support' \
  --virtualization-type hvm \
  --root-device-name /dev/xvda1 \
  --block-device-mappings '[{"DeviceName":"/dev/sda1","Ebs":{"SnapshotId":"snap-whatever","VolumeSize":8,"DeleteOnTermination":false,"VolumeType":"gp2"}}]' \
  --architecture x86_64 \
  --sriov-net-support simple \
  --ena-support

~0030775

fmbiete (reporter)

No activity in this ticket, but AMI 1708_11.01 includes those attributes (ENA and SR-IOV support). I suppose someone noticed that the CentOS AMI couldn't run on the new instance types. Is this bugzilla even read by the Cloud/Virt members?

{
    "ProductCodes": [
        {
            "ProductCodeId": "aw0evgkw8e5c1q413zgy5pjce",
            "ProductCodeType": "marketplace"
        }
    ],
    "Description": "CentOS Linux 7 x86_64 HVM EBS 1708_11.01",
    "VirtualizationType": "hvm",
    "Hypervisor": "xen",
    "ImageOwnerAlias": "aws-marketplace",
    "EnaSupport": true,
    "SriovNetSupport": "simple",
    "ImageId": "ami-192a9460",
    "State": "available",
    "BlockDeviceMappings": [
        {
            "DeviceName": "/dev/sda1",
            "Ebs": {
                "Encrypted": false,
                "DeleteOnTermination": false,
                "VolumeType": "standard",
                "VolumeSize": 8,
                "SnapshotId": "snap-013406753fcf8e3df"
            }
        }
    ],
    "Architecture": "x86_64",
    "ImageLocation": "aws-marketplace/CentOS Linux 7 x86_64 HVM EBS 1708_11.01-b7ee8a69-ee97-4a49-9e68-afaee216db2e-ami-95096eef.4",
    "RootDeviceType": "ebs",
    "OwnerId": "679593333241",
    "RootDeviceName": "/dev/sda1",
    "CreationDate": "2017-12-05T14:49:45.000Z",
    "Public": true,
    "ImageType": "machine",
    "Name": "CentOS Linux 7 x86_64 HVM EBS 1708_11.01-b7ee8a69-ee97-4a49-9e68-afaee216db2e-ami-95096eef.4"
}
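
For anyone who wants to check an AMI themselves, here is a minimal sketch that asks only for the two attributes discussed in this ticket. It assumes the AWS CLI is installed and configured with credentials and a region; the function name `ami_net_flags` is made up for illustration.

```shell
# Print the EnaSupport and SriovNetSupport attributes of a single AMI.
# Assumption: the AWS CLI is installed and configured (credentials, region).
# ami_net_flags is a hypothetical helper name, not part of any tool.
ami_net_flags() {
    aws ec2 describe-images \
        --image-ids "$1" \
        --query 'Images[0].[EnaSupport, SriovNetSupport]' \
        --output text
}
```

Calling `ami_net_flags ami-192a9460` in the region where that AMI is published should report the same two values as the output above.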

~0030829

arronax (reporter)

I stumbled upon this issue while investigating ENA on the older AMIs, so I think it could be useful to update it. There are quite a lot of other C5/M5 issues in the bug list, too, but this is the one that pops up on a `centos7 ena` search.


Short summary: some instances started from ami-192a9460 may show connectivity issues. This can be fixed by restarting the instance or by updating the network device names.


As fmbiete writes, ami-192a9460 does include the necessary flags, and the system itself even has everything needed for ENA, but it MAY NOT work properly with newer instance types without some fixes.

Sometimes, if you spin up an m5 instance from ami-192a9460, it will start (no complaints about ENA from AWS) but you won't be able to connect to it. The AWS EC2 console shows "Instance reachability check failed".

I started up a bunch of instances, mostly m5 and one c5, and hit the issue twice: first on the very first instance I started, which is why I looked into it, and again on the fifth.

I looked into this a bit and was able to make ENA work on a "bad" instance started from ami-192a9460, at least according to the checks in the AWS documentation.

Documentation is here: http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/enhanced-networking.html

According to the doc, the system might show connection issues if there's no eth0 device present. I changed the instance type to m4 and confirmed the device was named ens3. I renamed ens3 to eth0 and disabled Predictable Network Interface Names. After that, once I had made sure that cloud-init was happy with the new device name and generated ifcfg-eth0 properly, I changed the instance type back to m5. The instance then started up, I was able to connect to it, and it generally seems to be fine.

I also noticed the following: when an m5 instance starts up properly, the device name always seems to be ens5.

Later I found that restarting a faulty instance may also fix it, and the device name is ens5 after that. This makes me believe that another restart might break it again if the device name changes. However, a few restarts didn't show this behavior, so maybe I'm wrong.
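
To see which name the interface got on a given boot and which driver is bound to it, something like the following sketch may help. It relies on the standard Linux sysfs layout under `/sys/class/net`; the function name `list_net_drivers` is hypothetical, and the optional argument exists only so the function can be pointed at another directory.

```shell
# Print "<interface> <driver>" for every network device backed by a driver.
# Assumption: standard Linux sysfs layout, where
# /sys/class/net/<iface>/device/driver is a symlink to the bound driver.
# list_net_drivers is a hypothetical helper name.
list_net_drivers() {
    net_root="${1:-/sys/class/net}"
    for dev in "$net_root"/*; do
        # Skip virtual devices such as lo, which have no device/driver link.
        [ -e "$dev/device/driver" ] || continue
        printf '%s %s\n' \
            "$(basename "$dev")" \
            "$(basename "$(readlink "$dev/device/driver")")"
    done
}
```

Running `list_net_drivers` after each restart would show whether the name stays at ens5 and whether the driver is `ena`.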


It seems that updating the AMI and setting the default device name to eth0 would be the best way to prevent instances from being faulty in the first place.


The following steps were needed to make a "bad" instance based on ami-192a9460 work properly:
* start the instance as m4 or any other non-ENA type
* add `net.ifnames=0` to the `GRUB_CMDLINE_LINUX` line in `/etc/default/grub` and regenerate the grub config with `grub2-mkconfig -o /boot/grub2/grub.cfg`
* update `/etc/udev/rules.d/70-persistent-net.rules` and rename the device to eth0
* change the instance type to m5 and start it
* check that ena is used as the driver for eth0: `ethtool -i eth0`

Note: I'm not sure disabling predictable interface names is even necessary, but it's in the docs.
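
The GRUB edit from the steps above could be scripted roughly as follows. This is only a sketch: it assumes GNU sed (as shipped with CentOS 7) and a `GRUB_CMDLINE_LINUX` value that is double-quoted on a single line. The function name `add_ifnames_flag` is hypothetical, and the udev rule and the instance type change are not covered.

```shell
# Append net.ifnames=0 to GRUB_CMDLINE_LINUX in the given grub defaults file.
# Assumptions: GNU sed (for -i) and a double-quoted, single-line value.
# add_ifnames_flag is a hypothetical helper name.
add_ifnames_flag() {
    grub_file="${1:-/etc/default/grub}"
    # Do nothing if the flag is already present, so re-running is harmless.
    if grep -q '^GRUB_CMDLINE_LINUX=.*net\.ifnames=0' "$grub_file"; then
        return 0
    fi
    # Insert the flag just before the closing quote of the existing value.
    sed -i 's/^GRUB_CMDLINE_LINUX="\(.*\)"/GRUB_CMDLINE_LINUX="\1 net.ifnames=0"/' "$grub_file"
}
```

Run as root, `add_ifnames_flag && grub2-mkconfig -o /boot/grub2/grub.cfg` would then cover the second step of the list.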

Issue History
Date Modified Username Field Change
2017-09-17 10:45 fmbiete New Issue
2017-10-14 10:45 fmbiete Note Added: 0030373
2017-12-18 08:40 fmbiete Note Added: 0030775
2017-12-27 11:02 arronax Note Added: 0030829