View Issue Details
|ID||Project||Category||View Status||Date Submitted||Last Update|
|0006476||CentOS-5||xen||public||2013-05-27 16:12||2013-10-09 06:50|
|Target Version||Fixed in Version|
|Summary||0006476: Frag is bigger than frame.|
|Description||A recent Xen kernel update causes dom0 to emit "Frag is bigger than frame." and then disconnect the network from the domU. This appears to be caused by TCP offload generating frames that are too big. There is a thread at:|
dom0 emits logs like:
vif14.0: Frag is bigger than frame.
vif14.0: fatal error; disabling device
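To see which guests have been hit, the dom0 log can be filtered for the message above. This is only a rough sketch: the function name affected_vifs is made up for illustration, and it assumes the message text is exactly as quoted in this report. By Xen's naming convention, the number after "vif" is the domain ID and the number after the dot is the device index.

```shell
# Hypothetical helper: read kernel log lines on stdin and print each vif
# that logged "Frag is bigger than frame." once.
# Works on raw kernel lines ("vif14.0: ...") and syslog-prefixed lines.
affected_vifs() {
    sed -n 's/.*\(vif[0-9][0-9]*\.[0-9][0-9]*\): Frag is bigger than frame.*/\1/p' |
        LC_ALL=C sort -u
}
```

For example, `dmesg | affected_vifs` on dom0, or feed it the syslog file where kernel messages end up on your system.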
I think I have seen some domU messages, but the last instance of this didn't produce any logs.
A reboot of the domU will hang during the ifdown phase (maybe trying to flush buffers?). Running xm network-detach from dom0 allows a clean shutdown; you can even attach a new vif via xm network-attach.
Supposedly, disabling offload functions will fix this. I have used:
ethtool -K eth0 tx off tso off sg off
which might be overkill, but it seems to have fixed the issue.
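Settings changed with ethtool -K do not survive a reboot, so the workaround has to be reapplied at boot. A minimal sketch for applying it to several interfaces follows. The function name disable_offload and the RUN variable are made up for illustration; by default it only prints the commands so they can be reviewed first.

```shell
# Hypothetical helper: print the workaround command for each interface
# named on the command line; set RUN=1 to actually execute ethtool.
disable_offload() {
    for ifname in "$@"; do
        if [ "${RUN:-0}" = 1 ]; then
            ethtool -K "$ifname" tx off tso off sg off
        else
            echo "ethtool -K $ifname tx off tso off sg off"
        fi
    done
}
```

Running `disable_offload eth0` shows the command; `RUN=1 disable_offload eth0` applies it. Afterwards, `ethtool -k eth0` (lowercase k) lists the current offload settings so you can confirm the change took effect.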
I have not yet been able to come up with a test case that triggers this. I have only seen it on 3 out of 20 VMs.
dom0 and domU are running 2.6.18-348.6.1.el5xen, 64-bit. Everything is up to date.
I don't have a RHEL system to test on, but I would guess this is an upstream issue.
|Steps To Reproduce||I haven't been able to find a specific trigger. I've tried writing code to generate large packets, with SO_SNDBUF set large as well. So far, just normal web server load causes the problem about once a month.|
|Tags||No tags attached.|
|I'm also experiencing this. There is a patch set here: http://lists.xen.org/archives/html/xen-devel/2013-04/msg02118.html|
We have seen this problem as well.
Jun 3 15:50:37 box38 kernel: vif1.0: Frag is bigger than frame.
Jun 3 15:50:37 box38 kernel: vif1.0: fatal error; disabling device
(Running a custom-built kernel on the domU.)
I have run into the same issues.
Does anyone know what the timeline is to fix this problem without using a custom patch or hack?
I have run into the same issue running
Debian Wheezy 3.2.0-4-amd6
stock 2.6.32-358.18.1.el6.x86_64 kernel
Should I create a different bug report? (since this one is CentOS 5)