View Issue Details

ID: 0006733
Project: CentOS-6
Category: qemu-kvm
View Status: public
Last Update: 2013-11-12 08:44
Reporter: vali_dragnuta
Priority: high
Severity: major
Reproducibility: sometimes
Status: new
Resolution: open
Platform: Centos
OS: Centos
OS Version: 6.4
Product Version: 6.4
Target Version:
Fixed in Version:
Summary: 0006733: qcow2 images can become corrupted if host crashes, leading to full data loss/unavailability
Description: After a power failure, qcow2 disk images that were in use at the time can become logically corrupt, so that they appear unused (full of zeroes).
The data seems to still be in the file, but at the moment I cannot find any reliable method to recover it. Had it been a raw image, a recovery path would be available, but a qcow2 image presents only zeroes once it gets corrupted. My understanding is that the blockmap of the image gets reset and the image is then assumed to be unused.
My detailed setup:

Kernel 2.6.32-358.18.1.el6.x86_64
qemu-kvm-0.12.1.2-2.355.0.1.el6.centos.7.x86_64
Used via libvirt libvirt-0.10.2-18.el6_4.14.x86_64
The image was used from an NFS share (the NFS server did NOT crash and remained active throughout).

qemu-img check finds no corruption.
qemu-img convert completes, but the resulting raw image is full of zeroes. However, there is data in the qcow2 file, and the storage backend was neither restarted nor deactivated during the incident.
I encountered this issue on two different machines; in both cases I was not able to recover the data.
The image was qcow2, thin provisioned, created like this:
 qemu-img create -f qcow2 -o cluster_size=2M imagename.img

While fixing the root cause so that this does not happen again would be ideal, a temporary workaround that can be run on an affected qcow2 image to "patch" it and recover the data (possibly after a full fsck/recovery inside the guest) would also be good. Otherwise we are basically losing data on a large scale when using qcow2.
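
One way to check the hypothesis that the blockmap was reset, before attempting any repair, is to read the image header and count how many L1 table entries still point at an L2 table. The following is a minimal read-only sketch, not a recovery tool. It assumes the header field layout from the published qcow2 format specification (first 72 bytes, big-endian) and classic 8-byte L2 entries. The names `inspect_l1`, `HEADER_FMT` and `L1_OFFSET_MASK` are made up for this example and do not come from qemu.

```python
import struct

QCOW2_MAGIC = b"QFI\xfb"
# First 72 bytes of the header, shared by qcow2 versions 2 and 3 (big-endian).
HEADER_FMT = ">4sIQIIQIIQQIIQ"
# Bits 9-55 of an L1 entry hold the offset of the L2 table it points to.
L1_OFFSET_MASK = 0x00FFFFFFFFFFFE00


def inspect_l1(fileobj):
    """Summarise the L1 table of a qcow2 image opened in binary mode."""
    header_len = struct.calcsize(HEADER_FMT)
    fileobj.seek(0)
    raw = fileobj.read(header_len)
    if len(raw) < header_len:
        raise ValueError("file is too short to hold a qcow2 header")
    (magic, version, _backing_off, _backing_len, cluster_bits, virtual_size,
     _crypt, l1_size, l1_offset, _rc_off, _rc_clusters, _snap_count,
     _snap_off) = struct.unpack(HEADER_FMT, raw)
    if magic != QCOW2_MAGIC:
        raise ValueError("not a qcow2 image (bad magic)")

    # The L1 table is an array of l1_size big-endian 64-bit entries.
    fileobj.seek(l1_offset)
    l1_raw = fileobj.read(8 * l1_size)
    if len(l1_raw) < 8 * l1_size:
        raise ValueError("L1 table is truncated")
    entries = struct.unpack(">%dQ" % l1_size, l1_raw)
    # An entry whose offset bits are all zero has no L2 table behind it.
    allocated = sum(1 for entry in entries if entry & L1_OFFSET_MASK)

    cluster_size = 1 << cluster_bits
    # Assumption: one cluster per L2 table, 8 bytes per L2 entry.
    bytes_per_l1_entry = (cluster_size // 8) * cluster_size
    return {
        "version": version,
        "cluster_size": cluster_size,
        "virtual_size": virtual_size,
        "l1_entries": l1_size,
        "l1_allocated": allocated,
        "bytes_per_l1_entry": bytes_per_l1_entry,
    }
```

Calling `inspect_l1(open(path, "rb"))` on a copy of an affected image returns the counts. An `l1_allocated` of 0 on a file that still occupies a lot of space would be consistent with a reset blockmap, although it does not prove it. Under the stated assumptions, a 2M cluster size means each L1 entry covers 512 GiB of guest data, so very few entries need to be lost for the whole image to read as zeroes.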

 
Steps To Reproduce: Unfortunately I cannot reproduce this on demand, but it seems to happen quite frequently; this is the second time in one month.
Additional Information:
- I can, and want to, also file a bug report upstream. Please suggest where I should file it. Is it a kernel bug? A qcow2 format weakness? What would be the proper channel to ask someone to at least look at sample images and try to write an image patcher that restores the old blockmap?
- I can provide at least one image showing the described symptoms, maybe two.
- As this issue leads to data loss with no apparent way to recover, I am marking it with severity "major" and priority "high". If you feel this is inappropriate, please change it and help me direct this issue to the relevant upstream.

Tags: No tags attached.

Activities

wolfy

2013-11-12 08:44

developer   ~0018333

upstream bug https://bugzilla.redhat.com/show_bug.cgi?id=1029344

Issue History

Date Modified Username Field Change
2013-11-11 19:29 vali_dragnuta New Issue
2013-11-12 08:44 wolfy Note Added: 0018333