2017-07-25 14:43 UTC

ID: 0012797
Project: CentOS-7
Category: kernel
View Status: public
Last Update: 2017-05-19 14:12
Reporter: fc7
Priority: high
Severity: major
Reproducibility: random
Status: new
Resolution: open
Product Version: 7.3.1611
Target Version:
Fixed in Version:
Summary: 0012797: Filesystem shutdown while creating backups on HyperV
Description: Since upgrading the system from 7.2 to 7.3, the XFS filesystem shuts down while the host creates backups of the VM, either at the start or (usually) at the end of snapshot creation. This happens at random, not on every backup, and it always affects the same volume, which hosts a PostgreSQL DB.
Errors found:



To recover the filesystem it's enough to unmount it and mount it back. I executed xfs_repair to check the filesystem when this problem happens but it never detects any errors.

I found the following Ubuntu bug that matches this issue: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1456985


The problem started around 23/01/2017 when upgrading from 7.2 to 7.3 with kernel 3.10.0-514.2.2.el7.x86_64
I tried to update the kernel to 3.10.0-514.6.1.el7.x86_64 but the problem remains the same.
However I cannot reproduce the issue with kernel 3.10.0-327.28.2.el7.x86_64.
Steps To Reproduce:
1- Create a CentOS 7.3 VM on Hyper-V.
2- Schedule Hyper-V backups on the host.
3- Let the backup run.
4- Sometimes, when the backup finishes and the snapshot of the VM is merged, the filesystem is shut down due to errors.
Tags: hyper-v, kernel, xfs
Notes

~0028550

fc7 (reporter)

Errors from the logs:

Feb 8 01:30:01 server02 systemd: Stopping user-0.slice.
Feb 8 01:30:35 server02 journal: Hyper-V VSS: VSS: op=FREEZE: succeeded
Feb 8 01:30:35 server02 kernel: sd 0:0:0:0: [storvsc] Sense Key : Unit Attention [current]
Feb 8 01:30:35 server02 kernel: sd 0:0:0:0: [storvsc] Add. Sense: Changed operating definition
Feb 8 01:30:35 server02 kernel: sd 0:0:0:0: Warning! Received an indication that the operating parameters on this target have changed. The Linux SCSI layer does not automa
Feb 8 01:30:35 server02 journal: Hyper-V VSS: VSS: op=THAW: succeeded
Feb 8 01:30:35 server02 systemd: Time has been changed
Feb 8 01:30:37 server02 kernel: sd 0:0:0:1: [storvsc] Sense Key : Unit Attention [current]
Feb 8 01:30:37 server02 kernel: sd 0:0:0:1: [storvsc] Add. Sense: Changed operating definition
Feb 8 01:30:37 server02 kernel: sd 0:0:0:1: Warning! Received an indication that the operating parameters on this target have changed. The Linux SCSI layer does not automa
Feb 8 01:31:00 server02 kernel: sd 0:0:0:2: [storvsc] Sense Key : Unit Attention [current]
Feb 8 01:31:00 server02 kernel: sd 0:0:0:2: [storvsc] Add. Sense: Changed operating definition
Feb 8 01:31:00 server02 kernel: sd 0:0:0:2: Warning! Received an indication that the operating parameters on this target have changed. The Linux SCSI layer does not automa
Feb 8 01:40:01 server02 systemd: Created slice user-0.slice.
Feb 8 01:40:01 server02 systemd: Starting user-0.slice.
Feb 8 02:01:01 server02 systemd: Removed slice user-0.slice.
Feb 8 02:01:01 server02 systemd: Stopping user-0.slice.
Feb 8 02:02:56 server02 kernel: sd 0:0:0:1: [storvsc] Sense Key : Unit Attention [current]
Feb 8 02:02:56 server02 kernel: sd 0:0:0:1: [storvsc] Add. Sense: Changed operating definition
Feb 8 02:02:56 server02 kernel: sd 0:0:0:1: Warning! Received an indication that the operating parameters on this target have changed. The Linux SCSI layer does not automa
Feb 8 02:02:56 server02 kernel: blk_update_request: I/O error, dev sdb, sector 62972840
Feb 8 02:02:56 server02 kernel: XFS (sdb1): metadata I/O error: block 0x3c0dba8 ("xlog_iodone") error 5 numblks 64
Feb 8 02:02:56 server02 kernel: XFS (sdb1): xfs_do_force_shutdown(0x2) called from line 1203 of file fs/xfs/xfs_log.c. Return address = 0xffffffffa015fb20
Feb 8 02:02:56 server02 kernel: XFS (sdb1): Log I/O Error Detected. Shutting down filesystem
Feb 8 02:02:56 server02 kernel: XFS (sdb1): Please umount the filesystem and rectify the problem(s)
Feb 8 02:02:56 server02 kernel: sd 0:0:0:0: [storvsc] Sense Key : Unit Attention [current]
Feb 8 02:02:56 server02 kernel: sd 0:0:0:0: [storvsc] Add. Sense: Changed operating definition
Feb 8 02:02:56 server02 kernel: sd 0:0:0:0: Warning! Received an indication that the operating parameters on this target have changed. The Linux SCSI layer does not automa
Feb 8 02:03:10 server02 kernel: XFS (sdb1): xfs_log_force: error -5 returned.
Feb 8 02:03:40 server02 kernel: XFS (sdb1): xfs_log_force: error -5 returned.
Feb 8 02:04:10 server02 kernel: XFS (sdb1): xfs_log_force: error -5 returned.
Feb 8 02:04:18 server02 kernel: sd 0:0:0:2: [storvsc] Sense Key : Unit Attention [current]
Feb 8 02:04:18 server02 kernel: sd 0:0:0:2: [storvsc] Add. Sense: Changed operating definition
Feb 8 02:04:18 server02 kernel: sd 0:0:0:2: Warning! Received an indication that the operating parameters on this target have changed. The Linux SCSI layer does not automa
Feb 8 02:04:40 server02 kernel: XFS (sdb1): xfs_log_force: error -5 returned.
Feb 8 02:05:10 server02 kernel: XFS (sdb1): xfs_log_force: error -5 returned.
Feb 8 02:05:40 server02 kernel: XFS (sdb1): xfs_log_force: error -5 returned.
Feb 8 02:06:10 server02 kernel: XFS (sdb1): xfs_log_force: error -5 returned.
Feb 8 02:06:40 server02 kernel: XFS (sdb1): xfs_log_force: error -5 returned.
Feb 8 02:07:10 server02 kernel: XFS (sdb1): xfs_log_force: error -5 returned.
Feb 8 02:07:40 server02 kernel: XFS (sdb1): xfs_log_force: error -5 returned.
Feb 8 02:08:10 server02 kernel: XFS (sdb1): xfs_log_force: error -5 returned.
Feb 8 02:08:41 server02 kernel: XFS (sdb1): xfs_log_force: error -5 returned.
Feb 8 02:09:11 server02 kernel: XFS (sdb1): xfs_log_force: error -5 returned.
Feb 8 02:09:41 server02 kernel: XFS (sdb1): xfs_log_force: error -5 returned.
Feb 8 02:10:01 server02 systemd: Created slice user-1001.slice.
Feb 8 02:10:01 server02 systemd: Starting user-1001.slice.
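
When checking many guests after a backup window, the shutdown can be spotted by scanning the syslog for the "Log I/O Error Detected" line shown above. The following is a minimal sketch, not part of the original report; the function name find_xfs_shutdowns and the regular expression are illustrative assumptions based only on the line format quoted in this note.

```python
import re

# Assumed line format, taken from the log excerpt in this note, e.g.
# "Feb 8 02:02:56 server02 kernel: XFS (sdb1): Log I/O Error Detected. Shutting down filesystem"
SHUTDOWN_RE = re.compile(
    r"^(?P<when>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"XFS \((?P<dev>[^)]+)\): Log I/O Error Detected\. Shutting down filesystem"
)


def find_xfs_shutdowns(lines):
    """Return (timestamp, device) pairs for every XFS log-I/O shutdown line."""
    events = []
    for line in lines:
        match = SHUTDOWN_RE.match(line)
        if match:
            events.append((match.group("when"), match.group("dev")))
    return events


# Three lines copied from the excerpt above; only the second is a shutdown.
sample = [
    "Feb 8 02:02:56 server02 kernel: blk_update_request: I/O error, dev sdb, sector 62972840",
    "Feb 8 02:02:56 server02 kernel: XFS (sdb1): Log I/O Error Detected. Shutting down filesystem",
    "Feb 8 02:03:10 server02 kernel: XFS (sdb1): xfs_log_force: error -5 returned.",
]
print(find_xfs_shutdowns(sample))
```

The repeated "xfs_log_force: error -5 returned." lines are deliberately ignored so that each shutdown is counted once.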

~0028581

austinb (reporter)

I wanted to add to this that we are having this same issue during checkpoint merging but it does not happen all of the time. We currently run nightly backups of several Hyper-V VMs across a few physical hosts where a checkpoint is created and this is when the VMs sometimes show the issue. We can also cause the issue sometimes by just creating a new checkpoint and merging the checkpoint back. These VMs are all running CentOS 7.3 on Microsoft 2012 R2 hosts. We tried updating the LIS software from Microsoft to 4.1.3 but it appeared to make the issue worse so we removed it and went back to the CentOS Hyper-V packages.

# uname -a
# Linux hostname 3.10.0-514.6.1.el7.x86_64 #1 SMP Wed Jan 18 13:06:36 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Log from one server during the backup window; we have several other hosts doing the same thing. The only way to fix the problem is to reboot the Linux VM.
Feb 13 20:42:32 hostname kernel: sd 2:0:0:0: [storvsc] Sense Key : Unit Attention [current]
Feb 13 20:42:32 hostname kernel: sd 2:0:0:0: [storvsc] Add. Sense: Changed operating definition
Feb 13 20:42:32 hostname kernel: sd 2:0:0:0: Warning! Received an indication that the operating parameters on this target have changed. The Linux SCSI layer does not automa
Feb 13 20:42:36 hostname kernel: sd 3:0:0:0: [storvsc] Sense Key : Unit Attention [current]
Feb 13 20:42:36 hostname kernel: sd 3:0:0:0: [storvsc] Add. Sense: Changed operating definition
Feb 13 20:42:36 hostname kernel: sd 3:0:0:0: Warning! Received an indication that the operating parameters on this target have changed. The Linux SCSI layer does not automa
Feb 13 20:42:46 hostname kernel: sd 3:0:0:1: [storvsc] Sense Key : Unit Attention [current]
Feb 13 20:42:46 hostname kernel: sd 3:0:0:1: [storvsc] Add. Sense: Changed operating definition
Feb 13 20:42:46 hostname kernel: sd 3:0:0:1: Warning! Received an indication that the operating parameters on this target have changed. The Linux SCSI layer does not automa
Feb 13 20:42:46 hostname kernel: blk_update_request: I/O error, dev sdc, sector 2148226632
Feb 13 20:42:46 hostname kernel: XFS (sdc1): metadata I/O error: block 0x800b4e48 ("xlog_iodone") error 5 numblks 64
Feb 13 20:42:46 hostname kernel: XFS (sdc1): xfs_do_force_shutdown(0x2) called from line 1203 of file fs/xfs/xfs_log.c. Return address = 0xffffffffa0201b20
Feb 13 20:42:46 hostname kernel: XFS (sdc1): Log I/O Error Detected. Shutting down filesystem
Feb 13 20:42:46 hostname kernel: XFS (sdc1): Please umount the filesystem and rectify the problem(s)
Feb 13 20:42:46 hostname kernel: XFS (sdc1): xfs_imap_to_bp: xfs_trans_read_buf() returned error -5.
Feb 13 20:42:46 hostname kernel: XFS (sdc1): xfs_qm_dquot_logitem_push: push error -5 on dqp ffff8802f2a2e130
Feb 13 20:43:16 hostname kernel: XFS (sdc1): xfs_log_force: error -5 returned.
Feb 13 20:43:46 hostname kernel: XFS (sdc1): xfs_log_force: error -5 returned.
Feb 13 20:44:16 hostname kernel: XFS (sdc1): xfs_log_force: error -5 returned.
Feb 13 20:44:46 hostname kernel: XFS (sdc1): xfs_log_force: error -5 returned.

~0028624

marcio (reporter)

Same problem since kernel 3.10.0-514.X
Host: Microsoft 2012 R2
Guest: CentOS 7.3 (3.10.0-514.6.1.el7.x86_64)
LIS: 4.1.3

on boot:
Feb 20 16:51:53 siaiap26 kernel: sd 2:0:0:0: [sda] 266338304 512-byte logical blocks: (136 GB/127 GiB)
Feb 20 16:51:53 siaiap26 kernel: sd 3:0:0:0: [sdb] 1048576000 512-byte logical blocks: (536 GB/500 GiB)
Feb 20 16:51:53 siaiap26 kernel: sd 3:0:0:0: [sdb] 4096-byte physical blocks
Feb 20 16:51:53 siaiap26 kernel: sd 2:0:0:0: [sda] Write Protect is off
Feb 20 16:51:53 siaiap26 kernel: sd 3:0:0:0: [sdb] Write Protect is off
Feb 20 16:51:53 siaiap26 kernel: sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Feb 20 16:51:53 siaiap26 kernel: sd 2:0:0:0: [storvsc] Sense Key : Illegal Request [current]
Feb 20 16:51:53 siaiap26 kernel: sd 2:0:0:0: [storvsc] Add. Sense: Invalid command operation code
Feb 20 16:51:53 siaiap26 kernel: sd 3:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Feb 20 16:51:53 siaiap26 kernel: sd 3:0:0:0: [storvsc] Sense Key : Illegal Request [current]
Feb 20 16:51:53 siaiap26 kernel: sd 3:0:0:0: [storvsc] Add. Sense: Invalid command operation code

on error:
Feb 20 16:56:25 siaiap26 kernel: sd 2:0:0:0: [storvsc] Sense Key : Unit Attention [current]
Feb 20 16:56:25 siaiap26 kernel: sd 2:0:0:0: [storvsc] Add. Sense: Changed operating definition
Feb 20 16:56:25 siaiap26 kernel: sd 2:0:0:0: Warning! Received an indication that the operating parameters on this target have changed. The Linux SCSI layer does not automa
Feb 20 16:56:25 siaiap26 kernel: blk_update_request: I/O error, dev sda, sector 145024042
Feb 20 16:56:25 siaiap26 kernel: XFS (sda5): metadata I/O error: block 0x73bbc2a ("xlog_iodone") error 5 numblks 64
Feb 20 16:56:25 siaiap26 kernel: XFS (sda5): xfs_do_force_shutdown(0x2) called from line 1203 of file fs/xfs/xfs_log.c. Return address = 0xffffffffa023bb20
Feb 20 16:56:25 siaiap26 kernel: XFS (sda5): Log I/O Error Detected. Shutting down filesystem
Feb 20 16:56:25 siaiap26 kernel: XFS (sda5): Please umount the filesystem and rectify the problem(s)
Feb 20 16:56:25 siaiap26 kernel: XFS (sda5): xfs_log_force: error -5 returned.
Feb 20 16:56:55 siaiap26 kernel: XFS (sda5): xfs_log_force: error -5 returned.

~0028685

marcio (reporter)

Same problem on new kernel 3.10.0-514.6.2.el7.x86_64

~0028788

marcio (reporter)

Same problem on new kernel 3.10.0-514.10.2.el7.x86_64

~0028889

centosviper (reporter)

Same behavior as described above. The kernel updates from the last 2 months do not resolve the issue.

Replacing hyperv-daemons with Microsoft Linux Integration Services and experimenting with different versions had mixed results.

Some versions had a 100% failure rate (filesystem shutdown with every backup); others exhibit the same random behavior as the built-in hyperv-daemons.

~0028981

rikerben (reporter)

Unfortunately I have the same problem on new kernel 3.10.0-514.10.2.el7.x86_64.

~0028986

centosviper (reporter)

Rebooting the affected systems just before VM export results in less frequent XFS shutdowns.

~0029042

Lemahasta (reporter)

I've run into the same error with Hyper-V 2016 (Windows Server 2016 Standard with the Hyper-V role). All CentOS 7.3 VMs are affected, all with built-in LIS. I tried various 3.10.0-514 kernel versions. The error is random: it is tied to checkpoint create/merge and does not happen every single time, though it happens more often than not. After the nightly backup usually at least 2 out of 6 VMs are affected. The error also appeared once after Hyper-V Replica was started, as a checkpoint is made at that point too.

~0029164

ASPIRIN (reporter)

Same error on Windows Server 2016 Datacenter. Gen2 VM, version 8.0, CentOS 7.3, kernel 3.10.0-514.10.2.el7.x86_64. The VM is clustered, so disk backup from the host is not a solution.
The VM has hyperv-daemons installed, with Docker on board.
Backup by DPM 2016 begins at 1:00 AM. The VM is backed up by DPM's new RTC feature.

Apr 27 01:06:38 qa1 journal: Hyper-V VSS: VSS: op=FREEZE: succeeded
Apr 27 01:06:38 qa1 systemd: Time has been changed
Apr 27 01:06:38 qa1 kernel: sd 0:0:0:0: [storvsc] Sense Key : Unit Attention [current]
Apr 27 01:06:38 qa1 kernel: sd 0:0:0:0: [storvsc] Add. Sense: Changed operating definition
Apr 27 01:06:38 qa1 kernel: sd 0:0:0:0: Warning! Received an indication that the operating parameters on this target have changed. The Linux SCSI layer does not automa
Apr 27 01:06:38 qa1 journal: Hyper-V VSS: VSS: op=THAW: succeeded
Apr 27 01:06:43 qa1 kernel: sd 0:0:0:1: [storvsc] Sense Key : Unit Attention [current]
Apr 27 01:06:43 qa1 kernel: sd 0:0:0:1: [storvsc] Add. Sense: Changed operating definition
Apr 27 01:06:43 qa1 kernel: sd 0:0:0:1: Warning! Received an indication that the operating parameters on this target have changed. The Linux SCSI layer does not automa
Apr 27 01:06:47 qa1 kernel: sd 0:0:0:2: [storvsc] Sense Key : Unit Attention [current]
Apr 27 01:06:47 qa1 kernel: sd 0:0:0:2: [storvsc] Add. Sense: Changed operating definition
Apr 27 01:06:47 qa1 kernel: sd 0:0:0:2: Warning! Received an indication that the operating parameters on this target have changed. The Linux SCSI layer does not automa
Apr 27 03:18:51 qa1 kernel: sd 0:0:0:1: [storvsc] Sense Key : Unit Attention [current]
Apr 27 03:18:51 qa1 kernel: sd 0:0:0:1: [storvsc] Add. Sense: Changed operating definition
Apr 27 03:18:51 qa1 kernel: sd 0:0:0:1: Warning! Received an indication that the operating parameters on this target have changed. The Linux SCSI layer does not automa
Apr 27 03:18:56 qa1 kernel: sd 0:0:0:0: [storvsc] Sense Key : Unit Attention [current]
Apr 27 03:18:56 qa1 kernel: sd 0:0:0:0: [storvsc] Add. Sense: Changed operating definition
Apr 27 03:18:56 qa1 kernel: sd 0:0:0:0: Warning! Received an indication that the operating parameters on this target have changed. The Linux SCSI layer does not automa
Apr 27 03:18:59 qa1 kernel: sd 0:0:0:2: [storvsc] Sense Key : Unit Attention [current]
Apr 27 03:18:59 qa1 kernel: sd 0:0:0:2: [storvsc] Add. Sense: Changed operating definition
Apr 27 03:18:59 qa1 kernel: sd 0:0:0:2: Warning! Received an indication that the operating parameters on this target have changed. The Linux SCSI layer does not automa
Apr 27 03:18:59 qa1 kernel: blk_update_request: I/O error, dev sdc, sector 104939120
Apr 27 03:18:59 qa1 kernel: XFS (sdc1): metadata I/O error: block 0x6413670 ("xlog_iodone") error 5 numblks 64
Apr 27 03:18:59 qa1 kernel: XFS (sdc1): xfs_do_force_shutdown(0x2) called from line 1203 of file fs/xfs/xfs_log.c. Return address = 0xffffffffa0142b20
Apr 27 03:18:59 qa1 kernel: XFS (sdc1): Log I/O Error Detected. Shutting down filesystem
Apr 27 03:18:59 qa1 kernel: XFS (sdc1): Please umount the filesystem and rectify the problem(s)
Apr 27 03:18:59 qa1 kernel: XFS (sdc1): xfs_imap_to_bp: xfs_trans_read_buf() returned error -5.
Apr 27 03:18:59 qa1 kernel: XFS (sdc1): xfs_log_force: error -5 returned.
Apr 27 03:19:29 qa1 kernel: XFS (sdc1): xfs_log_force: error -5 returned.
Apr 27 03:19:59 qa1 kernel: XFS (sdc1): xfs_log_force: error -5 returned.

~0029199

Lemahasta (reporter)

I updated to kernel 514.16.1 and couldn't reproduce the error. I updated 5 VMs, all under the same Windows 2016 hypervisor (Standard edition): one Gen2, the rest Gen1.

Windows Server Backup didn't cause the XFS error on any of the VMs. I also did some random checkpoint create/delete actions on the machines that usually experienced the issue more often than others.

I won't say it's fixed, as it was random error before and maybe I'm just lucky? Can anyone confirm/deny?

All testing done on:

windows 2016 standard with hyper-v role installed
VM's centos 7.3 kernel 514.16.1
built-in LIS
most VM's as gen1, one gen2.

~0029259

centosviper (reporter)

Same here, no crashes for the last 5 days.

CentOS 7.3.1611 (Core) / 3.10.0-514.16.1.el7.x86_64 kernel
Gen2 VM on Windows Server 2012 R2 Hyper-V (Standard)

~0029271

gilsbert (reporter)

Hi. Same environment as @centosviper.

CentOS 7.3.1611 (Core) / 3.10.0-514.16.1.el7.x86_64 kernel
Gen2 VM on Windows Server 2012 R2 Hyper-V (Standard)

Everything is working after testing more than 20 times in a row.

Log looks like this:
May 12 15:10:23 radius-ccuec journal: Hyper-V VSS: VSS: op=FREEZE: succeeded
May 12 15:10:23 radius-ccuec journal: Hyper-V VSS: VSS: op=THAW: succeeded
May 12 15:10:23 radius-ccuec kernel: sd 0:0:0:0: [storvsc] Sense Key : Unit Attention [current]
May 12 15:10:23 radius-ccuec kernel: sd 0:0:0:0: [storvsc] Add. Sense: Changed operating definition
May 12 15:10:23 radius-ccuec kernel: sd 0:0:0:0: Warning! Received an indication that the operating parameters on this target have changed. The Linux SCSI layer does not automa

So we still have a warning but it is the same on Ubuntu 17.04!

~0029273

Lemahasta (reporter)

After 10 days of normal nightly backups no xfs errors on any of the machines, so looks pretty much fixed from my point of view.

~0029301

ASPIRIN (reporter)

Same for my env.
After update to kernel 3.10.0-514.16.1.el7.x86_64
Having no problems with backup for 7 days.
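
For anyone auditing a fleet of guests, the kernel releases named in this thread can be compared mechanically. This is a sketch under stated assumptions: the two boundary releases below are simply the earliest kernel reported as failing and the first one reported as working in the notes above, not an authoritative list of which builds carry a fix, and the names release_tuple and in_reported_range are hypothetical.

```python
import re

# Boundaries taken only from the reports in this thread (assumption, not an errata list).
FIRST_REPORTED_BAD = (3, 10, 0, 514, 2, 2)     # 3.10.0-514.2.2.el7
FIRST_REPORTED_GOOD = (3, 10, 0, 514, 16, 1)   # 3.10.0-514.16.1.el7


def release_tuple(release):
    """Turn '3.10.0-514.16.1.el7.x86_64' into (3, 10, 0, 514, 16, 1)."""
    match = re.match(r"^(\d+)\.(\d+)\.(\d+)-(\d+(?:\.\d+)*)\.el7", release)
    if not match:
        raise ValueError("unrecognised el7 kernel release: %r" % release)
    base = tuple(int(part) for part in match.group(1, 2, 3))
    build = tuple(int(part) for part in match.group(4).split("."))
    return base + build


def in_reported_range(release):
    """True if the release falls between the first bad and first good reports."""
    return FIRST_REPORTED_BAD <= release_tuple(release) < FIRST_REPORTED_GOOD


# The string would normally come from `uname -r` on the guest.
for release in ("3.10.0-327.28.2.el7.x86_64",
                "3.10.0-514.6.1.el7.x86_64",
                "3.10.0-514.16.1.el7.x86_64"):
    print(release, in_reported_range(release))
```

Tuple comparison keeps the ordering numeric, so 514.10.2 sorts after 514.6.2, which a plain string comparison would get wrong.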

Issue History
Date Modified Username Field Change
2017-02-09 10:00 fc7 New Issue
2017-02-09 10:00 fc7 Tag Attached: hyper-v
2017-02-09 10:08 fc7 Note Added: 0028550
2017-02-09 10:09 fc7 Tag Attached: xfs
2017-02-09 20:32 fc7 Tag Attached: kernel
2017-02-15 16:16 austinb Note Added: 0028581
2017-02-21 14:26 marcio Note Added: 0028624
2017-02-24 15:01 marcio Note Added: 0028685
2017-03-07 20:27 marcio Note Added: 0028788
2017-03-20 13:31 centosviper Note Added: 0028889
2017-03-30 21:32 rikerben Note Added: 0028981
2017-03-31 05:51 centosviper Note Added: 0028986
2017-04-08 18:30 Lemahasta Note Added: 0029042
2017-04-27 08:44 ASPIRIN Note Added: 0029164
2017-05-02 08:15 Lemahasta Note Added: 0029199
2017-05-10 06:41 centosviper Note Added: 0029259
2017-05-12 18:26 gilsbert Note Added: 0029271
2017-05-12 21:21 Lemahasta Note Added: 0029273
2017-05-19 14:12 ASPIRIN Note Added: 0029301