View Issue Details

ID: 0018498
Project: CentOS-7
Category: kernel
View Status: public
Last Update: 2022-09-02 06:29
Reporter: ksp
Assigned To: (none)
Priority: normal
Severity: major
Reproducibility: always
Status: new
Resolution: open
Product Version: 7.9.2009
Summary: 0018498: NVMe SSD write error on CentOS 7
Description:
When the NVMe Protection Information (PI) type is 1, writes fail with an I/O error.

On some controllers, when PRACT is set and the PI type is 1, the controller checks whether the reference tag is valid.
In this case, PRCHK[0] must therefore be set and a valid reference tag must be supplied with the command.

CentOS 7 sets only PRACT, so the reference tag is invalid and the I/O fails.
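
The flag combinations described here can be illustrated with a small user-space C sketch. It is not kernel code: `pi_control_bits()` and its parameters are hypothetical names introduced for illustration, and the PRACT bit position (bit 13 of the 16-bit control field) is assumed from the upstream `include/linux/nvme.h`. The sketch only shows which PRINFO bits would accompany each PI type.

```c
#include <stdint.h>

/*
 * PRINFO bits within the 16-bit "control" field of an NVMe read/write
 * command. PRCHK_REF corresponds to PRCHK[0] in the description above.
 * The PRACT value is an assumption taken from upstream nvme.h.
 */
#define PRINFO_PRCHK_REF   (1u << 10)
#define PRINFO_PRCHK_GUARD (1u << 12)
#define PRINFO_PRACT       (1u << 13)

/*
 * Hypothetical helper: compute the PRINFO bits for a namespace that has
 * metadata. has_integrity_payload is non-zero when the block layer already
 * attached protection data, in which case PRACT is not requested.
 */
static uint16_t pi_control_bits(int pi_type, int has_integrity_payload)
{
    uint16_t control = 0;

    if (!has_integrity_payload)
        control |= PRINFO_PRACT;

    switch (pi_type) {
    case 3:
        /* Type 3 defines no reference tag, so only the guard is checked. */
        control |= PRINFO_PRCHK_GUARD;
        break;
    case 1:
    case 2:
        /* Types 1 and 2: check guard and reference tag together. */
        control |= PRINFO_PRCHK_GUARD | PRINFO_PRCHK_REF;
        break;
    default:
        break;
    }
    return control;
}
```

For a PI type 1 namespace without an integrity payload, `pi_control_bits(1, 0)` yields PRACT, the guard check, and the reference tag check together, whereas the failing behaviour reported here corresponds to PRACT alone.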
Steps To Reproduce:
1. Format the NVMe namespace with the options below
# nvme format /dev/nvme0n1 -n 0x1 -i 1 -p 0 -m 1 -l 2 -r

2. Run fio
# fio nvme0n1_256k_write.fio

Contents of "nvme0n1_256k_write.fio":
[global]
ioengine=libaio
atomic=0
randrepeat=1
allrandrepeat=1
offset=0%
overwrite=1
size=100%
group_reporting
allow_mounted_write=1
refill_buffers

[job]
rw=write
iodepth=256
direct=1
bs=256k
offset=0%
size=20g
filename=/dev/nvme0n1

Actual results:
1. fio result

job: (g=0): rw=write, bs=(R) 256KiB-256KiB, (W) 256KiB-256KiB, (T) 256KiB-256KiB, ioengine=libaio, iodepth=256
fio-3.7
Starting 1 process
fio: pid=2038, err=5/file:io_u.c:1747, func=io_u error, error=Input/output error

job: (groupid=0, jobs=1): err= 5 (file:io_u.c:1747, func=io_u error, error=Input/output error): pid=2038: Thu Aug 12 10:18:13 2027
  write: IOPS=5140, BW=5120KiB/s (5243kB/s)(256KiB/50msec)
    slat (usec): min=13, max=4291, avg=52.47, stdev=265.51
    clat (nsec): min=48795k, max=48795k, avg=48794666.00, stdev= 0.00
     lat (nsec): min=48859k, max=48859k, avg=48858894.00, stdev= 0.00
    clat percentiles (usec):
     | 1.00th=[49021], 5.00th=[49021], 10.00th=[49021], 20.00th=[49021],
     | 30.00th=[49021], 40.00th=[49021], 50.00th=[49021], 60.00th=[49021],
     | 70.00th=[49021], 80.00th=[49021], 90.00th=[49021], 95.00th=[49021],
     | 99.00th=[49021], 99.50th=[49021], 99.90th=[49021], 99.95th=[49021],
     | 99.99th=[49021]
  lat (msec) : 50=0.39%
  cpu : usr=61.22%, sys=36.73%, ctx=8, majf=0, minf=68
  IO depths : 1=0.4%, 2=0.8%, 4=1.6%, 8=3.1%, 16=6.2%, 32=12.5%, >=64=75.5%
     submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued rwts: total=0,257,0,0 short=0,0,0,0 dropped=0,0,0,0
     latency : target=0, window=0, percentile=100.00%, depth=256

Run status group 0 (all jobs):
  WRITE: bw=5120KiB/s (5243kB/s), 5120KiB/s-5120KiB/s (5243kB/s-5243kB/s), io=256KiB (262kB), run=50-50msec

Disk stats (read/write):
  nvme0n1: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00%

2. dmesg
[ 230.125582] nvme nvme0: rescanning namespaces.
[ 233.191692] blk_update_request: I/O error, dev nvme0n1, sector 512
[ 233.191863] blk_update_request: I/O error, dev nvme0n1, sector 1024
[ 233.194667] blk_update_request: I/O error, dev nvme0n1, sector 1536
[ 233.197739] blk_update_request: I/O error, dev nvme0n1, sector 2048
[ 233.200333] blk_update_request: I/O error, dev nvme0n1, sector 2560
[ 233.203220] blk_update_request: I/O error, dev nvme0n1, sector 3072
[ 233.205745] blk_update_request: I/O error, dev nvme0n1, sector 3584
[ 233.208454] blk_update_request: I/O error, dev nvme0n1, sector 4096
[ 233.210610] blk_update_request: I/O error, dev nvme0n1, sector 4608
[ 233.212787] blk_update_request: I/O error, dev nvme0n1, sector 5120
[ 247.313203] blk_update_request: 246 callbacks suppressed
[ 247.313209] blk_update_request: I/O error, dev nvme0n1, sector 512
[ 247.313371] blk_update_request: I/O error, dev nvme0n1, sector 1024
[ 247.316103] blk_update_request: I/O error, dev nvme0n1, sector 1536
[ 247.318982] blk_update_request: I/O error, dev nvme0n1, sector 2048
[ 247.321544] blk_update_request: I/O error, dev nvme0n1, sector 2560
[ 247.324127] blk_update_request: I/O error, dev nvme0n1, sector 3072
[ 247.326994] blk_update_request: I/O error, dev nvme0n1, sector 3584
[ 247.329247] blk_update_request: I/O error, dev nvme0n1, sector 4096
[ 247.331691] blk_update_request: I/O error, dev nvme0n1, sector 4608
[ 247.333794] blk_update_request: I/O error, dev nvme0n1, sector 5120

Expected results:
No errors when running fio.
Additional Information:
On some controllers, when NVME_RW_PRINFO_PRACT is set and the Protection Information type is 1, the controller checks whether the reference tag is valid.
In this case, NVME_RW_PRINFO_PRCHK_REF must therefore be set and a valid reference tag must be supplied with the command.

CentOS 7 sets only NVME_RW_PRINFO_PRACT, so the reference tag is invalid and the I/O fails.

With the following patch applied, writes complete without error.

Base code: CentOS 7 (3.10.0-1160.71.1.el7.x86_64), drivers/nvme/host/core.c

@@ -471,9 +471,23 @@ static inline int nvme_setup_discard(str
        return BLK_MQ_RQ_QUEUE_OK;
 }

+static inline u32 t10_pi_ref_tag(struct request *rq)
+{
+       #define SECTOR_SHIFT 9
+       unsigned int shift = ilog2(queue_logical_block_size(rq->q));
+
+       return blk_rq_pos(rq) >> (shift - SECTOR_SHIFT) & 0xffffffff;
+}
+
 static inline int nvme_setup_rw(struct nvme_ns *ns, struct request *req,
                struct nvme_command *cmnd)
 {
+       #define NVME_NS_DPS_PI_TYPE1 1
+       #define NVME_NS_DPS_PI_TYPE2 2
+       #define NVME_NS_DPS_PI_TYPE3 3
+       #define NVME_RW_PRINFO_PRCHK_GUARD (1 << 12)
+       #define NVME_RW_PRINFO_PRCHK_REF (1 << 10)
+
        u16 control = 0;
        u32 dsmgmt = 0;

@@ -507,6 +521,18 @@ static inline int nvme_setup_rw(struct n
        if (ns->ms) {
                if (!blk_integrity_rq(req))
                        control |= NVME_RW_PRINFO_PRACT;
+
+               switch (ns->pi_type) {
+               case NVME_NS_DPS_PI_TYPE3:
+                       control |= NVME_RW_PRINFO_PRCHK_GUARD;
+                       break;
+               case NVME_NS_DPS_PI_TYPE1:
+               case NVME_NS_DPS_PI_TYPE2:
+                       control |= NVME_RW_PRINFO_PRCHK_GUARD |
+                                       NVME_RW_PRINFO_PRCHK_REF;
+                       cmnd->rw.reftag = cpu_to_le32(t10_pi_ref_tag(req));
+                       break;
+               }
        }

        cmnd->rw.control = cpu_to_le16(control);

The above patch is based on code from recent upstream Linux kernels.
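
The reference tag calculation in the patch can be tried outside the kernel with a minimal sketch such as the one below. `ref_tag_for_sector()` and `log2_u32()` are hypothetical user-space stand-ins for `t10_pi_ref_tag()` and the kernel's `ilog2()`, and the 4096-byte block size in the usage note is only an illustrative assumption, since the report does not state the data size of the LBA format that was selected.

```c
#include <stdint.h>

#define SECTOR_SHIFT 9 /* block-layer positions are in 512-byte sectors */

/* Integer log2 of a power-of-two value (stand-in for the kernel's ilog2()). */
static unsigned int log2_u32(uint32_t v)
{
    unsigned int shift = 0;

    while (v > 1) {
        v >>= 1;
        shift++;
    }
    return shift;
}

/*
 * Convert a request's starting position (in 512-byte sectors) into units of
 * the logical block size and keep the low 32 bits, mirroring the expression
 * used in the patch. Assumes logical_block_size is a power of two >= 512.
 */
static uint32_t ref_tag_for_sector(uint64_t sector, uint32_t logical_block_size)
{
    unsigned int shift = log2_u32(logical_block_size);

    return (uint32_t)((sector >> (shift - SECTOR_SHIFT)) & 0xffffffffu);
}
```

With these definitions, `ref_tag_for_sector(512, 4096)` evaluates to 64, and `ref_tag_for_sector(512, 512)` to 512; the tag wraps once the logical block number exceeds 32 bits.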
Tags: kernel

Activities

TrevorH

2022-08-22 09:16

manager   ~0038977

CentOS is a rebuild of the sources used to create RHEL. We do not modify anything except to remove branding and logos. You will need to submit your request to Red Hat via bugzilla.redhat.com; if and when Red Hat accepts it, incorporates it into RHEL, and releases a patched version, CentOS will pick it up and rebuild it.

ksp

2022-09-02 06:29

reporter   ~0038983

Thank you for your reply.
I will do that.

Issue History

Date Modified     Username  Field         Change
2022-08-22 08:56  ksp       New Issue
2022-08-22 08:56  ksp       Tag Attached  kernel
2022-08-22 09:16  TrevorH   Note Added    0038977
2022-09-02 06:29  ksp       Note Added    0038983