View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0018498 | CentOS-7 | kernel | public | 2022-08-22 08:56 | 2022-09-02 06:29 |
Reporter | ksp | Assigned To | |||
Priority | normal | Severity | major | Reproducibility | always |
Status | new | Resolution | open | ||
Product Version | 7.9.2009 | ||||
Summary | 0018498: NVMe SSD write error on CentOS 7 | ||||
Description | Writes fail when the namespace uses NVMe Protection Information (PI) type 1. Some controllers check whether the reference tag is valid when PRACT is set and the PI type is 1. In that case, PRCHK[0] and the reference tag must be set together. CentOS 7 sets only PRACT, so the reference tag is invalid and the I/O fails. | ||||
Steps To Reproduce | 1. Format an nvme with the options below # nvme format /dev/nvme0n1 -n 0x1 -i 1 -p 0 -m 1 -l 2 -r 2. Run fio # fio nvme0n1_256k_write.fio "nvme0n1_256k_write.fio" [global] ioengine=libaio atomic=0 randrepeat=1 allrandrepeat=1 offset=0% overwrite=1 size=100% group_reporting allow_mounted_write=1 refill_buffers [job] rw=write iodepth=256 direct=1 bs=256k offset=0% size=20g filename=/dev/nvme0n1 Actual results: 1. fio result job: (g=0): rw=write, bs=(R) 256KiB-256KiB, (W) 256KiB-256KiB, (T) 256KiB-256KiB, ioengine=libaio, iodepth=256 fio-3.7 Starting 1 process fio: pid=2038, err=5/file:io_u.c:1747, func=io_u error, error=Input/output error job: (groupid=0, jobs=1): err= 5 (file:io_u.c:1747, func=io_u error, error=Input/output error): pid=2038: Thu Aug 12 10:18:13 2027 write: IOPS=5140, BW=5120KiB/s (5243kB/s)(256KiB/50msec) slat (usec): min=13, max=4291, avg=52.47, stdev=265.51 clat (nsec): min=48795k, max=48795k, avg=48794666.00, stdev= 0.00 lat (nsec): min=48859k, max=48859k, avg=48858894.00, stdev= 0.00 clat percentiles (usec): | 1.00th=[49021], 5.00th=[49021], 10.00th=[49021], 20.00th=[49021], | 30.00th=[49021], 40.00th=[49021], 50.00th=[49021], 60.00th=[49021], | 70.00th=[49021], 80.00th=[49021], 90.00th=[49021], 95.00th=[49021], | 99.00th=[49021], 99.50th=[49021], 99.90th=[49021], 99.95th=[49021], | 99.99th=[49021] lat (msec) : 50=0.39% cpu : usr=61.22%, sys=36.73%, ctx=8, majf=0, minf=68 IO depths : 1=0.4%, 2=0.8%, 4=1.6%, 8=3.1%, 16=6.2%, 32=12.5%, >=64=75.5% submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0% issued rwts: total=0,257,0,0 short=0,0,0,0 dropped=0,0,0,0 latency : target=0, window=0, percentile=100.00%, depth=256 Run status group 0 (all jobs): WRITE: bw=5120KiB/s (5243kB/s), 5120KiB/s-5120KiB/s (5243kB/s-5243kB/s), io=256KiB (262kB), run=50-50msec Disk stats (read/write): nvme0n1: ios=0/0, merge=0/0, ticks=0/0, in_queue=0, util=0.00% 
2. dmesg [ 230.125582] nvme nvme0: rescanning namespaces. [ 233.191692] blk_update_request: I/O error, dev nvme0n1, sector 512 [ 233.191863] blk_update_request: I/O error, dev nvme0n1, sector 1024 [ 233.194667] blk_update_request: I/O error, dev nvme0n1, sector 1536 [ 233.197739] blk_update_request: I/O error, dev nvme0n1, sector 2048 [ 233.200333] blk_update_request: I/O error, dev nvme0n1, sector 2560 [ 233.203220] blk_update_request: I/O error, dev nvme0n1, sector 3072 [ 233.205745] blk_update_request: I/O error, dev nvme0n1, sector 3584 [ 233.208454] blk_update_request: I/O error, dev nvme0n1, sector 4096 [ 233.210610] blk_update_request: I/O error, dev nvme0n1, sector 4608 [ 233.212787] blk_update_request: I/O error, dev nvme0n1, sector 5120 [ 247.313203] blk_update_request: 246 callbacks suppressed [ 247.313209] blk_update_request: I/O error, dev nvme0n1, sector 512 [ 247.313371] blk_update_request: I/O error, dev nvme0n1, sector 1024 [ 247.316103] blk_update_request: I/O error, dev nvme0n1, sector 1536 [ 247.318982] blk_update_request: I/O error, dev nvme0n1, sector 2048 [ 247.321544] blk_update_request: I/O error, dev nvme0n1, sector 2560 [ 247.324127] blk_update_request: I/O error, dev nvme0n1, sector 3072 [ 247.326994] blk_update_request: I/O error, dev nvme0n1, sector 3584 [ 247.329247] blk_update_request: I/O error, dev nvme0n1, sector 4096 [ 247.331691] blk_update_request: I/O error, dev nvme0n1, sector 4608 [ 247.333794] blk_update_request: I/O error, dev nvme0n1, sector 5120 Expected results: fio completes with no I/O errors. | ||||
Additional Information | Some controllers check whether the reference tag is valid when NVME_RW_PRINFO_PRACT is set and the Protection Information type is 1. In that case, NVME_RW_PRINFO_PRCHK_REF and the reference tag must be set together. CentOS 7 sets only NVME_RW_PRINFO_PRACT, so the reference tag is invalid and the I/O fails. With the following patch applied, writes complete without errors. Base code: CentOS 7 (3.10.0-1160.71.1.el7.x86_64), drivers/nvme/host/core.c @@ -471,9 +471,23 @@ static inline int nvme_setup_discard(str return BLK_MQ_RQ_QUEUE_OK; } +static inline u32 t10_pi_ref_tag(struct request *rq) +{ + #define SECTOR_SHIFT 9 + unsigned int shift = ilog2(queue_logical_block_size(rq->q)); + + return blk_rq_pos(rq) >> (shift - SECTOR_SHIFT) & 0xffffffff; +} + static inline int nvme_setup_rw(struct nvme_ns *ns, struct request *req, struct nvme_command *cmnd) { + #define NVME_NS_DPS_PI_TYPE1 1 + #define NVME_NS_DPS_PI_TYPE2 2 + #define NVME_NS_DPS_PI_TYPE3 3 + #define NVME_RW_PRINFO_PRCHK_GUARD (1 << 12) + #define NVME_RW_PRINFO_PRCHK_REF (1 << 10) + u16 control = 0; u32 dsmgmt = 0; @@ -507,6 +521,18 @@ static inline int nvme_setup_rw(struct n if (ns->ms) { if (!blk_integrity_rq(req)) control |= NVME_RW_PRINFO_PRACT; + + switch (ns->pi_type) { + case NVME_NS_DPS_PI_TYPE3: + control |= NVME_RW_PRINFO_PRCHK_GUARD; + break; + case NVME_NS_DPS_PI_TYPE1: + case NVME_NS_DPS_PI_TYPE2: + control |= NVME_RW_PRINFO_PRCHK_GUARD | + NVME_RW_PRINFO_PRCHK_REF; + cmnd->rw.reftag = cpu_to_le32(t10_pi_ref_tag(req)); + break; + } } cmnd->rw.control = cpu_to_le16(control); The patch above is adapted from recent upstream Linux kernel code. | ||||
Tags | kernel | ||||
abrt_hash | |||||
URL | |||||
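For readers who want to check the arithmetic in the patch outside the kernel, here is a minimal user-space sketch. It is not the driver code: `pi_ref_tag` and `pi_control` are hypothetical helper names, the `host_supplies_pi` flag stands in for `blk_integrity_rq(req)`, and the `NVME_RW_PRINFO_PRACT` value (1 << 13) is taken from the upstream `nvme.h` rather than from this report. The other two bit values are copied from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SHIFT 9 /* block layer positions are counted in 512-byte sectors */

/* PRCHK_REF and PRCHK_GUARD are copied from the patch in this report.
 * PRACT is an assumption here: it is the value used by upstream nvme.h. */
#define NVME_RW_PRINFO_PRCHK_REF   (1 << 10)
#define NVME_RW_PRINFO_PRCHK_GUARD (1 << 12)
#define NVME_RW_PRINFO_PRACT       (1 << 13)

/* Integer log2 for a power-of-two value (user-space stand-in for ilog2()). */
static unsigned int log2_u32(uint32_t v)
{
    unsigned int shift = 0;

    while (v >>= 1)
        shift++;
    return shift;
}

/* Hypothetical helper: convert a position in 512-byte sectors into the
 * lower 32 bits of the logical block address, as t10_pi_ref_tag() does. */
static uint32_t pi_ref_tag(uint64_t sector_512, uint32_t lba_size)
{
    /* The logical block size must be a power of two and at least 512. */
    assert(lba_size >= 512 && (lba_size & (lba_size - 1)) == 0);

    return (uint32_t)(sector_512 >> (log2_u32(lba_size) - SECTOR_SHIFT));
}

/* Hypothetical helper: build the PRINFO bits the patched nvme_setup_rw()
 * would set for a namespace that has metadata (ns->ms != 0). */
static uint16_t pi_control(int pi_type, int host_supplies_pi)
{
    uint16_t control = 0;

    if (!host_supplies_pi)
        control |= NVME_RW_PRINFO_PRACT; /* controller generates/strips PI */

    switch (pi_type) {
    case 3: /* Type 3: only the guard field is checked */
        control |= NVME_RW_PRINFO_PRCHK_GUARD;
        break;
    case 1: /* Types 1 and 2: guard and reference tag are checked */
    case 2:
        control |= NVME_RW_PRINFO_PRCHK_GUARD | NVME_RW_PRINFO_PRCHK_REF;
        break;
    }
    return control;
}
```

With a 4096-byte logical block size, `pi_ref_tag(512, 4096)` gives 64, and `pi_control(1, 0)` gives 0x3400 (PRACT, guard check, and reference tag check all set), which is the combination the report says the affected controllers require for PI type 1.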
CentOS is a rebuild of the sources used to create RHEL. We do not modify anything except to remove branding and logos. You will need to submit your request to Red Hat via bugzilla.redhat.com. If and when Red Hat accepts it, incorporates it into RHEL, and releases a patched version, CentOS will pick it up and rebuild it. | |
Thank you for your reply. I will do that. |
|