View Issue Details

ID: 0018490
Project: CentOS-7
Category: kernel
View Status: public
Last Update: 2022-07-28 17:54
Reporter: vaibhav.nipunage@pavilion.io
Assigned To: (unassigned)
Priority: normal
Severity: minor
Reproducibility: sometimes
Status: new
Resolution: open
Product Version: 7.9.2009
Summary: 0018490: [pNFS]: BUG: unable to handle kernel NULL pointer dereference at filelayout_initiate_commit+0x25/0x170 [nfs_layout_nfsv41_files]

Description: I configured pNFS and ran fio several times, and hit the following
NULL pointer dereference in filelayout_initiate_commit.

Steps To Reproduce:

Setup details:
==============
[root@dev_mc ~]# cat /etc/centos-release
CentOS Linux release 7.9.2009 (Core)
[root@dev_mc ~]# uname -r
3.10.0-1160.71.1.el7.x86_64
[root@dev_mc ~]#

Configuration:
===============
1. Configured a namespace with 5 DSs & 1 MDS using NFS-Ganesha_3.5 +
GlusterFS_9.5
2. Mounted that namespace with nfs4.2 (pNFS)
3. Ran fio tests a few times


Mount:
========
[root@dev_mc ~]# mount -t nfs -o vers=4.2,hard 192.168.22.16:/ns1 /mnt

Repro test:
============
# Test1 was successful
[root@dev_mc ~]# fio --rw=write --ioengine=libaio --direct=0 --iodepth=16 \
    --numjobs=10 --bs=1024k --group_reporting=1 --size=50G --runtime=3600 \
    --time_based=1 --output-format=json+ --name=1658217138 --directory=/mnt

# Test2 was successful
[root@dev_mc ~]# fio --rw=write --ioengine=libaio --direct=0 --iodepth=16 \
    --numjobs=10 --bs=4k --group_reporting=1 --size=50G --runtime=3600 \
    --time_based=1 --output-format=json+ --name=1658220848 --directory=/mnt

# Test3 failed: the host crashed while this test was running
[root@dev_mc ~]# fio --rw=write --ioengine=libaio --direct=0 --iodepth=16 \
    --numjobs=10 --bs=512k --group_reporting=1 --size=50G --runtime=3600 \
    --time_based=1 --output-format=json+ --name=1658224481 --directory=/mnt
Additional Information:

Logs:
=======
[1315670.532296] BUG: unable to handle kernel NULL pointer dereference at
0000000000000070
[1315670.533351] IP: [<ffffffffc06da2b5>] filelayout_initiate_commit+0x25/0x170
[nfs_layout_nfsv41_files]
[1315670.534133] PGD 800000208704c067 PUD 2ca6c05067 PMD 0
[1315670.534673] Oops: 0000 [#1] SMP
[1315670.535189] Modules linked in: tcp_diag inet_diag btrfs raid6_pq xor
dm_service_time ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat
nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 xt_addrtype iptable_filter
xt_conntrack nf_nat nf_conntrack br_netfilter bridge stp llc overlay(T)
dm_multipath nvme_rdma nvme_fabrics nvme_core nfsv3 nfs_layout_nfsv41_files
rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache fuse ib_isert iscsi_target_mod
ib_srpt target_core_mod xfs libcrc32c ib_srp scsi_transport_srp scsi_tgt
dm_mirror dm_region_hash dm_log dm_mod sb_edac intel_powerclamp coretemp
intel_rapl iosf_mbi rpcrdma kvm_intel kvm rdma_ucm irqbypass ib_iser rdma_cm
iw_cm libiscsi ib_umad scsi_transport_iscsi ib_ipoib crc32_pclmul
ghash_clmulni_intel ib_cm aesni_intel lrw gf128mul glue_helper mlx4_ib
ablk_helper ib_uverbs
[1315670.538675] cryptd ib_core ioatdma iTCO_wdt iTCO_vendor_support mei_me
mei sg lpc_ich i2c_i801 joydev pcspkr ipmi_ssif wmi ipmi_si acpi_power_meter
ipmi_devintf ipmi_msghandler acpi_pad nfsd auth_rpcgss nfs_acl lockd grace
sunrpc ip_tables ext4 mbcache jbd2 mlx4_en sd_mod crc_t10dif crct10dif_generic
ast drm_kms_helper syscopyarea sysfillrect mlx4_core sysimgblt fb_sys_fops ttm
igb drm ahci libahci crct10dif_pclmul crct10dif_common crc32c_intel libata
devlink ptp pps_core dca drm_panel_orientation_quirks i2c_algo_bit nfit
libnvdimm
[1315670.541879] CPU: 7 PID: 9525 Comm: fio Kdump: loaded Tainted: G
   ------------ T 3.10.0-1160.71.1.el7.x86_64 #1
[1315670.542549] Hardware name: Supermicro SYS-2028TP-HTR/X10DRT-P, BIOS 3.3
10/24/2020
[1315670.543226] task: ffff9cbfc6c14200 ti: ffff9cc70dbc4000 task.ti:
ffff9cc70dbc4000
[1315670.543893] RIP: 0010:[<ffffffffc06da2b5>] [<ffffffffc06da2b5>]
filelayout_initiate_commit+0x25/0x170 [nfs_layout_nfsv41_files]
[1315670.545256] RSP: 0018:ffff9cc70dbc7470 EFLAGS: 00010202
[1315670.545938] RAX: ffffffffc06da290 RBX: ffff9cc042884fc0 RCX:
ffff9cc70dbc75a0
[1315670.546624] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
ffff9cc042884fc0
[1315670.547306] RBP: ffff9cc70dbc7498 R08: ffff9cc042885198 R09:
ffff9cab7b4e9e80
[1315670.547978] R10: ffff9cbfd93ed1e8 R11: 0000000000000800 R12:
0000000000000000
[1315670.548657] R13: ffff9cc70dbc74f0 R14: 0000000000000000 R15:
ffff9cc042884fc0
[1315670.549343] FS: 00007f812d8ae880(0000) GS:ffff9cbf3f7c0000(0000)
knlGS:0000000000000000
[1315670.550047] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1315670.550757] CR2: 0000000000000070 CR3: 00000020a8a10000 CR4:
00000000003607e0
[1315670.551484] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[1315670.552215] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[1315670.552944] Call Trace:
[1315670.553695] [<ffffffffc0b72372>] pnfs_generic_commit_pagelist+0x252/0x3a0
[nfsv4]
[1315670.554458] [<ffffffffc06da290>] ? filelayout_commit_pagelist+0x20/0x20
[nfs_layout_nfsv41_files]
[1315670.555222] [<ffffffffc06da285>] filelayout_commit_pagelist+0x15/0x20
[nfs_layout_nfsv41_files]
[1315670.556008] [<ffffffffc09ffb5e>] nfs_generic_commit_list+0xbe/0xf0 [nfs]
[1315670.556782] [<ffffffffc09ffc1c>] __nfs_commit_inode+0x8c/0x150 [nfs]
[1315670.557563] [<ffffffffc09ffcf0>] nfs_commit_inode+0x10/0x20 [nfs]
[1315670.558350] [<ffffffffc09ee230>] nfs_release_page+0x40/0xd0 [nfs]
[1315670.559139] [<ffffffffbb1bdfc5>] try_to_release_page+0x35/0x50
[1315670.559932] [<ffffffffbb1d3869>] shrink_page_list+0xa09/0xc30
[1315670.560727] [<ffffffffbb1d40a6>] shrink_inactive_list+0x1b6/0x5c0
[1315670.561529] [<ffffffffbb1d4b85>] shrink_lruvec+0x375/0x730
[1315670.562335] [<ffffffffbb1d4fb6>] shrink_zone+0x76/0x1a0
[1315670.563142] [<ffffffffbb1d54a0>] do_try_to_free_pages+0xf0/0x520
[1315670.563957] [<ffffffffbb1d59cc>] try_to_free_pages+0xfc/0x180
[1315670.564765] [<ffffffffbb1c9611>] __alloc_pages_nodemask+0x831/0xbe0
[1315670.565585] [<ffffffffbb2193d8>] alloc_pages_current+0x98/0x110
[1315670.566392] [<ffffffffbb1be077>] __page_cache_alloc+0x97/0xb0
[1315670.567189] [<ffffffffbb1bf2e8>] grab_cache_page_write_begin+0x68/0xc0
[1315670.567980] [<ffffffffc09ef282>] nfs_write_begin+0x62/0x200 [nfs]
[1315670.568768] [<ffffffffbb1bdcef>] generic_file_buffered_write+0x10f/0x270
[1315670.569546] [<ffffffffbb1c0962>] __generic_file_aio_write+0x1e2/0x400
[1315670.570310] [<ffffffffc09ef0d4>] nfs_file_write+0x94/0x1e0 [nfs]
[1315670.571056] [<ffffffffc09ef040>] ? nfs_file_splice_read+0xc0/0xc0 [nfs]


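I debugged further, and the crash appears to happen while dereferencing
struct pnfs_layout_segment *lseg: R12 is 0 and the faulting address (CR2) is
0x70, which matches the instruction "cmpl $0x1,0x70(%r12)" at filelayout.c:1029.
Details from the crash tool follow: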


[root@dev_mc ~]# crash /usr/lib/debug/lib/modules/`uname -r`/vmlinux \
    /var/crash/127.0.0.1-2022-07-19-03\:27\:00/vmcore

crash> bt
PID: 9525 TASK: ffff9cbfc6c14200 CPU: 7 COMMAND: "fio"
 #0 [ffff9cc70dbc7100] machine_kexec at ffffffffbb0662f4
 #1 [ffff9cc70dbc7160] __crash_kexec at ffffffffbb122b62
 #2 [ffff9cc70dbc7230] crash_kexec at ffffffffbb122c50
 #3 [ffff9cc70dbc7248] oops_end at ffffffffbb791798
 #4 [ffff9cc70dbc7270] no_context at ffffffffbb075d14
 #5 [ffff9cc70dbc72c0] __bad_area_nosemaphore at ffffffffbb075fe2
 #6 [ffff9cc70dbc7310] bad_area_nosemaphore at ffffffffbb076104
 #7 [ffff9cc70dbc7320] __do_page_fault at ffffffffbb794750
 #8 [ffff9cc70dbc7390] do_page_fault at ffffffffbb794975
 #9 [ffff9cc70dbc73c0] page_fault at ffffffffbb790778
    [exception RIP: filelayout_initiate_commit+37]
    RIP: ffffffffc06da2b5 RSP: ffff9cc70dbc7470 RFLAGS: 00010202
    RAX: ffffffffc06da290 RBX: ffff9cc042884fc0 RCX: ffff9cc70dbc75a0
    RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff9cc042884fc0
    RBP: ffff9cc70dbc7498 R8: ffff9cc042885198 R9: ffff9cab7b4e9e80
    R10: ffff9cbfd93ed1e8 R11: 0000000000000800 R12: 0000000000000000
    R13: ffff9cc70dbc74f0 R14: 0000000000000000 R15: ffff9cc042884fc0
    ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
#10 [ffff9cc70dbc74a0] pnfs_generic_commit_pagelist at ffffffffc0b72372 [nfsv4]
#11 [ffff9cc70dbc7538] filelayout_commit_pagelist at ffffffffc06da285
[nfs_layout_nfsv41_files]
#12 [ffff9cc70dbc7548] nfs_generic_commit_list at ffffffffc09ffb5e [nfs]
#13 [ffff9cc70dbc7580] __nfs_commit_inode at ffffffffc09ffc1c [nfs]
#14 [ffff9cc70dbc7600] nfs_commit_inode at ffffffffc09ffcf0 [nfs]
#15 [ffff9cc70dbc7610] nfs_release_page at ffffffffc09ee230 [nfs]

crash> mod -s nfs_layout_nfsv41_files
/usr/lib/debug/usr/lib/modules/3.10.0-1160.71.1.el7.x86_64/kernel/fs/nfs/filelayout/nfs_layout_nfsv41_files.ko.debug
     MODULE NAME SIZE OBJECT FILE
ffffffffc06de180 nfs_layout_nfsv41_files 24116
/usr/lib/debug/usr/lib/modules/3.10.0-1160.71.1.el7.x86_64/kernel/fs/nfs/filelayout/nfs_layout_nfsv41_files.ko.debug

crash> dis -l filelayout_initiate_commit+37
/usr/src/debug/kernel-3.10.0-1160.71.1.el7/linux-3.10.0-1160.71.1.el7.x86_64/fs/nfs/filelayout/filelayout.c:
1029
0xffffffffc06da2b5 <filelayout_initiate_commit+37>: cmpl $0x1,0x70(%r12)
crash>


[root@dev_mc 127.0.0.1-2022-07-19-03:27:00]# vim \
    /usr/src/debug/kernel-3.10.0-1160.71.1.el7/linux-3.10.0-1160.71.1.el7.x86_64/fs/nfs/filelayout/filelayout.c

1025 static u32 calc_ds_index_from_commit(struct pnfs_layout_segment *lseg, u32 i)
1026 {
1027         struct nfs4_filelayout_segment *flseg = FILELAYOUT_LSEG(lseg);
1028
1029         if (flseg->stripe_type == STRIPE_SPARSE)    <<<< NULL pointer dereference here
1030                 return i;
1031         else
1032                 return nfs4_fl_calc_ds_index(lseg, i);
1033 }

1050 static int filelayout_initiate_commit(struct nfs_commit_data *data, int how)
1051 {
1052         struct pnfs_layout_segment *lseg = data->lseg;
1053         struct nfs4_pnfs_ds *ds;
1054         struct rpc_clnt *ds_clnt;
1055         u32 idx;
1056         struct nfs_fh *fh;
1057
1058         idx = calc_ds_index_from_commit(lseg, data->ds_commit_index);    <<<< caller of the function above
1059         ds = nfs4_fl_prepare_ds(lseg, idx);
1060         if (!ds)
1061                 goto out_err;
1062
1063         ds_clnt = nfs4_find_or_create_ds_client(ds->ds_clp, data->inode);
1064         if (IS_ERR(ds_clnt))
1065                 goto out_err;

Thanks!
Tags: No tags attached.

Activities

There are no notes attached to this issue.

Issue History

Date Modified Username Field Change
2022-07-28 17:54 vaibhav.nipunage@pavilion.io New Issue