View Issue Details

ID: 0015859
Project: CentOS-7
Category: kernel
View Status: public
Last Update: 2019-05-03 15:07
Status: resolved
Resolution: fixed
Platform: x86_64
OS: CentOS
OS Version: 7
Product Version: 7.6.1810
Target Version:
Fixed in Version:
Summary: 0015859: System crashes at "kernel BUG at mm/usercopy.c:72!"
Description: A little bit of background: There is a script that collects monitoring data by connecting to multiple hosts via telnet every minute. The script had been working fine for a few years; then, after the latest update, the system started to crash at least once every 20 minutes.

So far the investigation has produced the following results:

- The kernel was updated from 3.10.0-862.14.4.el7.x86_64 to 3.10.0-957.5.1.el7.x86_64 and that's when the problem started.

- `dmesg` contains the following:

    [614918.494483] usercopy: kernel memory exposure attempt detected from ffff9f503d7aa005 (kmalloc-4096) (8187 bytes)
    [614918.505563] ------------[ cut here ]------------
    [614918.511080] kernel BUG at mm/usercopy.c:72!

- The part about "kernel BUG at mm/usercopy.c:72!" is consistent across all crashes.

- Looking further with the `crash` utility reveals the following:

    KERNEL: /usr/lib/debug/lib/modules/3.10.0-957.5.1.el7.x86_64/vmlinux
    DUMPFILE: /var/crash/ [PARTIAL DUMP]
    CPUS: 56
    DATE: Wed Feb 20 15:11:58 2019
    UPTIME: 00:18:52
    LOAD AVERAGE: 1.41, 1.84, 1.86
    TASKS: 2052
    RELEASE: 3.10.0-957.5.1.el7.x86_64
    VERSION: #1 SMP Fri Feb 1 14:54:57 UTC 2019
    MACHINE: x86_64 (2594 Mhz)
    MEMORY: 127.9 GB
    PANIC: "kernel BUG at mm/usercopy.c:72!"
    PID: 27982
    COMMAND: "telnet"
    TASK: ffff8e59f4e44100 [THREAD_INFO: ffff8e594e630000]
    CPU: 29

- Backtrace:

     #1 [ffff8e594e633a20] __crash_kexec at ffffffff87d1cf32
     #2 [ffff8e594e633af0] crash_kexec at ffffffff87d1d020
     #3 [ffff8e594e633b08] oops_end at ffffffff8836c758
     #4 [ffff8e594e633b30] die at ffffffff87c2f95b
     #5 [ffff8e594e633b60] do_trap at ffffffff8836bea0
     #6 [ffff8e594e633bb0] do_invalid_op at ffffffff87c2c2a4
     #7 [ffff8e594e633c60] invalid_op at ffffffff8837812e
        [exception RIP: __check_object_size+135]
        RIP: ffffffff87e3e4a7 RSP: ffff8e594e633d18 RFLAGS: 00010246
        RAX: 0000000000000063 RBX: ffff8e5abd0b9005 RCX: 0000000000000000
        RDX: 0000000000000000 RSI: ffff8e4abf9d3898 RDI: ffff8e4abf9d3898
        RBP: ffff8e594e633d38 R8: 0000000000000000 R9: ffff8e4ab97c6f00
        R10: 0000000000000777 R11: 0000000000000001 R12: 0000000000001ffb
        R13: 0000000000000001 R14: ffff8e5abd0bb000 R15: ffff8e5abae30800
        ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018

- The content of RIP is also the same across crashes.

- CS: 0010 indicates that the crash happened in kernel mode, not in userspace: the low two bits of the CS selector hold the current privilege level, and 0x0010 & 3 = 0 (ring 0).

- To sum up, it looks like some memory operation during the execution of telnet does not work as expected and crashes the system. Maybe further analysis of the crash dumps can shed more light on the matter, but that has not been done yet.
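The numbers in the `dmesg` line and in the register dump can be cross-checked with a little arithmetic. The following is a minimal sketch, not an analysis of the actual kernel code: the helper names (`parse_usercopy`, `overrun`, `cpl`) are made up for illustration, it assumes that kmalloc-4096 objects are aligned to 4096 bytes, and the reading of RBX as the source pointer, R12 as the byte count and R14 as the end address is inferred only from the values fitting together, not from a disassembly.

```python
import re

# dmesg line copied from the report above.
DMESG_LINE = (
    "usercopy: kernel memory exposure attempt detected from "
    "ffff9f503d7aa005 (kmalloc-4096) (8187 bytes)"
)


def parse_usercopy(line):
    """Return (pointer, slab object size, byte count) from a usercopy report."""
    m = re.search(
        r"detected from ([0-9a-f]+) \(kmalloc-(\d+)\) \((\d+) bytes\)", line
    )
    if m is None:
        raise ValueError("not a usercopy report line")
    return int(m.group(1), 16), int(m.group(2)), int(m.group(3))


def overrun(ptr, obj_size, nbytes):
    """Bytes by which the copy would run past the end of the slab object.

    Assumption: objects in the kmalloc-N cache are N-byte aligned, so
    ptr % N is the offset of the pointer inside its object.
    """
    offset = ptr % obj_size
    return max(0, offset + nbytes - obj_size)


def cpl(cs):
    """Privilege level encoded in the low two bits of an x86 CS selector."""
    return cs & 0x3


ptr, obj_size, nbytes = parse_usercopy(DMESG_LINE)
print(f"offset in object: {ptr % obj_size}, "
      f"overrun: {overrun(ptr, obj_size, nbytes)} bytes")

# Register values copied from the crash backtrace above; their roles are
# an assumption (RBX = source pointer, R12 = length, R14 = end address).
rbx, r12, r14 = 0xFFFF8E5ABD0B9005, 0x1FFB, 0xFFFF8E5ABD0BB000
print(f"R12 = {r12} bytes, RBX + R12 == R14: {rbx + r12 == r14}")
print(f"CPL for CS 0x0010: {cpl(0x0010)}")
```

Under these assumptions both crashes show the same pattern: a copy that starts 5 bytes into a 4096-byte slab object and asks for 8187 bytes, which would read 4096 bytes past the end of that object. That is the kind of request the hardened usercopy check is meant to reject.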

I was able to find a somewhat similar issue: centos 7.6 kernel panic caused by osd
Steps To Reproduce: On my system the issue can be reproduced by enabling certain scripts that use telnet. The crash may happen within just one or two minutes, but on average it takes about 20 minutes, during which telnet is invoked about 2000 times. Each session lasts just a few seconds.
Tags: centos7, memory


2019-03-23 11:25

reporter   ~0034068

I had the same problem. It appears to be fixed in the latest CentOS 7.6.1820 kernel, 3.10.0-957.10.1.el7.x86_64: I was able to reproduce the crash before, and now my tests are negative.

The only note I found is "* hardened usercopy is causing crash (BZ#1660815)"; I did not easily find any reference as to what was fixed or how. Other URLs about the same kernel update carry the same note about usercopy.

On boot, abrtd(8), the automated bug reporting tool's daemon, reports that the bug exists for the previous two kernels, 3.10.0-957.1... and 3.10.0-957.5.... The abrtd reports are very similar to the original post from Ph0enix.
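To check whether a given host already runs a kernel at or past the release reported as fixed here, the release strings have to be compared numerically, since a plain string comparison ranks 957.5.1 above 957.10.1. The following is a minimal sketch; `release_key` and `has_fix` are hypothetical helper names, and it assumes that every release numerically at or above 3.10.0-957.10.1 in the el7 series carries the fix, which this thread suggests but does not prove.

```python
import re

# Kernel release reported in this thread as no longer crashing.
FIXED_RELEASE = "3.10.0-957.10.1.el7.x86_64"


def release_key(release):
    """Turn '3.10.0-957.5.1.el7.x86_64' into (3, 10, 0, 957, 5, 1)."""
    # Keep only the numeric part in front of the '.el7...' suffix.
    numeric = release.split(".el", 1)[0]
    return tuple(int(part) for part in re.split(r"[.-]", numeric))


def has_fix(release, fixed=FIXED_RELEASE):
    """True if `release` is numerically at or above the fixed release."""
    return release_key(release) >= release_key(fixed)


for rel in (
    "3.10.0-862.14.4.el7.x86_64",   # kernel before the update
    "3.10.0-957.5.1.el7.x86_64",    # kernel that crashed
    "3.10.0-957.10.1.el7.x86_64",   # kernel reported as fixed
):
    print(rel, "->", "has fix" if has_fix(rel) else "no fix")
```

On a live system the running release can be passed in with `has_fix(platform.release())` after `import platform`.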


2019-03-26 10:22

reporter   ~0034094

Thank you for providing an update. We will try a new kernel and I will update the issue then.


2019-05-03 10:35

reporter   ~0034416

After updating the kernel to a newer version (3.10.0-957.10.1.el7.x86_64) I can confirm that the issue is gone: the server is stable and no reboots occur.


2019-05-03 15:07

manager   ~0034418

Thank you for reporting back. Closing as resolved.

Issue History

Date Modified Username Field Change
2019-02-22 13:37 Ph0enix New Issue
2019-02-22 13:37 Ph0enix Tag Attached: centos7
2019-02-22 13:37 Ph0enix Tag Attached: memory
2019-03-23 11:25 Note Added: 0034068
2019-03-26 10:22 Ph0enix Note Added: 0034094
2019-05-03 10:35 Ph0enix Note Added: 0034416
2019-05-03 15:07 toracat Status new => resolved
2019-05-03 15:07 toracat Resolution open => fixed
2019-05-03 15:07 toracat Note Added: 0034418