2017-11-17 21:17 UTC

ID: 0008016
Project: CentOS-6
Category: kernel
View Status: public
Last Update: 2015-12-21 20:42
Reporter: msw
Priority: normal
Severity: major
Reproducibility: always
Status: resolved
Resolution: fixed
Product Version: 6.6
Target Version:
Fixed in Version:
Summary: 0008016: Kernel fails to boot on HVM Xen domU when >32 vCPUs are allocated to a guest
Description: Booting an HVM domU with 33 vCPUs allocated results in a hang at boot.

Initializing cgroup subsys cpuset
Initializing cgroup subsys cpu
Linux version 2.6.32-504.3.3.el6.x86_64 (mockbuild@x86-028.build.eng.bos.redhat.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-9) (GCC) ) #1 SMP Fri Dec 12 16:05:43 EST 2014
...
NMI watchdog disabled (cpu0): hardware events not enabled
Booting Node 0, Processors #1
 #2
 #3
 #4
 #5
 #6
 #7
 #8
 Ok.
Booting Node 1, Processors #9
 #10
 #11
 #12
 #13
 #14
 #15
 #16
 #17
 Ok.
Booting Node 0, Processors #18
 #19
 #20
 #21
 #22
 #23
 #24
 #25
 #26
 Ok.
Booting Node 1, Processors #27
 #28
 #29
 #30
 #31
 #32
CPU32: Stuck ??
 #33
Brought up 32 CPUs
Total of 32 processors activated (185605.57 BogoMIPS).
BUG: soft lockup - CPU#0 stuck for 67s! [swapper:1]
Modules linked in:
CPU 0
Modules linked in:

Pid: 1, comm: swapper Not tainted 2.6.32-504.3.3.el6.x86_64 #1 Xen HVM domU
RIP: 0010:[<ffffffff810b7438>] [<ffffffff810b7438>] smp_call_function_many+0x1e8/0x260
RSP: 0018:ffff8807713a3ca0 EFLAGS: 00000202
RAX: 0000000000000080 RBX: ffff8807713a3ce0 RCX: 0000000000000020
RDX: 0000000000000020 RSI: 0000000000000080 RDI: 0000000000000000
RBP: ffffffff8100bb8e R08: ffff88077bab0c10 R09: 0000000000000080
R10: 00000000000001c8 R11: 0000000000000000 R12: ffff880f0f800500
R13: ffff880f0f8004c0 R14: ffff88077bae0600 R15: ffffffff81324699
FS: 0000000000000000(0000) GS:ffff880043c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000001a85000 CR4: 00000000001406f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 1, threadinfo ffff8807713a2000, task ffff8807713a1500)
Stack:
 01ff8807713a3cd0 ffff880f04ae0000 0000000000000246 ffffffff81172f60
<d> ffff880f04ae0000 00000000ffffffff ffff8807704f0d40 000000000000ebac
<d> ffff8807713a3cf0 ffffffff810b74d2 ffff8807713a3d20 ffffffff8107d594
Call Trace:
 [<ffffffff81172f60>] ? do_ccupdate_local+0x0/0x40
 [<ffffffff810b74d2>] ? smp_call_function+0x22/0x30
 [<ffffffff8107d594>] ? on_each_cpu+0x24/0x50
 [<ffffffff81176080>] ? do_tune_cpucache+0x110/0x550
 [<ffffffff8117668b>] ? enable_cpucache+0x3b/0xf0
 [<ffffffff81512a8e>] ? setup_cpu_cache+0x21e/0x270
 [<ffffffff8117733a>] ? kmem_cache_create+0x41a/0x5a0
 [<ffffffff81142620>] ? shmem_init_inode+0x0/0x20
 [<ffffffff81c50525>] ? shmem_init+0x3c/0xd7
 [<ffffffff81c298b4>] ? kernel_init+0x26b/0x2f7
 [<ffffffff8100c20a>] ? child_rip+0xa/0x20
 [<ffffffff81c29649>] ? kernel_init+0x0/0x2f7
 [<ffffffff8100c200>] ? child_rip+0x0/0x20
Code: e8 4e 5a 47 00 0f ae f0 48 8b 7b 30 ff 15 49 1f 9e 00 80 7d c7 00 0f 84 9f fe ff ff f6 43 20 01 0f 84 95 fe ff ff 0f 1f 44 00 00 <f3> 90 f6 43 20 01 75 f8 e9 83 fe ff ff 0f 1f 00 4c 89 ea 4c 89
Call Trace:
 [<ffffffff81172f60>] ? do_ccupdate_local+0x0/0x40
 [<ffffffff810b74d2>] ? smp_call_function+0x22/0x30
 [<ffffffff8107d594>] ? on_each_cpu+0x24/0x50
 [<ffffffff81176080>] ? do_tune_cpucache+0x110/0x550
 [<ffffffff8117668b>] ? enable_cpucache+0x3b/0xf0
 [<ffffffff81512a8e>] ? setup_cpu_cache+0x21e/0x270
 [<ffffffff8117733a>] ? kmem_cache_create+0x41a/0x5a0
 [<ffffffff81142620>] ? shmem_init_inode+0x0/0x20
 [<ffffffff81c50525>] ? shmem_init+0x3c/0xd7
 [<ffffffff81c298b4>] ? kernel_init+0x26b/0x2f7
Steps To Reproduce: Set up a Xen HVM domU with 33 vCPUs

Additional Information: Applying this upstream patch resolves this issue: https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=90d4f553

This problem has been reported to Red Hat as well.
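
For context: the console log stops at "CPU32: Stuck ??", and CPU index 32 is the first index past the 32 vcpu_info entries that Xen's shared info page provides on x86. The line removed by the attached patch indexes that array directly with the CPU number. The following is a minimal user-space sketch of that difference, not kernel code. All names in it (LEGACY_VCPU_SLOTS, MODEL_MAX_CPUS, legacy_vcpu_slot, placed_vcpu_slot and the *_model structs) are hypothetical stand-ins, and the per-CPU registration that the real kernel performs through a hypercall is modelled here with a plain static array.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 32 vcpu_info slots in the shared info page, as on x86 Xen. */
#define LEGACY_VCPU_SLOTS 32
/* Hypothetical upper bound for this model only. */
#define MODEL_MAX_CPUS 64

/* Simplified stand-in; the real structure also carries time and arch data. */
struct vcpu_info_model {
    unsigned char evtchn_upcall_pending;
};

struct shared_info_model {
    struct vcpu_info_model vcpu_info[LEGACY_VCPU_SLOTS];
};

/* Stand-in for per-CPU vcpu_info areas registered with the hypervisor. */
static struct vcpu_info_model per_cpu_info[MODEL_MAX_CPUS];

/*
 * Pre-patch behaviour, modelled: take the slot straight from the shared page.
 * Returns NULL where direct indexing would run past the end of the array.
 */
struct vcpu_info_model *legacy_vcpu_slot(struct shared_info_model *shared, int cpu)
{
    assert(shared != NULL);
    if (cpu < 0 || cpu >= LEGACY_VCPU_SLOTS)
        return NULL;
    return &shared->vcpu_info[cpu];
}

/*
 * Post-patch behaviour, modelled: use a per-CPU area when placement is
 * available, otherwise fall back to the shared page and its 32-slot limit.
 */
struct vcpu_info_model *placed_vcpu_slot(struct shared_info_model *shared,
                                         int cpu, bool placement_ok)
{
    if (cpu < 0 || cpu >= MODEL_MAX_CPUS)
        return NULL;
    if (placement_ok)
        return &per_cpu_info[cpu];
    return legacy_vcpu_slot(shared, cpu);
}
```

In this model, legacy_vcpu_slot() yields a valid slot for CPUs 0 to 31 and nothing for CPU 32, while placed_vcpu_slot() with placement available yields a slot for CPU 32 as well.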
Tags: No tags attached.
Attached Files
  • 8016.patch (780 bytes) 2015-01-19 23:11
    diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
    --- a/arch/x86/xen/enlighten.c
    +++ b/arch/x86/xen/enlighten.c
    @@ -1373,7 +1373,7 @@ static int __cpuinit xen_hvm_cpu_notify(
            int cpu = (long)hcpu;
            switch (action) {
            case CPU_UP_PREPARE:
    -               per_cpu(xen_vcpu, cpu) = &HYPERVISOR_shared_info->vcpu_info[cpu];
    +               xen_vcpu_setup(cpu);
                    if (xen_have_vector_callback)
                            xen_init_lock_cpu(cpu);
                    break;
    @@ -1414,6 +1414,5 @@ void __init xen_hvm_guest_init(void)
            xen_hvm_smp_init();
            register_cpu_notifier(&xen_hvm_cpu_notifier);
            xen_unplug_emulated_devices();
    -       have_vcpu_info_placement = 0;
            x86_init.irqs.intr_init = xen_init_IRQ;
     }
    


Notes

~0022009

toracat (manager)

We can add the patch (attached) to the centosplus kernel.

~0022063

toracat (manager)

Just a data point: CentOS-7 kernels have the reference patch included.

~0022182

toracat (manager)

path added to the patch (8016-2.patch).

~0022264

toracat (manager)

This patch has been added as of kernel-2.6.32-504.8.1.el6.centos.plus.

~0025146

toracat (manager)

Fixed in distro kernel.

Issue History
Date Modified     Username      Field                       Change
2014-12-19 01:23  msw           New Issue
2014-12-19 09:05  toracat       File Added: 8016.patch
2014-12-19 09:06  toracat       Note Added: 0022009
2014-12-19 09:06  toracat       Status                      new => assigned
2014-12-31 17:01  toracat       Note Added: 0022063
2015-01-19 23:04  toracat       File Added: 8016-2.patch
2015-01-19 23:06  toracat       Note Added: 0022182
2015-01-19 23:10  JohnnyHughes  File Deleted: 8016.patch
2015-01-19 23:10  JohnnyHughes  File Deleted: 8016-2.patch
2015-01-19 23:11  JohnnyHughes  File Added: 8016.patch
2015-01-30 02:08  toracat       Note Added: 0022264
2015-12-21 20:42  toracat       Note Added: 0025146
2015-12-21 20:42  toracat       Status                      assigned => resolved
2015-12-21 20:42  toracat       Resolution                  open => fixed