View Issue Details

ID: 0003582
Project: CentOS-5
Category: kernel-xen
View Status: public
Last Update: 2009-05-03 21:33
Reporter: deepunix
Priority: normal
Severity: minor
Reproducibility: have not tried
Status: new
Resolution: open
Product Version: 5.2
Target Version:
Fixed in Version:
Summary: 0003582: BUG: soft lockup - CPU#0 stuck for 10s! [pluto:23804]
Description: I got the following lockup, after which the domain stopped working properly:

kernel: BUG: soft lockup - CPU#0 stuck for 10s! [pluto:23804]
kernel:
kernel: Pid: 23804, comm: pluto
kernel: EIP: 0061:[<c060fb63>] CPU: 0
kernel: EIP is at _spin_lock+0xa/0xf
kernel: EFLAGS: 00200286 Not tainted (2.6.18-128.1.6.el5.cento
kernel: EAX: d7f4156c EBX: d7f41540 ECX: d7f4156c EDX: 00000100
kernel: ESI: 00000000 EDI: 00000000 EBP: d7f41540 DS: 007b ES: 007
kernel: CR0: 8005003b CR2: b7f13000 CR3: 101b6000 CR4: 00000660
kernel: [<c05ad55c>] release_sock+0x59/0x91
kernel: [<c05ea674>] udp_sendmsg+0x44e/0x514
kernel: [<c05efbb0>] inet_sendmsg+0x35/0x3f
kernel: [<c05aaec0>] sock_sendmsg+0xce/0xe8
kernel: [<c042fe0f>] autoremove_wake_function+0x0/0x2d
kernel: [<c04e7291>] copy_to_user+0x31/0x48
kernel: [<c04e7085>] copy_from_user+0x31/0x5d
kernel: [<c05ac078>] sys_sendto+0x116/0x140
kernel: [<c05ac998>] sys_socketcall+0x106/0x19e
kernel: [<c0405413>] syscall_call+0x7/0xb
kernel: =======================

A couple of days ago I also got several of these messages:
kernel: SKB BUG: Invalid truesize (300) len=148, sizeof(sk_buff)=172
kernel: SKB BUG: Invalid truesize (300) len=148, sizeof(sk_buff)=172
kernel: SKB BUG: Invalid truesize (300) len=148, sizeof(sk_buff)=172
kernel: SKB BUG: Invalid truesize (300) len=148, sizeof(sk_buff)=172
kernel: SKB BUG: Invalid truesize (300) len=148, sizeof(sk_buff)=172
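
For reference, the numbers in that message can be checked by hand. As far as I understand the check in kernels of this era, the message is printed when an skb's truesize is smaller than len plus sizeof(sk_buff). The sketch below only redoes that arithmetic on the logged line; the helper name `truesize_shortfall` is made up for illustration and is not a kernel or library function.

```python
import re

# Matches the message format exactly as it appears in the log above.
SKB_BUG = re.compile(
    r"SKB BUG: Invalid truesize \((\d+)\) len=(\d+), sizeof\(sk_buff\)=(\d+)"
)

def truesize_shortfall(line):
    """Return how many bytes truesize falls short of len + sizeof(sk_buff),
    or None if the line is not an SKB BUG message.

    Assumption: the kernel complains when truesize < len + sizeof(sk_buff).
    """
    m = SKB_BUG.search(line)
    if m is None:
        return None
    truesize, length, header = (int(g) for g in m.groups())
    return (length + header) - truesize

line = "kernel: SKB BUG: Invalid truesize (300) len=148, sizeof(sk_buff)=172"
print(truesize_shortfall(line))  # 148 + 172 = 320, so truesize is 20 bytes short
```

So the accounted size (300) is 20 bytes below the minimum implied by the other two values (320), which is consistent with the condition described above.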

Does anyone know what's going on here?
Additional Information: Linux xx 2.6.18-128.1.6.el5.centos.plusxen #1 SMP Thu Apr 2 14:15:05 EDT 2009 i686 i686 i386 GNU/Linux

# ipsec version
Linux strongSwan U4.2.11/K2.6.18-128.1.6.el5.centos.plusxen
Tags: No tags attached.

Activities

deepunix

2009-04-27 15:39

reporter   ~0009266

It happened again today:

kernel: BUG: soft lockup - CPU#0 stuck for 10s! [pluto:10731]
kernel:
kernel: Pid: 10731, comm: pluto
kernel: EIP: 0061:[<c060fb60>] CPU: 0
kernel: EIP is at _spin_lock+0x7/0xf
kernel: EFLAGS: 00200286 Not tainted (2.6.18-128.1.6.el5.centos.plusxen #1)
kernel: EAX: cd8b716c EBX: cd8b7140 ECX: cd8b716c EDX: 00000100
kernel: ESI: 00000000 EDI: 00000000 EBP: cd8b7140 DS: 007b ES: 007b
kernel: CR0: 8005003b CR2: b7f54000 CR3: 0e933000 CR4: 00000660
kernel: [<c05ad55c>] release_sock+0x59/0x91
kernel: [<c05ea674>] udp_sendmsg+0x44e/0x514
kernel: [<c05efbb0>] inet_sendmsg+0x35/0x3f
kernel: [<c05aaec0>] sock_sendmsg+0xce/0xe8
kernel: [<c042fe0f>] autoremove_wake_function+0x0/0x2d
kernel: [<c04e7085>] copy_from_user+0x31/0x5d
kernel: [<c05ac078>] sys_sendto+0x116/0x140
kernel: [<c05ac998>] sys_socketcall+0x106/0x19e
kernel: [<c0405413>] syscall_call+0x7/0xb
kernel: =======================
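
To compare this trace with the one in the original description, the function names can be pulled out of the call-trace lines and diffed. This is only a rough sketch: the regular expression assumes the `[<address>] symbol+0xoff/0xlen` frame format seen in these logs, and the `frames` helper is a made-up name, not part of any tool.

```python
import re

# One call-trace frame: "[<c05ad55c>] release_sock+0x59/0x91"
FRAME = re.compile(r"\[<[0-9a-f]+>\]\s+(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")

def frames(log_text):
    """Return the function names of all call-trace frames, in log order."""
    return [m.group(1) for m in FRAME.finditer(log_text)]

first = """
kernel: [<c05ad55c>] release_sock+0x59/0x91
kernel: [<c05ea674>] udp_sendmsg+0x44e/0x514
kernel: [<c05efbb0>] inet_sendmsg+0x35/0x3f
kernel: [<c05aaec0>] sock_sendmsg+0xce/0xe8
kernel: [<c042fe0f>] autoremove_wake_function+0x0/0x2d
kernel: [<c04e7291>] copy_to_user+0x31/0x48
kernel: [<c04e7085>] copy_from_user+0x31/0x5d
kernel: [<c05ac078>] sys_sendto+0x116/0x140
kernel: [<c05ac998>] sys_socketcall+0x106/0x19e
kernel: [<c0405413>] syscall_call+0x7/0xb
"""

second = """
kernel: [<c05ad55c>] release_sock+0x59/0x91
kernel: [<c05ea674>] udp_sendmsg+0x44e/0x514
kernel: [<c05efbb0>] inet_sendmsg+0x35/0x3f
kernel: [<c05aaec0>] sock_sendmsg+0xce/0xe8
kernel: [<c042fe0f>] autoremove_wake_function+0x0/0x2d
kernel: [<c04e7085>] copy_from_user+0x31/0x5d
kernel: [<c05ac078>] sys_sendto+0x116/0x140
kernel: [<c05ac998>] sys_socketcall+0x106/0x19e
kernel: [<c0405413>] syscall_call+0x7/0xb
"""

only_in_first = [f for f in frames(first) if f not in frames(second)]
print(only_in_first)  # ['copy_to_user']
```

Apart from the extra copy_to_user entry in the first trace, both lockups go through the same sys_sendto -> udp_sendmsg -> release_sock path and spin in _spin_lock, so they look like the same problem.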
deepunix

2009-04-29 08:43

reporter   ~0009269

Okay, I found the same bug reported here: https://bugzilla.redhat.com/show_bug.cgi?id=484590
totalteam

2009-05-03 21:33

reporter   ~0009299

Concur. We get the same with kernel 2.6.18-128.1.6.el5.centos.plus.

The only fix so far is to downgrade to an earlier kernel.

This affects hosts that run openswan, and it shows up sooner on hosts that carry more traffic over their IPsec tunnels.

Issue History

Date Modified     Username   Field       Change
2009-04-23 17:01  deepunix   New Issue
2009-04-27 15:39  deepunix   Note Added  0009266
2009-04-29 08:43  deepunix   Note Added  0009269
2009-05-03 21:33  totalteam  Note Added  0009299