View Issue Details

ID: 0009646
Project: CentOS-7
Category: -OTHER
View Status: public
Last Update: 2016-11-21 18:10
Reporter: dfinley66
Priority: normal
Severity: crash
Reproducibility: always
Status: resolved
Resolution: fixed
Platform: HP
OS: CentOS Linux
OS Version: 7.1.1503
Product Version: 7.1-1503
Target Version:
Fixed in Version:
Summary: 0009646: bugzilla bug #95211 exists in CentOS 7.1.1503 - crash when using VTI and IPSEC
Description: The xfrm_input routine in xfrm_input.c calls xfrm_tunnel_check, which dereferences the "outer_mode" pointer in the IPsec SA. When an SA is set up by an ALLOCSPI call followed by a later NEWSA call, there is a window during which the SA exists but its outer_mode pointer has not yet been set. If a packet arrives from the IPsec client in this window, xfrm_tunnel_check dereferences the NULL outer_mode pointer and the kernel crashes. This bug is documented in the kernel.org Bugzilla as #95211. The fix is to move the call to xfrm_tunnel_check down, just past the SA state checks (see the Bugzilla 95211 documentation).
Steps To Reproduce: We run an IPsec driver that begins sending ping tests before the SA is fully negotiated. We can reproduce the crash easily.
Additional Information: We are running kernel linux-3.10.0-229.11.1.el7. I downloaded the source for this 229.11.1 kernel as well as for the next kernel, 229.14.1, and both have the bug.
Tags: No tags attached.
abrt_hash
URL
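
A minimal sketch of the window described in the Description above, written as a user-space C model rather than kernel code. The names sa, mode, tunnel_check and input_fixed are hypothetical stand-ins for xfrm_state, outer_mode, xfrm_tunnel_check and xfrm_input; the check logic is simplified, and only the ordering of the two checks is meant to be accurate.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
enum sa_state { SA_LARVAL, SA_VALID };      /* models x->km.state */

struct mode { int is_tunnel; };             /* models the outer_mode structure */

struct sa {
    enum sa_state state;
    const struct mode *outer_mode;          /* NULL while the SA is still larval */
};

enum verdict { ACCEPT, DROP_INVALID_STATE, DROP_MODE_ERROR };

/* Models xfrm_tunnel_check(): for a tunnelled packet it dereferences
 * outer_mode without checking it for NULL. */
static int tunnel_check(const struct sa *x, int packet_is_tunnel)
{
    return packet_is_tunnel && !x->outer_mode->is_tunnel;
}

/* Fixed ordering: validate the SA state first, then run the mode check.
 * A larval SA is rejected before outer_mode is ever touched. */
enum verdict input_fixed(const struct sa *x, int packet_is_tunnel)
{
    assert(x != NULL);
    if (x->state != SA_VALID)
        return DROP_INVALID_STATE;
    if (tunnel_check(x, packet_is_tunnel))
        return DROP_MODE_ERROR;
    return ACCEPT;
}
```

With the buggy ordering, tunnel_check would run before the state test, so a tunnelled packet that hits a larval SA would dereference the NULL pointer.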

Activities

dfinley66

2015-10-23 15:27

reporter  

xfrm bug 95211.rtf (6,463 bytes)
toracat

2015-10-23 16:04

manager   ~0024686

Last edited: 2015-10-23 16:10

Could you report this bug at http://bugzilla.redhat.com so that it gets fixed in the RHEL kernel? The CentOS kernel will then inherit the fix.

In the meantime, we can try including the patch referenced in the bugzilla in the centosplus kernel:

https://bugzilla.kernel.org/show_bug.cgi?id=95211

    Commit 70be6c91c86596ad2b60c73587880b47df170a41
    ("xfrm: Add xfrm_tunnel_skb_cb to the skb common buffer") added check
    which dereferences ->outer_mode too early but larval SAs don't have
    this pointer set (yet). So check for tunnel stuff later.

Also, can you try testing kernel-ml from ELRepo (http://elrepo.org/tiki/kernel-ml)? This kernel is supposed to have that patch, so it would make a good test to confirm the issue is fixed.

toracat

2015-10-23 16:37

manager  

bug9646.patch (2,746 bytes)
CentOS patch bug #9646

commit 68c11e98ef6748ddb63865799b12fc45abb3755d                                            
Author: Alexey Dobriyan <adobriyan@gmail.com>                                              
Date:   Thu Apr 2 10:58:24 2015 +0300                                                      

    xfrm: fix xfrm_input/xfrm_tunnel_check oops
                                               
    https://bugzilla.kernel.org/show_bug.cgi?id=95211
                                                     
    Commit 70be6c91c86596ad2b60c73587880b47df170a41  
    ("xfrm: Add xfrm_tunnel_skb_cb to the skb common buffer") added check
    which dereferences ->outer_mode too early but larval SAs don't have  
    this pointer set (yet). So check for tunnel stuff later.             
                                                                         
    Mike Noordermeer reported this bug and patiently applied all the debugging.
                                                                               
    Technically this is remote-oops-in-interrupt-context type of thing.        
                                                                               
    BUG: unable to handle kernel NULL pointer dereference at 0000000000000034  
    IP: [<ffffffff8150dca2>] xfrm_input+0x3c2/0x5a0                            
        ...                                                                    
    [<ffffffff81500fc6>] ? xfrm4_esp_rcv+0x36/0x70                             
    [<ffffffff814acc9a>] ? ip_local_deliver_finish+0x9a/0x200                  
    [<ffffffff81471b83>] ? __netif_receive_skb_core+0x6f3/0x8f0                
        ...

    RIP  [<ffffffff8150dca2>] xfrm_input+0x3c2/0x5a0
    Kernel panic - not syncing: Fatal exception in interrupt

    Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com>
    Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>

    Applied-by: Akemi Yagi <toracat@centos.org>

--- a/net/xfrm/xfrm_input.c	2015-08-25 07:51:22.000000000 -0700
+++ b/net/xfrm/xfrm_input.c	2015-10-23 09:23:00.261464354 -0700
@@ -163,11 +163,6 @@ int xfrm_input(struct sk_buff *skb, int
 
 		skb->sp->xvec[skb->sp->len++] = x;
 
-		if (xfrm_tunnel_check(skb, x, family)) {
-			XFRM_INC_STATS(net, LINUX_MIB_XFRMINSTATEMODEERROR);
-			goto drop;
-		}
-
 		spin_lock(&x->lock);
 		if (unlikely(x->km.state != XFRM_STATE_VALID)) {
 			XFRM_INC_STATS(net, LINUX_MIB_XFRMINSTATEINVALID);
@@ -191,6 +186,11 @@ int xfrm_input(struct sk_buff *skb, int
 
 		spin_unlock(&x->lock);
 
+		if (xfrm_tunnel_check(skb, x, family)) {
+			XFRM_INC_STATS(net, LINUX_MIB_XFRMINSTATEMODEERROR);
+			goto drop;
+		}
+
 		seq_hi = htonl(xfrm_replay_seqhi(x, seq));
 
 		XFRM_SKB_CB(skb)->seq.input.low = seq;
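
The oops quoted in the commit message reports a fault address of 0x34 rather than 0. That is what a member access through a NULL pointer looks like: the faulting address is the member's offset within the structure. The sketch below illustrates this with a hypothetical layout. mode_layout is not the kernel's struct definition, and the field sizes are assumptions chosen so that flags lands at offset 0x34.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, NOT the kernel's struct: six 8-byte slots
 * followed by two 4-byte fields. */
struct mode_layout {
    uint64_t slots[6];   /* stand-in for pointer-sized members, 48 bytes */
    uint32_t encap;      /* offset 48 (0x30) */
    uint32_t flags;      /* offset 52 (0x34) */
};

/* Compile-time check that the assumed layout puts flags at 0x34. */
static_assert(offsetof(struct mode_layout, flags) == 0x34,
              "flags is expected at offset 0x34 in this sketch");

/* Address that would be read for base->flags. With base == 0 (a NULL
 * pointer) the result is simply the member offset. */
uintptr_t fault_address(uintptr_t base)
{
    return base + offsetof(struct mode_layout, flags);
}
```

A small non-zero fault address in an oops is therefore a common sign of a NULL structure pointer, not of a NULL pointer to the data itself.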
toracat

2015-10-23 16:38

manager   ~0024688

Patch for the centos-plus kernel uploaded.
dfinley66

2015-10-26 16:48

reporter   ~0024700

Sorry, but I don't have a Red Hat subscription, so I can't log in to bugzilla.redhat.com to report the bug. We also cannot use the ml kernel, as we will be running some third-party software that has not been certified with the latest 4.x kernels. We would like to stay on the 3.10 kernel release that came with our CentOS 7 distribution, if possible. I'm going to pull down the 3.10.0-229 kernel source from the vault.centos.org site and try to apply the patch myself, as this problem is holding up testing of our current offering. If/when there is an official 3.10 kernel with the fix, I'll pull it down and upgrade to it. Thx.
toracat

2015-10-26 16:53

manager   ~0024701

Please don't worry about kernel-ml. It was for testing purposes only.

Regarding filing a bug report at RH bugzilla, you do not need a subscription. Just set up an account there and you should be able to submit a bug.
dfinley66

2015-10-26 19:32

reporter   ~0024702

A bug has been opened on the RH Bugzilla; the bug number is 1275397. Thx.
toracat

2015-10-26 19:44

manager   ~0024703

Kernel-related bugs become private automatically. So, if/when any progress is made in that bugzilla, please update the status here.
toracat

2015-10-28 16:56

manager   ~0024721

I've built the current plus kernel with the patch applied (kernel-plus-3.10.0-229.14.1.2.el7.centos.plus). You can find the packages here:

http://people.centos.org/toracat/kernel/7/plus/bug9646/

Please note that they are not signed and are provided for testing purposes only.
dfinley66

2015-10-28 21:29

reporter   ~0024724

OK, thx for the quick turnaround. My only question: we are currently running the slightly older 229.11.1 kernel release. Should I download and install all of the packages that you posted, and if so, is there any particular install order?
toracat

2015-10-28 21:46

manager   ~0024725

The minimum installation is to just update the "kernel" package, in this case, kernel-plus-3.10.0-229.14.1.2.el7.centos.plus.x86_64. Download this file and install it by running:

yum localinstall kernel-plus-3.10.0-229.14.1.2.el7.centos.plus.x86_64.rpm

You can install other packages (such as -devel) by using the same yum command.

Then reboot to the newly installed kernel.
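
After the reboot it is worth confirming that the machine actually came up on the plus kernel and not on the previously installed one. A minimal sketch in C using uname(2); the helper names are hypothetical, and it assumes the running release string of a plus build contains ".centos.plus", as the package name above suggests.

```c
#include <assert.h>
#include <string.h>
#include <sys/utsname.h>

/* Returns 1 if a kernel release string looks like a centosplus build.
 * Assumption: plus builds carry ".centos.plus" in their release string. */
int is_plus_kernel(const char *release)
{
    assert(release != NULL);
    return strstr(release, ".centos.plus") != NULL;
}

/* Returns 1 if the running kernel is a plus build, 0 if it is not,
 * and -1 if uname() fails. */
int running_plus_kernel(void)
{
    struct utsname u;

    if (uname(&u) != 0)
        return -1;
    return is_plus_kernel(u.release);
}
```

If running_plus_kernel() returns 0 after the reboot, the boot loader most likely still defaults to the older kernel.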
dfinley66

2015-10-29 20:58

reporter   ~0024736

Success! We installed the kernel-plus 3.10.0-229.14.1.2 kernel and the problem no longer occurs. We will continue testing with this kernel. I checked the bugzilla.redhat.com site; there is no update on my ticket yet. I'll continue to monitor that site and will pass along any status updates to you. Again, thx for your help on this, much appreciated.
toracat

2015-10-29 21:05

manager   ~0024737

Great news. Yes, please update with the upstream progress. In the meantime, you can continue to use the plus kernel. Just remember it is a testing release. The next version of the official kernel-plus should have the patch.
toracat

2015-10-29 21:10

manager   ~0024738

And please be sure to report your positive test result in your RH bugzilla.
dfinley66

2016-02-04 18:48

reporter   ~0025613

Hello. I just received an update on the bug I opened on bugzilla.redhat.com indicating that bug #95211 has been fixed in kernel version kernel-3.10.0-345.el7. I'm not familiar with how your process works at this point: do you eventually get this new kernel, test it, and then post it as an update on the CentOS 7 update site? Thx.
toracat

2016-02-04 18:56

manager   ~0025614

Sounds like the patched kernel will be in RHEL 7.3 (that is, CentOS 7.3). There is a possibility that the fix appears in 7.2. Please continue to monitor the BZ.
toracat

2016-11-19 18:08

manager   ~0027938

The patch from this bug report is indeed in the RHEL 7.3 kernel (3.10.0-514.el7), so it will be removed from the plus kernel. This kernel version will be published in the CentOS 7.2 CR repo soon.
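
For anyone who needs to check a fleet of machines, the build numbers mentioned in this thread can be compared mechanically. A rough sketch, assuming the fix first appeared in build 345 of the 3.10.0 series (as reported from the RH bugzilla) and stays in all later builds; release_has_fix is a hypothetical helper. It does not recognise the patched plus kernel (229.14.1.2.el7.centos.plus), which carries the fix despite its lower build number.

```c
#include <assert.h>
#include <stdio.h>

/* First RHEL 7 build reported in this thread as carrying the fix. */
#define FIXED_BUILD 345

/* Returns 1 if the release is a 3.10.0 build at or above FIXED_BUILD,
 * 0 if it is an older 3.10.0 build, and -1 if the string is not of the
 * form 3.10.0-N... and so cannot be judged by this rule. */
int release_has_fix(const char *release)
{
    int major, minor, patch, build;

    assert(release != NULL);
    if (sscanf(release, "%d.%d.%d-%d", &major, &minor, &patch, &build) != 4)
        return -1;
    if (major != 3 || minor != 10 || patch != 0)
        return -1;
    return build >= FIXED_BUILD;
}
```

The release string to pass in is the one reported by uname -r on the machine being checked.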
toracat

2016-11-21 18:10

manager   ~0027958

Closing as 'resolved' now that the patch is in the distro kernel (7.3.1611).

Issue History

Date Modified Username Field Change
2015-10-23 15:27 dfinley66 New Issue
2015-10-23 15:27 dfinley66 File Added: xfrm bug 95211.rtf
2015-10-23 16:04 toracat Note Added: 0024686
2015-10-23 16:05 toracat Status new => assigned
2015-10-23 16:10 toracat Note Edited: 0024686
2015-10-23 16:37 toracat File Added: bug9646.patch
2015-10-23 16:38 toracat Note Added: 0024688
2015-10-26 16:48 dfinley66 Note Added: 0024700
2015-10-26 16:53 toracat Note Added: 0024701
2015-10-26 19:32 dfinley66 Note Added: 0024702
2015-10-26 19:44 toracat Note Added: 0024703
2015-10-28 16:56 toracat Note Added: 0024721
2015-10-28 21:29 dfinley66 Note Added: 0024724
2015-10-28 21:46 toracat Note Added: 0024725
2015-10-29 20:58 dfinley66 Note Added: 0024736
2015-10-29 21:05 toracat Note Added: 0024737
2015-10-29 21:10 toracat Note Added: 0024738
2016-02-04 18:48 dfinley66 Note Added: 0025613
2016-02-04 18:56 toracat Note Added: 0025614
2016-11-19 18:08 toracat Note Added: 0027938
2016-11-21 18:10 toracat Status assigned => resolved
2016-11-21 18:10 toracat Resolution open => fixed
2016-11-21 18:10 toracat Note Added: 0027958