View Issue Details

IDProjectCategoryView StatusLast Update
0013265CentOS-7kernelpublic2018-01-30 02:33
Reporterhallyn 
PrioritynormalSeveritymajorReproducibilityalways
Status resolvedResolutionfixed 
PlatformAllOSOS Version
Product Version 
Target VersionFixed in Version 
Summary0013265: PTRACE_GETEVENTMSG broken for pids in pidnamespaces
DescriptionIf you fork a child, which unshares CLONE_NEWPID, ptrace-attach to it, set PTRACE_O_TRACEFORK, then have the child fork, the ptrace eventmsg you receive will be the pid in the child's namespace, not yours. This is fixed by commit 4e52365f279564cef0ddd41db5237f0471381093 from 2014 which applies cleanly to the 3.10 kernel.

It was suggested on irc that i might ask for toracat to cherrypick the patch into the centosplus kernel,
Steps To ReproduceI can provide a test program, but the above desribes how to do it.
TagsNo tags attached.
abrt_hash
URL

Activities

toracat

toracat

2017-05-17 21:07

manager   ~0029292

I will try applying the patch in the next plus kernel update.
toracat

toracat

2017-05-30 16:55

manager   ~0029354

Turns out, unfortunately, applying the patch causes a kABI breakage. So we cannot fulfill the request.
hallyn

hallyn

2017-05-30 17:03

reporter   ~0029356

Sorry, can you elaborate? No function definition is changed, so I wouldn't think this would qualify as a kABI break. I'm probably overlooking something, but just want to make sure...
toracat

toracat

2017-05-30 17:56

manager  

kabibreakageerror.txt (1,667 bytes)
**** GENERATING kernel ABI metadata ****
**** kABI checking is enabled in kernel SPEC file. ****
*** ERROR - ABI BREAKAGE WAS DETECTED ***

The following symbols have been changed (this will cause an ABI breakage):

dev_get_stats
consume_skb
skb_tstamp_tx
skb_queue_purge
kthread_bind
dev_get_by_name
skb_copy_expand
__dev_get_by_name
alloc_netdev_mqs
kill_block_super
netdev_rx_handler_unregister
__netdev_alloc_skb
register_netdevice
xt_unregister_targets
free_netdev
napi_get_frags
netif_napi_add
skb_pad
kthread_stop
napi_gro_frags
netif_device_attach
skb_push
__napi_complete
dev_kfree_skb_irq
skb_dequeue
netlink_broadcast
skb_copy
kfree_skb
dev_set_mac_address
dev_open
napi_gro_receive
kthread_create_on_node
skb_copy_bits
netlink_unicast
debugfs_create_file
skb_queue_head
skb_unlink
netdev_master_upper_dev_get
netif_rx
skb_trim
dev_get_by_index
debugfs_remove
current_task
skb_checksum
netif_device_detach
lock_sock_nested
pskb_expand_head
netdev_rx_handler_register
__skb_gso_segment
__nlmsg_put
unregister_netdev
netif_napi_del
sock_alloc_send_skb
__alloc_skb
unregister_netdevice_queue
__pskb_pull_tail
dev_set_allmulti
___pskb_trim
xt_register_targets
skb_realloc_headroom
netif_set_real_num_tx_queues
netdev_change_features
sk_alloc
dev_set_mtu
release_sock
napi_complete
register_netdev
__napi_schedule
dev_kfree_skb_any
netdev_master_upper_dev_link
debugfs_create_dir
skb_queue_tail
__netif_schedule
skb_partial_csum_set
skb_checksum_help
netdev_update_features
dev_close
netif_receive_skb
skb_clone
netif_rx_ni
sk_free
ip6_route_output
dev_set_promiscuity
__dev_get_by_index
skb_put
skb_pull
dev_queue_xmit
netdev_features_change
skb_dequeue_tail

kabibreakageerror.txt (1,667 bytes)
toracat

toracat

2017-05-30 17:57

manager   ~0029357

I have uploaded a file that shows the actual error. Perhaps you can figure out what's happening.
hallyn

hallyn

2017-06-04 18:36

reporter   ~0029397

Hi,

Sadly I have no idea what is happening. However if I replace the patch with one which only adds the #include <linux/pid_namespace.h> to include/linux/ptrace.h, I get the exact same abi breakage.

Does that ring any bells? I can't fathom why including pid_namespace.h should change the signatures of those functions.

(If I take an ubuntu kernel and *revert* that patch, the ubuntu kabi check does not trigger.)
kabe

kabe

2017-06-09 08:10

reporter   ~0029422

I tried to see what changes when just adding
#include <linux/pid_namespace.h>
in include/linux/ptrace.h does to kABI.

Patch scripts/Makefile.build to invoke "genksyms -d -D" :
======
diff -up ./scripts/Makefile.build.dist ./scripts/Makefile.build
--- ./scripts/Makefile.build.dist 2017-04-22 15:17:16.000000000 +0900
+++ ./scripts/Makefile.build 2017-06-09 16:13:28.625041035 +0900
@@ -211,7 +211,7 @@ $(obj)/%.i: $(src)/%.c FORCE

 cmd_gensymtypes = \
     $(CPP) -D__GENKSYMS__ $(c_flags) $< | \
- $(GENKSYMS) $(if $(1), -T $(2)) \
+ $(GENKSYMS) -d -D $(if $(1), -T $(2)) \
      $(patsubst y,-s _,$(CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX)) \
      $(if $(KBUILD_PRESERVE),-p) \
      -r $(firstword $(wildcard $(2:.symtypes=.symref) /dev/null))
======

then the build will spit out Export definitions like

>> Defn for type0 unregister_netdevice_queue == <void unregister_netdevice_queue ( s#net_device * , s#list_head * ) >
>> Export unregister_netdevice_queue == <void unregister_netdevice_queue ( struct net_device { char name [ 16 ] ;
>> ...

Is it correct that genksyms do a CRC on this output to calculate kABI signature?

Assuming yes, for instance, unregister_netdevice_queue() definition changes like
=====
< struct pid_namespace { UNKNOWN } * ns ;
---
> struct pid_namespace { struct kref kref ;
> struct pidmap { atomic_t nr_free ;
...
> } * ns ;
=====
that is, the added #include makes opaque struct now visible.
Actual ABI didn't change, but kABI signature will be considered different.
kabe

kabe

2017-06-12 03:03

reporter  

patch-13265-PTRACE_GETEVENTMSG.patch (4,181 bytes)
From: "T.kabe" <kabe@>
Subject: ptrace: fix fork event messages across pid namespaces

[upstream commit 4e52365f279564cef0ddd41db5237f0471381093]
Author: Matthew Dempsky <mdempsky@chromium.org>
Date:   Fri Jun 6 14:36:42 2014 -0700

    ptrace: fix fork event messages across pid namespaces
    
    When tracing a process in another pid namespace, it's important for fork
    event messages to contain the child's pid as seen from the tracer's pid
    namespace, not the parent's.  Otherwise, the tracer won't be able to
    correlate the fork event with later SIGTRAP signals it receives from the
    child.
    
    We still risk a race condition if a ptracer from a different pid
    namespace attaches after we compute the pid_t value.  However, sending a
    bogus fork event message in this unlikely scenario is still a vast
    improvement over the status quo where we always send bogus fork event
    messages to debuggers in a different pid namespace than the forking
    process.
    
    Signed-off-by: Matthew Dempsky <mdempsky@chromium.org>
    Acked-by: Oleg Nesterov <oleg@redhat.com>
    Cc: Kees Cook <keescook@chromium.org>
    Cc: Julien Tinnes <jln@chromium.org>
    Cc: Roland McGrath <mcgrathr@chromium.org>
    Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>

diff -up ./include/linux/ptrace.h.dist ./include/linux/ptrace.h
--- ./include/linux/ptrace.h.dist	2017-04-22 15:17:16.000000000 +0900
+++ ./include/linux/ptrace.h	2017-06-12 09:33:22.753256752 +0900
@@ -129,6 +129,45 @@ static inline void ptrace_event(int even
 }
 
 /**
+ * ptrace_event_pid - possibly stop for a ptrace event notification
+ * @event:	%PTRACE_EVENT_* value to report
+ * @pid:	process identifier for %PTRACE_GETEVENTMSG to return
+ *
+ * Check whether @event is enabled and, if so, report @event and @pid
+ * to the ptrace parent.  @pid is reported as the pid_t seen from the
+ * the ptrace parent's pid namespace.
+ *
+ * Called without locks.
+ */
+static inline void ptrace_event_pid(int event, struct pid *pid)
+{
+	/*#include <linux/pid_namespace.h>	 * For task_active_pid_ns.  */
+	/*
+	 * DANGEROUS kABI USAGE WARNING:
+	 * Including whole <linux/pid_namespace.h> breaks kABI making 
+	 * opaque pointer visible, so revert to just declaring the 
+	 * required function here.
+	 */ 
+	extern struct pid_namespace *task_active_pid_ns(struct task_struct *tsk);
+	/*
+	 * FIXME: There's a potential race if a ptracer in a different pid
+	 * namespace than parent attaches between computing message below and
+	 * when we acquire tasklist_lock in ptrace_stop().  If this happens,
+	 * the ptracer will get a bogus pid from PTRACE_GETEVENTMSG.
+	 */
+	unsigned long message = 0;
+	struct pid_namespace *ns;
+
+	rcu_read_lock();
+	ns = task_active_pid_ns(rcu_dereference(current->parent));
+	if (ns)
+		message = pid_nr_ns(pid, ns);
+	rcu_read_unlock();
+
+	ptrace_event(event, message);
+}
+
+/**
  * ptrace_init_task - initialize ptrace state for a new child
  * @child:		new child task
  * @ptrace:		true if child should be ptrace'd by parent's tracer
diff -up ./kernel/fork.c.dist ./kernel/fork.c
--- ./kernel/fork.c.dist	2017-04-22 15:17:16.000000000 +0900
+++ ./kernel/fork.c	2017-06-12 09:44:02.072915995 +0900
@@ -1718,10 +1718,12 @@ long do_fork(unsigned long clone_flags,
 	 */
 	if (!IS_ERR(p)) {
 		struct completion vfork;
+		struct pid *pid;
 
 		trace_sched_process_fork(current, p);
 
-		nr = task_pid_vnr(p);
+		pid = get_task_pid(p, PIDTYPE_PID);
+		nr = pid_vnr(pid);
 
 		if (clone_flags & CLONE_PARENT_SETTID)
 			put_user(nr, parent_tidptr);
@@ -1736,12 +1738,14 @@ long do_fork(unsigned long clone_flags,
 
 		/* forking complete and child started to run, tell ptracer */
 		if (unlikely(trace))
-			ptrace_event(trace, nr);
+			ptrace_event_pid(trace, pid);
 
 		if (clone_flags & CLONE_VFORK) {
 			if (!wait_for_vfork_done(p, &vfork))
-				ptrace_event(PTRACE_EVENT_VFORK_DONE, nr);
+				ptrace_event_pid(PTRACE_EVENT_VFORK_DONE, pid);
 		}
+
+		put_pid(pid);
 	} else {
 		nr = PTR_ERR(p);
 	}
kabe

kabe

2017-06-12 03:10

reporter   ~0029441

I made up a preliminary patch to apply the upstream commit
without breaking the kABI:
https://bugs.centos.org/file_download.php?file_id=20761&type=bug

This patch declares "extern struct pid_namespace *task_active_pid_ns()"
outside <linux/pid_namespace.h>.
It's very unlikely that this function's API would change, but nontheless
this is a dangerous game.

@hallyn, can you provide a test program? It's not trivial for me to test it out.
hallyn

hallyn

2017-06-12 17:25

reporter   ~0029452

A testcase is attached.

test_event_pidns.c (2,678 bytes)
kabe

kabe

2017-06-13 04:47

reporter   ~0029454

Thanks for the test program.
I confirmed that with stock kernel-3.10.0-514.21.1, the test fails, and
kernel with patch-13265-PTRACE_GETEVENTMSG.patch
https://bugs.centos.org/file_download.php?file_id=20761&type=bug
fixes the problem:

>> # ./a.out
>> 2 9078
>> PASS - returned event included pid in correct namespace


It's very unlikely that declaration of "task_active_pid_ns()" will ever change in
RHEL 7's lifetime, so I consider the patch relatively safe to apply.
@toracat, what would you say?

In unlikely event that upstream diverged the declaration,
compile will correctly detect conflict and compile will fail early as below.
Falls on safe side.

>> from init/main.c:18:
>> include/linux/pid_namespace.h:96:30: error: conflicting types for 'task_active_pid_ns'include/linux/pid_namespace.h:96:30: error: conflicting types for 'task_active_pid_ns'

(you have to semantic-patch (spatch) if the API changes; unlikely in RHEL)
toracat

toracat

2018-01-24 23:20

manager   ~0031020

Just noticed this bug had not been taken care of. My big apologies.

If I'm not mistaken, the patch is now in the current (el7.4) kernel, therefore the issue has been resolved. Could @kabe or @hallyn confirm this please?
kabe

kabe

2018-01-25 02:01

reporter   ~0031026

Nope, the standard CentOS 7.4 kernel-3.10.0-693.11.6 and kernel-3.10.0-693.11.1 still has this problem.
(Does RHEL patch things causing kABI change?)
toracat

toracat

2018-01-25 05:16

manager   ~0031027

What I find is that commit 4e52365f279564cef0ddd41db5237f0471381093 is in the current kernel (3.10.0-693.11.6.el7). It was most likely added when 7.3 was updated to 7.4. Therefore kABI breakage was not a concern.

At any rate, does this imply that the patch does not fix the problem reported?
hallyn

hallyn

2018-01-25 05:30

reporter   ~0031030

Hm, seems fixed here:

Linux centos-s-1vcpu-1gb-nyc1-01 3.10.0-693.11.6.el7.x86_64 #1 SMP Thu Jan 4 01:06:37 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
[root@centos-s-1vcpu-1gb-nyc1-01 ~]# ./test
2 1254
PASS - returned ptrace event included pid in correct namespace
[root@centos-s-1vcpu-1gb-nyc1-01 ~]# ./test
2 1258
PASS - returned ptrace event included pid in correct namespace
toracat

toracat

2018-01-25 05:46

manager   ~0031034

@hallyn

Thanks for the testing. Your result indicates that the patch actually fixed the problem.
toracat

toracat

2018-01-30 02:33

manager   ~0031099

Closing as 'resolved' based on the result by the submitter.

Issue History

Date Modified Username Field Change
2017-05-17 05:16 hallyn New Issue
2017-05-17 21:02 toracat Status new => assigned
2017-05-17 21:07 toracat Note Added: 0029292
2017-05-30 16:55 toracat Note Added: 0029354
2017-05-30 17:03 hallyn Note Added: 0029356
2017-05-30 17:56 toracat File Added: kabibreakageerror.txt
2017-05-30 17:57 toracat File Added: kabibreakageerror-2.txt
2017-05-30 17:57 toracat Note Added: 0029357
2017-05-30 18:00 toracat File Deleted: kabibreakageerror-2.txt
2017-06-04 18:36 hallyn Note Added: 0029397
2017-06-09 08:10 kabe Note Added: 0029422
2017-06-12 03:03 kabe File Added: patch-13265-PTRACE_GETEVENTMSG.patch
2017-06-12 03:10 kabe Note Added: 0029441
2017-06-12 17:25 hallyn File Added: test_event_pidns.c
2017-06-12 17:25 hallyn Note Added: 0029452
2017-06-13 04:47 kabe Note Added: 0029454
2018-01-24 23:20 toracat Note Added: 0031020
2018-01-24 23:28 toracat Status assigned => feedback
2018-01-25 02:01 kabe Note Added: 0031026
2018-01-25 05:16 toracat Note Added: 0031027
2018-01-25 05:30 hallyn Note Added: 0031030
2018-01-25 05:30 hallyn Status feedback => assigned
2018-01-25 05:46 toracat Note Added: 0031034
2018-01-30 02:33 toracat Status assigned => resolved
2018-01-30 02:33 toracat Resolution open => fixed
2018-01-30 02:33 toracat Note Added: 0031099