2017-12-14 02:26 UTC

View Issue Details Jump to Notes ]
IDProjectCategoryView StatusLast Update
0013265CentOS-7kernelpublic2017-06-13 04:47
Reporterhallyn 
PrioritynormalSeveritymajorReproducibilityalways
StatusassignedResolutionopen 
PlatformAllOSOS Version
Product Version 
Target VersionFixed in Version 
Summary0013265: PTRACE_GETEVENTMSG broken for pids in pidnamespaces
DescriptionIf you fork a child, which unshares CLONE_NEWPID, ptrace-attach to it, set PTRACE_O_TRACEFORK, then have the child fork, the ptrace eventmsg you receive will be the pid in the child's namespace, not yours. This is fixed by commit 4e52365f279564cef0ddd41db5237f0471381093 from 2014 which applies cleanly to the 3.10 kernel.

It was suggested on irc that i might ask for toracat to cherrypick the patch into the centosplus kernel,
Steps To ReproduceI can provide a test program, but the above desribes how to do it.
TagsNo tags attached.
abrt_hash
URL
Attached Files
  • txt file icon kabibreakageerror.txt (1,667 bytes) 2017-05-30 17:56 -
    **** GENERATING kernel ABI metadata ****
    **** kABI checking is enabled in kernel SPEC file. ****
    *** ERROR - ABI BREAKAGE WAS DETECTED ***
    
    The following symbols have been changed (this will cause an ABI breakage):
    
    dev_get_stats
    consume_skb
    skb_tstamp_tx
    skb_queue_purge
    kthread_bind
    dev_get_by_name
    skb_copy_expand
    __dev_get_by_name
    alloc_netdev_mqs
    kill_block_super
    netdev_rx_handler_unregister
    __netdev_alloc_skb
    register_netdevice
    xt_unregister_targets
    free_netdev
    napi_get_frags
    netif_napi_add
    skb_pad
    kthread_stop
    napi_gro_frags
    netif_device_attach
    skb_push
    __napi_complete
    dev_kfree_skb_irq
    skb_dequeue
    netlink_broadcast
    skb_copy
    kfree_skb
    dev_set_mac_address
    dev_open
    napi_gro_receive
    kthread_create_on_node
    skb_copy_bits
    netlink_unicast
    debugfs_create_file
    skb_queue_head
    skb_unlink
    netdev_master_upper_dev_get
    netif_rx
    skb_trim
    dev_get_by_index
    debugfs_remove
    current_task
    skb_checksum
    netif_device_detach
    lock_sock_nested
    pskb_expand_head
    netdev_rx_handler_register
    __skb_gso_segment
    __nlmsg_put
    unregister_netdev
    netif_napi_del
    sock_alloc_send_skb
    __alloc_skb
    unregister_netdevice_queue
    __pskb_pull_tail
    dev_set_allmulti
    ___pskb_trim
    xt_register_targets
    skb_realloc_headroom
    netif_set_real_num_tx_queues
    netdev_change_features
    sk_alloc
    dev_set_mtu
    release_sock
    napi_complete
    register_netdev
    __napi_schedule
    dev_kfree_skb_any
    netdev_master_upper_dev_link
    debugfs_create_dir
    skb_queue_tail
    __netif_schedule
    skb_partial_csum_set
    skb_checksum_help
    netdev_update_features
    dev_close
    netif_receive_skb
    skb_clone
    netif_rx_ni
    sk_free
    ip6_route_output
    dev_set_promiscuity
    __dev_get_by_index
    skb_put
    skb_pull
    dev_queue_xmit
    netdev_features_change
    skb_dequeue_tail
    
    
    txt file icon kabibreakageerror.txt (1,667 bytes) 2017-05-30 17:56 +
  • patch file icon patch-13265-PTRACE_GETEVENTMSG.patch (4,181 bytes) 2017-06-12 03:03 -
    From: "T.kabe" <kabe@>
    Subject: ptrace: fix fork event messages across pid namespaces
    
    [upstream commit 4e52365f279564cef0ddd41db5237f0471381093]
    Author: Matthew Dempsky <mdempsky@chromium.org>
    Date:   Fri Jun 6 14:36:42 2014 -0700
    
        ptrace: fix fork event messages across pid namespaces
        
        When tracing a process in another pid namespace, it's important for fork
        event messages to contain the child's pid as seen from the tracer's pid
        namespace, not the parent's.  Otherwise, the tracer won't be able to
        correlate the fork event with later SIGTRAP signals it receives from the
        child.
        
        We still risk a race condition if a ptracer from a different pid
        namespace attaches after we compute the pid_t value.  However, sending a
        bogus fork event message in this unlikely scenario is still a vast
        improvement over the status quo where we always send bogus fork event
        messages to debuggers in a different pid namespace than the forking
        process.
        
        Signed-off-by: Matthew Dempsky <mdempsky@chromium.org>
        Acked-by: Oleg Nesterov <oleg@redhat.com>
        Cc: Kees Cook <keescook@chromium.org>
        Cc: Julien Tinnes <jln@chromium.org>
        Cc: Roland McGrath <mcgrathr@chromium.org>
        Cc: Jan Kratochvil <jan.kratochvil@redhat.com>
        Cc: <stable@vger.kernel.org>
        Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
        Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    
    diff -up ./include/linux/ptrace.h.dist ./include/linux/ptrace.h
    --- ./include/linux/ptrace.h.dist	2017-04-22 15:17:16.000000000 +0900
    +++ ./include/linux/ptrace.h	2017-06-12 09:33:22.753256752 +0900
    @@ -129,6 +129,45 @@ static inline void ptrace_event(int even
     }
     
     /**
    + * ptrace_event_pid - possibly stop for a ptrace event notification
    + * @event:	%PTRACE_EVENT_* value to report
    + * @pid:	process identifier for %PTRACE_GETEVENTMSG to return
    + *
    + * Check whether @event is enabled and, if so, report @event and @pid
    + * to the ptrace parent.  @pid is reported as the pid_t seen from the
    + * the ptrace parent's pid namespace.
    + *
    + * Called without locks.
    + */
    +static inline void ptrace_event_pid(int event, struct pid *pid)
    +{
    +	/*#include <linux/pid_namespace.h>	 * For task_active_pid_ns.  */
    +	/*
    +	 * DANGEROUS kABI USAGE WARNING:
    +	 * Including whole <linux/pid_namespace.h> breaks kABI making 
    +	 * opaque pointer visible, so revert to just declaring the 
    +	 * required function here.
    +	 */ 
    +	extern struct pid_namespace *task_active_pid_ns(struct task_struct *tsk);
    +	/*
    +	 * FIXME: There's a potential race if a ptracer in a different pid
    +	 * namespace than parent attaches between computing message below and
    +	 * when we acquire tasklist_lock in ptrace_stop().  If this happens,
    +	 * the ptracer will get a bogus pid from PTRACE_GETEVENTMSG.
    +	 */
    +	unsigned long message = 0;
    +	struct pid_namespace *ns;
    +
    +	rcu_read_lock();
    +	ns = task_active_pid_ns(rcu_dereference(current->parent));
    +	if (ns)
    +		message = pid_nr_ns(pid, ns);
    +	rcu_read_unlock();
    +
    +	ptrace_event(event, message);
    +}
    +
    +/**
      * ptrace_init_task - initialize ptrace state for a new child
      * @child:		new child task
      * @ptrace:		true if child should be ptrace'd by parent's tracer
    diff -up ./kernel/fork.c.dist ./kernel/fork.c
    --- ./kernel/fork.c.dist	2017-04-22 15:17:16.000000000 +0900
    +++ ./kernel/fork.c	2017-06-12 09:44:02.072915995 +0900
    @@ -1718,10 +1718,12 @@ long do_fork(unsigned long clone_flags,
     	 */
     	if (!IS_ERR(p)) {
     		struct completion vfork;
    +		struct pid *pid;
     
     		trace_sched_process_fork(current, p);
     
    -		nr = task_pid_vnr(p);
    +		pid = get_task_pid(p, PIDTYPE_PID);
    +		nr = pid_vnr(pid);
     
     		if (clone_flags & CLONE_PARENT_SETTID)
     			put_user(nr, parent_tidptr);
    @@ -1736,12 +1738,14 @@ long do_fork(unsigned long clone_flags,
     
     		/* forking complete and child started to run, tell ptracer */
     		if (unlikely(trace))
    -			ptrace_event(trace, nr);
    +			ptrace_event_pid(trace, pid);
     
     		if (clone_flags & CLONE_VFORK) {
     			if (!wait_for_vfork_done(p, &vfork))
    -				ptrace_event(PTRACE_EVENT_VFORK_DONE, nr);
    +				ptrace_event_pid(PTRACE_EVENT_VFORK_DONE, pid);
     		}
    +
    +		put_pid(pid);
     	} else {
     		nr = PTR_ERR(p);
     	}
    
    patch file icon patch-13265-PTRACE_GETEVENTMSG.patch (4,181 bytes) 2017-06-12 03:03 +
  • c file icon test_event_pidns.c (2,678 bytes) 2017-06-12 17:25

-Relationships
+Relationships

-Notes

~0029292

toracat (manager)

I will try applying the patch in the next plus kernel update.

~0029354

toracat (manager)

Turns out, unfortunately, applying the patch causes a kABI breakage. So we cannot fulfill the request.

~0029356

hallyn (reporter)

Sorry, can you elaborate? No function definition is changed, so I wouldn't think this would qualify as a kABI break. I'm probably overlooking something, but just want to make sure...

~0029357

toracat (manager)

I have uploaded a file that shows the actual error. Perhaps you can figure out what's happening.

~0029397

hallyn (reporter)

Hi,

Sadly I have no idea what is happening. However if I replace the patch with one which only adds the #include <linux/pid_namespace.h> to include/linux/ptrace.h, I get the exact same abi breakage.

Does that ring any bells? I can't fathom why including pid_namespace.h should change the signatures of those functions.

(If I take an ubuntu kernel and *revert* that patch, the ubuntu kabi check does not trigger.)

~0029422

kabe (reporter)

I tried to see what changes when just adding
#include <linux/pid_namespace.h>
in include/linux/ptrace.h does to kABI.

Patch scripts/Makefile.build to invoke "genksyms -d -D" :
======
diff -up ./scripts/Makefile.build.dist ./scripts/Makefile.build
--- ./scripts/Makefile.build.dist 2017-04-22 15:17:16.000000000 +0900
+++ ./scripts/Makefile.build 2017-06-09 16:13:28.625041035 +0900
@@ -211,7 +211,7 @@ $(obj)/%.i: $(src)/%.c FORCE

 cmd_gensymtypes = \
     $(CPP) -D__GENKSYMS__ $(c_flags) $< | \
- $(GENKSYMS) $(if $(1), -T $(2)) \
+ $(GENKSYMS) -d -D $(if $(1), -T $(2)) \
      $(patsubst y,-s _,$(CONFIG_HAVE_UNDERSCORE_SYMBOL_PREFIX)) \
      $(if $(KBUILD_PRESERVE),-p) \
      -r $(firstword $(wildcard $(2:.symtypes=.symref) /dev/null))
======

then the build will spit out Export definitions like

>> Defn for type0 unregister_netdevice_queue == <void unregister_netdevice_queue ( s#net_device * , s#list_head * ) >
>> Export unregister_netdevice_queue == <void unregister_netdevice_queue ( struct net_device { char name [ 16 ] ;
>> ...

Is it correct that genksyms do a CRC on this output to calculate kABI signature?

Assuming yes, for instance, unregister_netdevice_queue() definition changes like
=====
< struct pid_namespace { UNKNOWN } * ns ;
---
> struct pid_namespace { struct kref kref ;
> struct pidmap { atomic_t nr_free ;
...
> } * ns ;
=====
that is, the added #include makes opaque struct now visible.
Actual ABI didn't change, but kABI signature will be considered different.

~0029441

kabe (reporter)

I made up a preliminary patch to apply the upstream commit
without breaking the kABI:
https://bugs.centos.org/file_download.php?file_id=20761&type=bug

This patch declares "extern struct pid_namespace *task_active_pid_ns()"
outside <linux/pid_namespace.h>.
It's very unlikely that this function's API would change, but nontheless
this is a dangerous game.

@hallyn, can you provide a test program? It's not trivial for me to test it out.

~0029452

hallyn (reporter)

A testcase is attached.

~0029454

kabe (reporter)

Thanks for the test program.
I confirmed that with stock kernel-3.10.0-514.21.1, the test fails, and
kernel with patch-13265-PTRACE_GETEVENTMSG.patch
https://bugs.centos.org/file_download.php?file_id=20761&type=bug
fixes the problem:

>> # ./a.out
>> 2 9078
>> PASS - returned event included pid in correct namespace


It's very unlikely that declaration of "task_active_pid_ns()" will ever change in
RHEL 7's lifetime, so I consider the patch relatively safe to apply.
@toracat, what would you say?

In unlikely event that upstream diverged the declaration,
compile will correctly detect conflict and compile will fail early as below.
Falls on safe side.

>> from init/main.c:18:
>> include/linux/pid_namespace.h:96:30: error: conflicting types for 'task_active_pid_ns'include/linux/pid_namespace.h:96:30: error: conflicting types for 'task_active_pid_ns'

(you have to semantic-patch (spatch) if the API changes; unlikely in RHEL)
+Notes

-Issue History
Date Modified Username Field Change
2017-05-17 05:16 hallyn New Issue
2017-05-17 21:02 toracat Status new => assigned
2017-05-17 21:07 toracat Note Added: 0029292
2017-05-30 16:55 toracat Note Added: 0029354
2017-05-30 17:03 hallyn Note Added: 0029356
2017-05-30 17:56 toracat File Added: kabibreakageerror.txt
2017-05-30 17:57 toracat File Added: kabibreakageerror-2.txt
2017-05-30 17:57 toracat Note Added: 0029357
2017-05-30 18:00 toracat File Deleted: kabibreakageerror-2.txt
2017-06-04 18:36 hallyn Note Added: 0029397
2017-06-09 08:10 kabe Note Added: 0029422
2017-06-12 03:03 kabe File Added: patch-13265-PTRACE_GETEVENTMSG.patch
2017-06-12 03:10 kabe Note Added: 0029441
2017-06-12 17:25 hallyn File Added: test_event_pidns.c
2017-06-12 17:25 hallyn Note Added: 0029452
2017-06-13 04:47 kabe Note Added: 0029454
+Issue History