View Issue Details

IDProjectCategoryView StatusLast Update
0010414CentOS-7kernelpublic2018-01-25 05:41
Reporterlikan 
PrioritynormalSeverityminorReproducibilityalways
Status resolvedResolutionfixed 
Product Version7.2.1511 
Target VersionFixed in Version 
Summary0010414: Can't cleanly unmount a slave --bind mount
DescriptionIf a mount A with submounts is binded to another place B, then set B as slave, then even after unmounting B and all submounts, rmdir B will fail.

Steps To Reproduce[hidden]$ mkdir -p A B/a C/b
[hidden]$ sudo mount --bind A B/a
[hidden]$ sudo mount --rbind B C/b
[hidden]$ sudo mount --make-rslave C/b
[hidden]$ sudo umount C/b/a
[hidden]$ sudo umount C/b
[hidden]$ rmdir C/b
rmdir: failed to remove ‘C/b’: Device or resource busy
Additional InformationThe same steps work on Ubuntu 14.04 and Fedora 23.

Asked on stackexchange but didn't get useful response: http://unix.stackexchange.com/questions/264931/centos-7-unable-to-cleanly-umount-a-bind-mount-using-rbind-and-rslave
Tagsbind-chroot, kernel
abrt_hash
URL

Activities

kabe

kabe

2017-02-28 04:21

reporter   ~0028704

CentOS 7.3.1611 kernel-3.10.0-514 still has this problem.

the LongTerm kernel in ELrepo
http://elrepo.org/linux/kernel/el7/x86_64/RPMS/kernel-lt-4.4.52-1.el7.elrepo.x86_64.rpm
seems to have this fixed.
I wasn't able to find a relevant commit. (need bisect?)

(I couldn't try MainLine kernel-ml, since it fills console with
hv_utils: Using TimeSync version 4.0
)
kabe

kabe

2017-03-03 01:00

reporter   ~0028743

I've narrowed down the problematic kernel versions to:

kernel-3.17.8 does have this problem.
kernel-3.18.0 has this fixed.

Since there's non-trivial differences in fs/namespace.c in
these two, and still more in RHEL 3.10.0-514 versions,
making a patch isn't trivial.

Realistic workround now is to use ELrepo kernels.
toracat

toracat

2017-03-03 16:14

manager   ~0028757

@kabe

Thanks for your investigation. That's a lot of work. As you pointed out, creating a patch does not seem trivial, and the only way to get this fixed is to have our upstream (Red Hat) do it.

Could you or the OP report this bug at http://bugzilla.redhat.com ?
kabe

kabe

2017-03-06 01:07

reporter   ~0028764

I've opened the RHEL bugzilla at
https://bugzilla.redhat.com/show_bug.cgi?id=1429270 (RHBZ#1429270)
toracat

toracat

2017-03-06 01:14

manager   ~0028765

Thanks, @kabe.

Because the upstream BZ is private, please keep us posted with any progress made there.
kabe

kabe

2017-03-09 04:54

reporter  

0001-get-rid-of-propagate_umount-mistakenly-treating-slav.patch (1,960 bytes)
From ba4666388f1a69487e0a64e3626564d37feef2c0 Mon Sep 17 00:00:00 2001
From: "T.kabe" <kabe@>
Date: Fri, 3 Mar 2017 14:29:26 +0900
Subject: [PATCH 1/4] get rid of propagate_umount() mistakenly treating slaves as busy.

[upstream commit 88b368f27a094277143d8ecd5a056116f6a41520]
Author: Al Viro <viro@zeniv.linux.org.uk>
Date:   Mon Aug 18 15:09:26 2014 -0400

    get rid of propagate_umount() mistakenly treating slaves as busy.

    The check in __propagate_umount() ("has somebody explicitly mounted
    something on that slave?") is done *before* taking the already doomed
    victims out of the child lists.

    Cc: stable@vger.kernel.org
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 fs/namespace.c | 4 +++-
 fs/pnode.c     | 4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/namespace.c b/fs/namespace.c
index 9440b6a..0053bfc 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -1195,6 +1195,9 @@ void umount_tree(struct mount *mnt, int propagate)
 	for (p = mnt; p; p = next_mnt(p, mnt))
 		list_move(&p->mnt_hash, &tmp_list);
 
+	list_for_each_entry(p, &tmp_list, mnt_hash)
+		list_del_init(&p->mnt_child);
+
 	if (propagate)
 		propagate_umount(&tmp_list);
 
@@ -1203,7 +1206,6 @@ void umount_tree(struct mount *mnt, int propagate)
 		list_del_init(&p->mnt_list);
 		__touch_mnt_namespace(p->mnt_ns);
 		p->mnt_ns = NULL;
-		list_del_init(&p->mnt_child);
 		if (mnt_has_parent(p)) {
 			p->mnt_parent->mnt_ghosts++;
 			put_mountpoint(p->mnt_mp);
diff --git a/fs/pnode.c b/fs/pnode.c
index 54bd043..983cf93 100644
--- a/fs/pnode.c
+++ b/fs/pnode.c
@@ -390,8 +390,10 @@ static void __propagate_umount(struct mount *mnt)
 		 * umount the child only if the child has no
 		 * other children
 		 */
-		if (child && list_empty(&child->mnt_mounts))
+		if (child && list_empty(&child->mnt_mounts)) {
+			list_del_init(&child->mnt_child);
 			list_move_tail(&child->mnt_hash, &mnt->mnt_hash);
+		}
 	}
 }
 
-- 
1.8.3.1

kabe

kabe

2017-03-09 04:55

reporter  

0002-vfs-Keep-a-list-of-mounts-on-a-mount-point.patch (3,171 bytes)
From 9cc2c0f41c403a05989bc3ca2af650247d452f65 Mon Sep 17 00:00:00 2001
From: "T.kabe" <kabe@>
Date: Mon, 6 Mar 2017 15:18:55 +0900
Subject: [PATCH 2/4] vfs: Keep a list of mounts on a mount point

[upstream commit 0a5eb7c8189922e86a840972cd0b57e41de6f031]
Author: Eric W. Biederman <ebiederman@twitter.com>
Date:   Sun Sep 22 19:37:01 2013 -0700

    vfs: Keep a list of mounts on a mount point

    To spot any possible problems call BUG if a mountpoint
    is put when it's list of mounts is not empty.

    AV: use hlist instead of list_head

    Reviewed-by: Miklos Szeredi <miklos@szeredi.hu>
    Signed-off-by: Eric W. Biederman <ebiederman@twitter.com>
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 fs/mount.h     | 2 ++
 fs/namespace.c | 6 ++++++
 2 files changed, 8 insertions(+)

diff --git a/fs/mount.h b/fs/mount.h
index b870800..92fecbe 100644
--- a/fs/mount.h
+++ b/fs/mount.h
@@ -21,6 +21,7 @@ struct mnt_pcp {
 struct mountpoint {
 	struct list_head m_hash;
 	struct dentry *m_dentry;
+	struct hlist_head m_list;
 	int m_count;
 };
 
@@ -47,6 +48,7 @@ struct mount {
 	struct mount *mnt_master;	/* slave is on master->mnt_slave_list */
 	struct mnt_namespace *mnt_ns;	/* containing namespace */
 	struct mountpoint *mnt_mp;	/* where is it mounted */
+	struct hlist_node mnt_mp_list;	/* list mounts with the same mountpoint */
 #ifdef CONFIG_FSNOTIFY
 	struct hlist_head mnt_fsnotify_marks;
 	__u32 mnt_fsnotify_mask;
diff --git a/fs/namespace.c b/fs/namespace.c
index 0053bfc..c238481 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -197,6 +197,7 @@ static struct mount *alloc_vfsmnt(const char *name)
 		INIT_LIST_HEAD(&mnt->mnt_share);
 		INIT_LIST_HEAD(&mnt->mnt_slave_list);
 		INIT_LIST_HEAD(&mnt->mnt_slave);
+		INIT_HLIST_NODE(&mnt->mnt_mp_list);
 #ifdef CONFIG_FSNOTIFY
 		INIT_HLIST_HEAD(&mnt->mnt_fsnotify_marks);
 #endif
@@ -636,6 +637,7 @@ static struct mountpoint *new_mountpoint(struct dentry *dentry)
 	mp->m_dentry = dentry;
 	mp->m_count = 1;
 	list_add(&mp->m_hash, chain);
+	INIT_HLIST_HEAD(&mp->m_list);
 	return mp;
 }
 
@@ -643,6 +645,7 @@ static void put_mountpoint(struct mountpoint *mp)
 {
 	if (!--mp->m_count) {
 		struct dentry *dentry = mp->m_dentry;
+		BUG_ON(!hlist_empty(&mp->m_list));
 		spin_lock(&dentry->d_lock);
 		dentry->d_flags &= ~DCACHE_MOUNTED;
 		spin_unlock(&dentry->d_lock);
@@ -689,6 +692,7 @@ static void detach_mnt(struct mount *mnt, struct path *old_path)
 	mnt->mnt_mountpoint = mnt->mnt.mnt_root;
 	list_del_init(&mnt->mnt_child);
 	list_del_init(&mnt->mnt_hash);
+	hlist_del_init(&mnt->mnt_mp_list);
 	put_mountpoint(mnt->mnt_mp);
 	mnt->mnt_mp = NULL;
 }
@@ -705,6 +709,7 @@ void mnt_set_mountpoint(struct mount *mnt,
 	child_mnt->mnt_mountpoint = dget(mp->m_dentry);
 	child_mnt->mnt_parent = mnt;
 	child_mnt->mnt_mp = mp;
+	hlist_add_head(&child_mnt->mnt_mp_list, &mp->m_list);
 }
 
 /*
@@ -1207,6 +1212,7 @@ void umount_tree(struct mount *mnt, int propagate)
 		__touch_mnt_namespace(p->mnt_ns);
 		p->mnt_ns = NULL;
 		if (mnt_has_parent(p)) {
+			hlist_del_init(&p->mnt_mp_list);
 			p->mnt_parent->mnt_ghosts++;
 			put_mountpoint(p->mnt_mp);
 			p->mnt_mp = NULL;
-- 
1.8.3.1

kabe

kabe

2017-03-09 05:12

reporter  

0003-vfs-Add-a-function-to-lazily-unmount-all-mounts-from.patch (3,813 bytes)
From 0c0bf47e3e77ebff691abe1a5e133e43515e5205 Mon Sep 17 00:00:00 2001
From: "T.kabe" <kabe@>
Date: Mon, 6 Mar 2017 15:21:16 +0900
Subject: [PATCH 3/4] vfs: Add a function to lazily unmount all mounts from any dentry.

[upstream commit 80b5dce8c59b0de1ed6e403b8298e02dcb4db64b]
Author: Eric W. Biederman <ebiederman@twitter.com>
Date:   Thu Oct 3 01:31:18 2013 -0700

    vfs: Add a function to lazily unmount all mounts from any dentry.

    The new function detach_mounts comes in two pieces.  The first piece
    is a static inline test of d_mounpoint that returns immediately
    without taking any locks if d_mounpoint is not set.  In the common
    case when mountpoints are absent this allows the vfs to continue
    running with it's same cacheline foot print.

    The second piece of detach_mounts __detach_mounts actually does the
    work and it assumes that a mountpoint is present so it is slow and
    takes namespace_sem for write, and then locks the mount hash (aka
    mount_lock) after a struct mountpoint has been found.

    With those two locks held each entry on the list of mounts on a
    mountpoint is selected and lazily unmounted until all of the mount
    have been lazily unmounted.

    v7: Wrote a proper change description and removed the changelog
        documenting deleted wrong turns.

    Signed-off-by: Eric W. Biederman <ebiederman@twitter.com>
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 fs/mount.h     |  9 +++++++++
 fs/namespace.c | 48 ++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 57 insertions(+)

diff --git a/fs/mount.h b/fs/mount.h
index 92fecbe..9959119 100644
--- a/fs/mount.h
+++ b/fs/mount.h
@@ -80,6 +80,15 @@ static inline int is_mounted(struct vfsmount *mnt)
 
 extern struct mount *__lookup_mnt(struct vfsmount *, struct dentry *, int);
 
+extern void __detach_mounts(struct dentry *dentry);
+
+static inline void detach_mounts(struct dentry *dentry)
+{
+	if (!d_mountpoint(dentry))
+		return;
+	__detach_mounts(dentry);
+}
+
 static inline void get_mnt_ns(struct mnt_namespace *ns)
 {
 	atomic_inc(&ns->count);
diff --git a/fs/namespace.c b/fs/namespace.c
index c238481..e48fed3 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -608,6 +608,23 @@ struct vfsmount *lookup_mnt(struct path *path)
 	}
 }
 
+static struct mountpoint *lookup_mountpoint(struct dentry *dentry)
+{
+	struct list_head *chain = mountpoint_hashtable + hash(NULL, dentry);
+	struct mountpoint *mp;
+
+	list_for_each_entry(mp, chain, m_hash) {
+		if (mp->m_dentry == dentry) {
+			/* might be worth a WARN_ON() */
+			if (d_unlinked(dentry))
+				return ERR_PTR(-ENOENT);
+			mp->m_count++;
+			return mp;
+		}
+	}
+	return NULL;
+}
+
 static struct mountpoint *new_mountpoint(struct dentry *dentry)
 {
 	struct list_head *chain = mountpoint_hashtable + hash(NULL, dentry);
@@ -1312,6 +1329,37 @@ static int do_umount(struct mount *mnt, int flags)
 	return retval;
 }
 
+/*
+ * __detach_mounts - lazily unmount all mounts on the specified dentry
+ *
+ * During unlink, rmdir, and d_drop it is possible to loose the path
+ * to an existing mountpoint, and wind up leaking the mount.
+ * detach_mounts allows lazily unmounting those mounts instead of
+ * leaking them.
+ *
+ * The caller may hold dentry->d_inode->i_mutex.
+ */
+void __detach_mounts(struct dentry *dentry)
+{
+	struct mountpoint *mp;
+	struct mount *mnt;
+
+	namespace_lock();
+	mp = lookup_mountpoint(dentry);
+	if (!mp)
+		goto out_unlock;
+
+	br_write_lock(&vfsmount_lock);
+	while (!hlist_empty(&mp->m_list)) {
+		mnt = hlist_entry(mp->m_list.first, struct mount, mnt_mp_list);
+		umount_tree(mnt, 2);
+	}
+	br_write_unlock(&vfsmount_lock);
+	put_mountpoint(mp);
+out_unlock:
+	namespace_unlock();
+}
+
 /* 
  * Is the caller allowed to modify his namespace?
  */
-- 
1.8.3.1

kabe

kabe

2017-03-09 05:13

reporter  

0004-vfs-Lazily-remove-mounts-on-unlinked-files-and-direc.patch (12,755 bytes)
From 743848de574b660972f457c28c02cbb19c8aa439 Mon Sep 17 00:00:00 2001
From: "T.kabe" <kabe@>
Date: Fri, 3 Mar 2017 17:06:44 +0900
Subject: [PATCH 4/4] vfs: Lazily remove mounts on unlinked files and directories.

[upstream commit 8ed936b5671bfb33d89bc60bdcc7cf0470ba52fe]
[upstream commit 7af1364ffa64db61e386628594836e13d2ef04b5]

commit 8ed936b5671bfb33d89bc60bdcc7cf0470ba52fe
Author: Eric W. Biederman <ebiederman@twitter.com>
Date:   Tue Oct 1 18:33:48 2013 -0700

    vfs: Lazily remove mounts on unlinked files and directories.

    With the introduction of mount namespaces and bind mounts it became
    possible to access files and directories that on some paths are mount
    points but are not mount points on other paths.  It is very confusing
    when rm -rf somedir returns -EBUSY simply because somedir is mounted
    somewhere else.  With the addition of user namespaces allowing
    unprivileged mounts this condition has gone from annoying to allowing
    a DOS attack on other users in the system.

    The possibility for mischief is removed by updating the vfs to support
    rename, unlink and rmdir on a dentry that is a mountpoint and by
    lazily unmounting mountpoints on deleted dentries.

    In particular this change allows rename, unlink and rmdir system calls
    on a dentry without a mountpoint in the current mount namespace to
    succeed, and it allows rename, unlink, and rmdir performed on a
    distributed filesystem to update the vfs cache even if when there is a
    mount in some namespace on the original dentry.

    There are two common patterns of maintaining mounts: Mounts on trusted
    paths with the parent directory of the mount point and all ancestory
    directories up to / owned by root and modifiable only by root
    (i.e. /media/xxx, /dev, /dev/pts, /proc, /sys, /sys/fs/cgroup/{cpu,
    cpuacct, ...}, /usr, /usr/local).  Mounts on unprivileged directories
    maintained by fusermount.

    In the case of mounts in trusted directories owned by root and
    modifiable only by root the current parent directory permissions are
    sufficient to ensure a mount point on a trusted path is not removed
    or renamed by anyone other than root, even if there is a context
    where the there are no mount points to prevent this.

    In the case of mounts in directories owned by less privileged users
    races with users modifying the path of a mount point are already a
    danger.  fusermount already uses a combination of chdir,
    /proc/<pid>/fd/NNN, and UMOUNT_NOFOLLOW to prevent these races.  The
    removable of global rename, unlink, and rmdir protection really adds
    nothing new to consider only a widening of the attack window, and
    fusermount is already safe against unprivileged users modifying the
    directory simultaneously.

    In principle for perfect userspace programs returning -EBUSY for
    unlink, rmdir, and rename of dentires that have mounts in the local
    namespace is actually unnecessary.  Unfortunately not all userspace
    programs are perfect so retaining -EBUSY for unlink, rmdir and rename
    of dentries that have mounts in the current mount namespace plays an
    important role of maintaining consistency with historical behavior and
    making imperfect userspace applications hard to exploit.

    v2: Remove spurious old_dentry.
    v3: Optimized shrink_submounts_and_drop
        Removed unsued afs label
    v4: Simplified the changes to check_submounts_and_drop
        Do not rename check_submounts_and_drop shrink_submounts_and_drop
        Document what why we need atomicity in check_submounts_and_drop
        Rely on the parent inode mutex to make d_revalidate and d_invalidate
        an atomic unit.
    v5: Refcount the mountpoint to detach in case of simultaneous
        renames.

    Reviewed-by: Miklos Szeredi <miklos@szeredi.hu>
    Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

commit 7af1364ffa64db61e386628594836e13d2ef04b5
Author: Eric W. Biederman <ebiederm@xmission.com>
Date:   Fri Oct 4 19:15:13 2013 -0700

    vfs: Don't allow overwriting mounts in the current mount namespace

    In preparation for allowing mountpoints to be renamed and unlinked
    in remote filesystems and in other mount namespaces test if on a dentry
    there is a mount in the local mount namespace before allowing it to
    be renamed or unlinked.

    The primary motivation here are old versions of fusermount unmount
    which is not safe if the a path can be renamed or unlinked while it is
    verifying the mount is safe to unmount.  More recent versions are simpler
    and safer by simply using UMOUNT_NOFOLLOW when unmounting a mount
    in a directory owned by an arbitrary user.

    Miklos Szeredi <miklos@szeredi.hu> reports this is approach is good
    enough to remove concerns about new kernels mixed with old versions
    of fusermount.

    A secondary motivation for restrictions here is that it removing empty
    directories that have non-empty mount points on them appears to
    violate the rule that rmdir can not remove empty directories.  As
    Linus Torvalds pointed out this is useful for programs (like git) that
    test if a directory is empty with rmdir.

    Therefore this patch arranges to enforce the existing mount point
    semantics for local mount namespace.

    v2: Rewrote the test to be a drop in replacement for d_mountpoint
    v3: Use bool instead of int as the return type of is_local_mountpoint

    Reviewed-by: Miklos Szeredi <miklos@szeredi.hu>
    Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
 fs/dcache.c    | 69 +++++++++++++++++++++++++++++++---------------------------
 fs/mount.h     |  9 ++++++++
 fs/namei.c     | 16 +++++++++-----
 fs/namespace.c | 35 +++++++++++++++++++++++++++++
 4 files changed, 91 insertions(+), 38 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index 5dabe0e..a3e9e7a 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -1285,36 +1285,38 @@ void shrink_dcache_parent(struct dentry *parent)
 }
 EXPORT_SYMBOL(shrink_dcache_parent);
 
-static enum d_walk_ret check_and_collect(void *_data, struct dentry *dentry)
+struct detach_data {
+	struct select_data select;
+	struct dentry *mountpoint;
+};
+static enum d_walk_ret detach_and_collect(void *_data, struct dentry *dentry)
 {
-	struct select_data *data = _data;
-
-	if (d_mountpoint(dentry)) {
-		data->found = -EBUSY;
-		return D_WALK_QUIT;
-	}
+	struct detach_data *data = _data;
 
-	return select_collect(_data, dentry);
-}
+ 	if (d_mountpoint(dentry)) {
+		__dget_dlock(dentry);
+		data->mountpoint = dentry;
+ 		return D_WALK_QUIT;
+ 	}
+	return select_collect(&data->select, dentry);
+ }
 
 static void check_and_drop(void *_data)
 {
-	struct select_data *data = _data;
+	struct detach_data *data = _data;
 
-	if (d_mountpoint(data->start))
-		data->found = -EBUSY;
-	if (!data->found)
-		__d_drop(data->start);
+	if (!data->mountpoint && !data->select.found)
+		__d_drop(data->select.start);
 }
 
 /**
- * check_submounts_and_drop - prune dcache, check for submounts and drop
+ * check_submounts_and_drop - detach submounts, prune dcache, and drop
  *
- * All done as a single atomic operation relative to has_unlinked_ancestor().
- * Returns 0 if successfully unhashed @parent.  If there were submounts then
- * return -EBUSY.
+ * The final d_drop is done as an atomic operation relative to
+ * rename_lock ensuring there are no races with d_set_mounted.  This
+ * ensures there are no unhashed dentries on the path to a mountpoint.
  *
- * @dentry: dentry to prune and drop
+ * @dentry: dentry to detach, prune and drop
  */
 int check_submounts_and_drop(struct dentry *dentry)
 {
@@ -1327,19 +1329,24 @@ int check_submounts_and_drop(struct dentry *dentry)
 	}
 
 	for (;;) {
-		struct select_data data;
+		struct detach_data data;
 
-		INIT_LIST_HEAD(&data.dispose);
-		data.start = dentry;
-		data.found = 0;
+		data.mountpoint = NULL;
+		INIT_LIST_HEAD(&data.select.dispose);
+		data.select.start = dentry;
+		data.select.found = 0;
 
-		d_walk(dentry, &data, check_and_collect, check_and_drop);
-		ret = data.found;
+		d_walk(dentry, &data, detach_and_collect, check_and_drop);
 
-		if (!list_empty(&data.dispose))
-			shrink_dentry_list(&data.dispose);
+		if (data.select.found)
+			shrink_dentry_list(&data.select.dispose);
 
-		if (ret <= 0)
+		if (data.mountpoint) {
+			detach_mounts(data.mountpoint);
+			dput(data.mountpoint);
+		}
+
+		if (!data.mountpoint && !data.select.found)
 			break;
 
 		cond_resched();
@@ -2554,10 +2561,8 @@ static struct dentry *__d_unalias(struct inode *inode,
 		goto out_err;
 	m2 = &alias->d_parent->d_inode->i_mutex;
 out_unalias:
-	if (likely(!d_mountpoint(alias))) {
-		__d_move(alias, dentry, false);
-		ret = alias;
-	}
+	__d_move(alias, dentry, false);
+	ret = alias;
 out_err:
 	spin_unlock(&inode->i_lock);
 	if (m2)
diff --git a/fs/mount.h b/fs/mount.h
index 9959119..a373c86 100644
--- a/fs/mount.h
+++ b/fs/mount.h
@@ -107,3 +107,12 @@ struct proc_mounts {
 #define proc_mounts(p) (container_of((p), struct proc_mounts, m))
 
 extern const struct seq_operations mounts_op;
+
+extern bool __is_local_mountpoint(struct dentry *dentry);
+static inline bool is_local_mountpoint(struct dentry *dentry)
+{
+	if (!d_mountpoint(dentry))
+		return false;
+
+	return __is_local_mountpoint(dentry);
+}
diff --git a/fs/namei.c b/fs/namei.c
index 872e5e5..ef70aa8 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3691,8 +3691,8 @@ int vfs_rmdir(struct inode *dir, struct dentry *dentry)
 	mutex_lock(&dentry->d_inode->i_mutex);
 
 	error = -EBUSY;
-	if (d_mountpoint(dentry))
-		goto out;
+ 	if (is_local_mountpoint(dentry))
+ 		goto out;
 
 	error = security_inode_rmdir(dir, dentry);
 	if (error)
@@ -3705,6 +3705,7 @@ int vfs_rmdir(struct inode *dir, struct dentry *dentry)
 
 	dentry->d_inode->i_flags |= S_DEAD;
 	dont_mount(dentry);
+	detach_mounts(dentry);
 
 out:
 	mutex_unlock(&dentry->d_inode->i_mutex);
@@ -3806,7 +3807,7 @@ int vfs_unlink(struct inode *dir, struct dentry *dentry, struct inode **delegate
 		return -EPERM;
 
 	mutex_lock(&target->i_mutex);
-	if (d_mountpoint(dentry))
+	if (is_local_mountpoint(dentry))
 		error = -EBUSY;
 	else {
 		error = security_inode_unlink(dir, dentry);
@@ -3815,8 +3816,10 @@ int vfs_unlink(struct inode *dir, struct dentry *dentry, struct inode **delegate
 			if (error)
 				goto out;
 			error = dir->i_op->unlink(dir, dentry);
-			if (!error)
+			if (!error) {
 				dont_mount(dentry);
+				detach_mounts(dentry);
+			}
 		}
 	}
 out:
@@ -4254,8 +4257,8 @@ int vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 		mutex_lock(&target->i_mutex);
 
 	error = -EBUSY;
-	if (d_mountpoint(old_dentry) || d_mountpoint(new_dentry))
-		goto out;
+ 	if (is_local_mountpoint(old_dentry) || is_local_mountpoint(new_dentry))
+ 		goto out;
 
 	if (max_links && new_dir != old_dir) {
 		error = -EMLINK;
@@ -4292,6 +4295,7 @@ int vfs_rename(struct inode *old_dir, struct dentry *old_dentry,
 		if (is_dir)
 			target->i_flags |= S_DEAD;
 		dont_mount(new_dentry);
+		detach_mounts(new_dentry);
 	}
 	if (!(old_dir->i_sb->s_type->fs_flags & FS_RENAME_DOES_D_MOVE)) {
 		if (!(flags & RENAME_EXCHANGE))
diff --git a/fs/namespace.c b/fs/namespace.c
index e48fed3..d633562 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -625,6 +625,41 @@ static struct mountpoint *lookup_mountpoint(struct dentry *dentry)
 	return NULL;
 }
 
+/*
+ * __is_local_mountpoint - Test to see if dentry is a mountpoint in the
+ *                         current mount namespace.
+ *
+ * The common case is dentries are not mountpoints at all and that
+ * test is handled inline.  For the slow case when we are actually
+ * dealing with a mountpoint of some kind, walk through all of the
+ * mounts in the current mount namespace and test to see if the dentry
+ * is a mountpoint.
+ *
+ * The mount_hashtable is not usable in the context because we
+ * need to identify all mounts that may be in the current mount
+ * namespace not just a mount that happens to have some specified
+ * parent mount.
+ */
+bool __is_local_mountpoint(struct dentry *dentry)
+{
+	struct mnt_namespace *ns = current->nsproxy->mnt_ns;
+	struct mount *mnt;
+	bool is_covered = false;
+
+	if (!d_mountpoint(dentry))
+		goto out;
+
+	down_read(&namespace_sem);
+	list_for_each_entry(mnt, &ns->list, mnt_list) {
+		is_covered = (mnt->mnt_mountpoint == dentry);
+		if (is_covered)
+			break;
+	}
+	up_read(&namespace_sem);
+out:
+	return is_covered;
+}
+
 static struct mountpoint *new_mountpoint(struct dentry *dentry)
 {
 	struct list_head *chain = mountpoint_hashtable + hash(NULL, dentry);
-- 
1.8.3.1

kabe

kabe

2017-03-09 05:22

reporter   ~0028800

Red Hat bugzilla is still silent; I'm waiting for their action.

Meanwhile I came up with a patches that fixes this issue
loosely resembling upstream commits:
https://bugs.centos.org/file_download.php?file_id=19195&type=bug
https://bugs.centos.org/file_download.php?file_id=19196&type=bug
https://bugs.centos.org/file_download.php?file_id=19197&type=bug
https://bugs.centos.org/file_download.php?file_id=19198&type=bug

(darn, Mantis shows uploaded patches as full plaintext in the browser)

I don't want this used without any reviews or test.
Anyone out there brave enough to try these patches?
toracat

toracat

2017-03-09 12:54

manager   ~0028804

@kabe,

Thanks again for the hard work. I will try getting your patches into the plus kernel and make it available for testing.
toracat

toracat

2017-03-09 14:40

manager   ~0028806

The test kernel-plus set is here:

https://people.centos.org/toracat/kernel/7/plus/bug10414/
kabe

kabe

2017-09-30 02:04

reporter   ~0030262

RHEL Bugzilla says this is fixed in RHEL 7.4, and closed as duplicate.

> --- Comment #8 from Miklos Szeredi <mszeredi@redhat.com> ---
> This has been fixed in rhel-7.4.
>
> Needs "echo 1 > /proc/sys/fs/may_detach_mounts" to enable the new behavior of
> rmdir.

Note: I haven't tested whether this was really fixed in 7.4.
toracat

toracat

2018-01-25 05:41

manager   ~0031033

Closing as resolved. If the problem persists, feel free to reopen.

Issue History

Date Modified Username Field Change
2016-02-22 04:25 likan New Issue
2016-02-22 04:26 likan Tag Attached: bind-chroot
2016-02-22 04:26 likan Tag Attached: kernel
2017-02-28 04:21 kabe Note Added: 0028704
2017-03-03 01:00 kabe Note Added: 0028743
2017-03-03 16:14 toracat Status new => confirmed
2017-03-03 16:14 toracat Note Added: 0028757
2017-03-06 01:07 kabe Note Added: 0028764
2017-03-06 01:14 toracat Status confirmed => assigned
2017-03-06 01:14 toracat Note Added: 0028765
2017-03-09 04:54 kabe File Added: 0001-get-rid-of-propagate_umount-mistakenly-treating-slav.patch
2017-03-09 04:55 kabe File Added: 0002-vfs-Keep-a-list-of-mounts-on-a-mount-point.patch
2017-03-09 05:12 kabe File Added: 0003-vfs-Add-a-function-to-lazily-unmount-all-mounts-from.patch
2017-03-09 05:13 kabe File Added: 0004-vfs-Lazily-remove-mounts-on-unlinked-files-and-direc.patch
2017-03-09 05:22 kabe Note Added: 0028800
2017-03-09 12:54 toracat Note Added: 0028804
2017-03-09 14:40 toracat Note Added: 0028806
2017-09-30 02:04 kabe Note Added: 0030262
2018-01-25 05:41 toracat Status assigned => resolved
2018-01-25 05:41 toracat Resolution open => fixed
2018-01-25 05:41 toracat Note Added: 0031033