View Issue Details
ID | Project | Category | View Status | Date Submitted | Last Update |
---|---|---|---|---|---|
0002448 | CentOS-5 | kernel | public | 2007-11-17 15:01 | 2007-12-02 19:24 |
Reporter | arrfab | ||||
Priority | normal | Severity | major | Reproducibility | always |
Status | resolved | Resolution | fixed | ||
Product Version | 5.1 | ||||
Target Version | | Fixed in Version | 5.1 | ||
Summary | 0002448: autofs failing on first access to a nfs server | ||||
Description | When yum points to a central NFS server that holds the updates repository and is automounted through autofs, yum fails. 'File doesn't exist' is the answer. It seems kernel related, and I found a bug upstream: https://bugzilla.redhat.com/show_bug.cgi?id=377661 | ||||
Tags | No tags attached. | ||||
The -56.el5 kernel from http://people.redhat.com/jlayton/ fixed the problem. Tested on both i686 and x86_64. Akemi |
|
2007-11-20 10:01
|
linux-2.6-autofs4-fix-race-between-mount-and-expire-2.patch (5,171 bytes)
From: Ian Kent <ikent@redhat.com>
Subject: Re: [RHEL 5.1 PATCH 1/2] autofs4 - patch correction - fix race between mount and expire
Date: Tue, 02 Oct 2007 01:52:58 +0800
Bugzilla: 354621
Message-Id: <1191261178.23256.9.camel@raven.themaw.net>
Changelog: [autofs4] fix race between mount and expire

On Wed, 2007-04-18 at 16:04 +0800, Ian Kent wrote:
> Hi all,
>
> Investigation of bug 174821 led to the discovery of a race between
> mount and expire. This issue is also present in RHEL5 so I created the
> tracking bug 236875 as a clone of 174821. The attached patch is included
> in the 2.6.21 rc series at present.
>
> Explanation.
>
> What happens is that during an expire the situation can arise that a
> directory is removed and another lookup is done before the expire issues
> a completion status to the kernel module. In this case, since the
> lookup gets a new dentry, it doesn't know that there is an expire in
> progress and when it posts its mount request, matches the existing
> expire request and waits for its completion. ENOENT is then returned to
> user space from lookup (as the dentry passed in remains negative)
> without having performed the mount request.
>
> The solution is to keep track of dentrys in this unhashed state and
> reuse them, if possible, in order to preserve the flags.

During the QA of the above bug, 236875, a couple of problems were uncovered. Somehow, while posted to the bug, two patches didn't make it into the kernel. A failure due to this was discovered during scheduled regression testing of autofs and I've since verified these missing patches resolve the problem.

This patch is the first of the two patches. Quoting from the bug:

Due to a problem uncovered during QA of this patch for a RHEL-4 Z-Stream update I've had to revisit this issue. There are a couple of patches now that depend on this patch and there is a risk of some confusion regarding the various patches.

To try and avoid this we should be able to use the same patches everywhere so we need to sync the source of the various kernels with upstream. This patch wasn't needed for this originally but is now needed by the fix for the problem identified above during QA and for other bugs that depend on these patches (for example see bug #253231).

Ian
---
--- linux-2.6.18.noarch/fs/autofs4/root.c.lookup-check-unhashed	2007-08-22 18:37:11.000000000 +0800
+++ linux-2.6.18.noarch/fs/autofs4/root.c	2007-08-22 18:42:40.000000000 +0800
@@ -655,14 +655,29 @@ static struct dentry *autofs4_lookup(str
 
 	/*
 	 * If this dentry is unhashed, then we shouldn't honour this
-	 * lookup even if the dentry is positive. Returning ENOENT here
-	 * doesn't do the right thing for all system calls, but it should
-	 * be OK for the operations we permit from an autofs.
+	 * lookup. Returning ENOENT here doesn't do the right thing
+	 * for all system calls, but it should be OK for the operations
+	 * we permit from an autofs.
 	 */
 	if (dentry->d_inode && d_unhashed(dentry)) {
+		/*
+		 * A user space application can (and has done in the past)
+		 * remove and re-create this directory during the callback.
+		 * This can leave us with an unhashed dentry, but a
+		 * successful mount! So we need to perform another
+		 * cached lookup in case the dentry now exists.
+		 */
+		struct dentry *parent = dentry->d_parent;
+		struct dentry *new = d_lookup(parent, &dentry->d_name);
+		if (new != NULL)
+			dentry = new;
+		else
+			dentry = ERR_PTR(-ENOENT);
+
 		if (unhashed)
 			dput(unhashed);
-		return ERR_PTR(-ENOENT);
+
+		return dentry;
 	}
 
 	if (unhashed)

This patch is the second of the two patches. Quoting from the bug:

This patch fixes a fail reported during QA testing for a Z-Stream release for RHEL 4. It is in fact a hunk from another autofs4 patch that resolves a deadlock during directory creation under load (see bug #253231 for info).

The deadlock patch delays hashing of dentrys at directory creation until the actual create operation and so dentrys remain unhashed for a relatively long time so the code in this patch was needed there. With the expire/mount race fix here, dentrys are unhashed for a relatively brief time so the code in this patch was not identified as needed during development. However, if there are many processes concurrently accessing directories it's possible there will be two or more waiters in the queue. Only one of the waiters will have the dentry required to complete the lookup and the others need to perform a d_lookup to get the correct dentry. This patch allows these processes to perform the needed d_lookup.

Ian
---
--- linux-2.6.18.noarch/fs/autofs4/root.c.lookup-expire-race-fix-4	2007-08-27 19:29:13.000000000 +0800
+++ linux-2.6.18.noarch/fs/autofs4/root.c	2007-08-27 19:31:13.000000000 +0800
@@ -659,7 +659,7 @@ static struct dentry *autofs4_lookup(str
 	 * for all system calls, but it should be OK for the operations
 	 * we permit from an autofs.
 	 */
-	if (dentry->d_inode && d_unhashed(dentry)) {
+	if (!oz_mode && d_unhashed(dentry)) {
 		/*
 		 * A user space application can (and has done in the past)
 		 * remove and re-create this directory during the callback.
|
The -56.el5 kernel has a problem with nfs (see the bugzilla in the original report) and Jeff Layton is currently working on it. However, his revised version will *not* include the autofs patch (as per his e-mail). According to this bugzilla upstream: https://bugzilla.redhat.com/show_bug.cgi?id=371341 the autofs problem is a known issue and will be fixed in 5.1.x. A patch is provided in that BZ and is now attached here. I confirm that the patch solves the problem. Akemi |
|
The patch was tested on both i686 and x86_64. autofs worked as expected, with no NFS crash. Akemi |
|
Another workaround is to set DEFAULT_BROWSE_MODE="yes" in /etc/sysconfig/autofs. This only works if your auto.home map explicitly lists every entry, i.e., it does NOT use wildcards like * server:/export/home/& (taken from the bugzilla referred to in 6344; confirmed to work -Akemi) |
|
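For anyone unsure whether the DEFAULT_BROWSE_MODE workaround in the note above applies to their setup, here is a rough sketch in Python that scans the text of an autofs map for wildcard keys. It assumes the simple one-entry-per-line map layout and does not handle line continuations or included maps (lines starting with +). The function name wildcard_keys and the user names are made up for illustration; only the wildcard line is taken from the note above.

```python
def wildcard_keys(map_text):
    """Return the line numbers of map entries whose key is the wildcard '*'."""
    hits = []
    for lineno, raw in enumerate(map_text.splitlines(), start=1):
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        # The key is the first whitespace-separated field of an entry.
        key = line.split(None, 1)[0]
        if key == "*":
            hits.append(lineno)
    return hits


# Hypothetical auto.home contents (user names are invented).
explicit_map = """\
# auto.home with every entry listed
alice  server:/export/home/alice
bob    server:/export/home/bob
"""

# The wildcard form quoted in the note above.
wildcard_map = "*  server:/export/home/&\n"

print(wildcard_keys(explicit_map))  # prints []
print(wildcard_keys(wildcard_map))  # prints [1]
```

An empty list suggests the map lists every entry explicitly, so the workaround should be usable; any hit means the map relies on a wildcard and the workaround is not expected to help.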
A kernel update (2.6.18-53.1.4.el5) is out today which presumably fixes the autofs issue. Not confirmed (yet). Akemi |
|
Confirmed that kernel 2.6.18-53.1.4.el5 has the autofs patch and that the problem reported here has been fixed. This was tested with x86_64 (as of Nov 29, 20:00 UTC). Akemi |
|
Tested with the i686 kernel and confirmed the problem is gone. Akemi |
|
Well... I am the only person adding notes to this report. I have tested the kernel 2.6.18-53.1.4.el5 from 3 different sources (my own, CentOS, and SciLinux). All worked fine. Also, others are confirming the fix in the upstream bugzilla. Arrfab, as the original reporter, would you agree that this bug report can be marked "Resolved"? Akemi |
|
OK, I confirm that it's solved by using 2.6.18-53.1.4.el5. Can an admin mark the bug as resolved/fixed and close it? |
|
Date Modified | Username | Field | Change |
---|---|---|---|
2007-11-17 15:01 | arrfab | New Issue | |
2007-11-17 15:01 | arrfab | Status | new => assigned |
2007-11-17 17:48 | toracat | Note Added: 0006336 | |
2007-11-20 10:01 | toracat | File Added: linux-2.6-autofs4-fix-race-between-mount-and-expire-2.patch | |
2007-11-20 10:13 | toracat | Note Added: 0006344 | |
2007-11-20 11:03 | toracat | Note Added: 0006345 | |
2007-11-20 19:01 | toracat | Status | assigned => acknowledged |
2007-11-24 17:43 | toracat | Note Added: 0006380 | |
2007-11-29 17:50 | toracat | Note Added: 0006423 | |
2007-11-29 20:38 | toracat | Note Added: 0006424 | |
2007-11-29 21:50 | toracat | Note Added: 0006425 | |
2007-12-01 19:52 | toracat | Note Added: 0006434 | |
2007-12-02 18:50 | arrfab | Note Added: 0006437 | |
2007-12-02 19:24 | kbsingh@karan.org | Status | acknowledged => resolved |
2007-12-02 19:24 | kbsingh@karan.org | Fixed in Version | => 5.1 |
2007-12-02 19:24 | kbsingh@karan.org | Resolution | open => fixed |