[NCLUG] autofs behaviour
Neil Neely
neil at neely.cx
Mon Jul 14 16:57:19 MDT 2008
On Jul 11, 2008, at 9:29 AM, Michael Coffman wrote:
> I have a question about the behaviour of autofs. It seems like it is
> very easy to hit a situation where autofs will unmount a file system
> and an active process will lose access to a file and fail to read it.
> This seems to happen with both direct and indirect mapping.
>
> Specifics about what I am running:
>
> autofs-4.1.3-199.3
> kernel - 2.6.9-55.ELsmp
> arch - x86_64
> OS - Red Hat Enterprise Linux WS release 4 (Nahant Update 5)
>
> autofs is configured to use the defaults and has host mapping
> enabled. When I run the following simple test:
>
> date;while [[ -r /net/hostname/test ]]; do :; done; date
>
> command output:
> Fri Jul 11 09:02:46 MDT 2008
> Fri Jul 11 09:03:59 MDT 2008
> command output run 2:
> Fri Jul 11 09:11:42 MDT 2008
> Fri Jul 11 09:12:45 MDT 2008
>
> As you can see, I lose read access typically shortly after the mount
> is supposed to time out.
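>
> For reference, the timeout in question lives in the master map. A
> minimal sketch of a stock layout (the timeout value here is an
> assumption, not something I verified on this box):
>
>   # /etc/auto.master
>   /net    -hosts    --timeout=60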
>
> I believe that after the timeout, autofs tries to unmount file
> systems; if they are busy, it resets the timeout counter and, in
> the case of a host map, remounts any mount points it did unmount.
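>
> One way to watch the unmount/remount happen (hostname is a
> placeholder for the real server) is to poll the kernel mount table
> while the test above runs:
>
>   # prints a matching line while the mount exists, goes quiet
>   # for a moment when autofs drops it
>   while :; do grep hostname /proc/mounts; sleep 1; done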
>
> But in general, I would think that I should never lose read access
> to the
> file. Shouldn't autofs just remount it if it has been unmounted?
>
> Anyone have any insight?
>
> BTW: am-utils (amd) does not behave this way. I am guessing this
> has to do with the fact that it runs in user space.
>
> --
> -MichaelC
I'm going to cross-post something from a LOPSA mailing list that may
be relevant to this - at the least I found it interesting, so I'm
sharing it here. The short form is that the interaction of autofs
and soft mounts creates some quirky behavior that could possibly
explain what you are seeing.
Neil Neely
http://neil-neely.blogspot.com
Discussion from lopsa-tech mailing list follows:
On Jul 14, 2008, at 4:43 PM, rackow at mcs.anl.gov wrote:
> How are you doing your mounts? Are they hard or soft mounts? From
> your description, I presume it's a soft mount. This tends to cause
> these kinds of problems. With a hard mount, you get a return of
> true/false on the existence of a file. With a soft mount,
> you get true/false/wait-a-bit. I know these are not the right
> terms, but you get the idea. The problem is that most tools do
> not understand that third return state, so they treat it as a
> false/failure, and strange things happen. When you retry, the
> mount has completed, so you are back to the true/false realm of
> states.
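>
> To make the distinction concrete, it comes down to the NFS mount
> options (server and export below are made up):
>
>   # soft: an I/O call gives up and errors out after retrans retries
>   mount -o soft,timeo=14,retrans=3 server:/export /mnt/export
>   # hard: an I/O call blocks until the server answers; intr lets
>   # you interrupt a stuck process
>   mount -o hard,intr server:/export /mnt/export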
>
> I've run into so many problems with "soft mounts" because of this
> that I'll never use them on a system I manage. While I can
> see some value in not hanging a system waiting for a mount to
> complete, it really doesn't outweigh the problems it causes
> with things like this.
>
> -Gene
>
>
> "Derek J. Balling" made the following keystrokes:
>> I've been seeing this phenomenon for a while now, but can't figure
>> out
>> what setting I need to tweak to make it not be a problem.
>>
>> We're using autofs/NFS for our home directories on CentOS (RHEL) 5.1.
>> If I SSH into a machine (using SSH keys), INITIALLY the key will
>> fail and it will fall back to my password. But that attempt to
>> read my keyfile has triggered the NFS mount of my home directory,
>> and any subsequent attempts will succeed.
>>
>> Likewise, if we try to run shell scripts like /mnt/scripts/foo.sh,
>> the first time we issue the command we'll get a file not found,
>> and then if we simply hit up-arrow and run it again, it works
>> fine.
>>
>> It's especially problematic in scripts. I've taken to doing stupid
>> hacks like:
>>
>> ls /mnt/scripts 2>/dev/null 1>/dev/null ; /mnt/scripts/scriptName.sh arg1 arg2
>>
>> Just to let the ls trigger the automount (and "fail" if it must)
>> so the mount is there for the script that has to get called.
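>>
>> A slightly more robust version of the same hack (an untested
>> sketch; the retry count is arbitrary) is to poll until the
>> automount is actually there:
>>
>>   # poke the path until autofs completes the mount, then run it
>>   for i in 1 2 3 4 5; do
>>       [ -x /mnt/scripts/scriptName.sh ] && break
>>       sleep 1
>>   done
>>   /mnt/scripts/scriptName.sh arg1 arg2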
>>
>> Is there some setting somewhere I need to tweak? Has anyone seen this
>> before?
>>
>> cheers,
>> D