Annotation of 43BSDReno/usr.sbin/amd/doc/overview.tex, revision 1.1

1.1     ! root        1: % $Id: overview.tex,v 5.2 90/06/23 22:21:50 jsp Rel $
        !             2: %
        !             3: % Copyright (c) 1989 Jan-Simon Pendry
        !             4: % Copyright (c) 1989 Imperial College of Science, Technology & Medicine
        !             5: % Copyright (c) 1989 The Regents of the University of California.
        !             6: % All rights reserved.
        !             7: %
        !             8: % This code is derived from software contributed to Berkeley by
        !             9: % Jan-Simon Pendry at Imperial College, London.
        !            10: %
        !            11: % Redistribution and use in source and binary forms are permitted provided
        !            12: % that: (1) source distributions retain this entire copyright notice and
        !            13: % comment, and (2) distributions including binaries display the following
        !            14: % acknowledgement:  ``This product includes software developed by the
        !            15: % University of California, Berkeley and its contributors'' in the
        !            16: % documentation or other materials provided with the distribution and in
        !            17: % all advertising materials mentioning features or use of this software.
        !            18: % Neither the name of the University nor the names of its contributors may
        !            19: % be used to endorse or promote products derived from this software without
        !            20: % specific prior written permission.
        !            21: % THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED
        !            22: % WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
        !            23: % MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
        !            24: %
        !            25: %      @(#)overview.tex        5.1 (Berkeley) 7/19/90
        !            26: 
        !            27: 
        !            28: \Chapter{Overview}
        !            29: \pagenumbering{arabic}
        !            30: 
        !            31: \Amd\ maintains a cache of mounted filesystems.  Filesystems are {\em demand-mounted}
        !            32: when they are first referenced, and unmounted after a period of inactivity.
        !            33: \Amd\ may be used as a replacement for Sun's {\bf automount}(8)
        !            34: \cite{usenix:automounter,sun:automount} program.
        !            35: It contains no proprietary source code and has been ported
        !            36: to numerous flavours of \Unix\ (see table \ref{table:os},~p\pageref{table:os}).
        !            37: 
        !            38: \Amd\ was designed as the basis for experimenting with filesystem
        !            39: layout and management.  Although \amd\ has many direct applications it
        !            40: is loaded with additional features which have little practical use.
        !            41: At some point the infrequently used components may be removed to
        !            42: streamline the production system.
        !            43: 
        !            44: %\Amd\ supports the notion of {\em replicated} filesystems by evaluating
        !            45: %each member of a list of possible filesystem locations in parallel.
        !            46: %\Amd\ checks that each cached mapping remains valid.  Should a mapping be
        !            47: %lost -- such as happens when a fileserver goes down -- \amd\ automatically
        !            48: %selects a replacement should one be available.
        !            49: 
        !            50: The fundamental concept behind \amd\ is the ability to separate the name used to refer to
        !            51: a file from the name used to refer to its physical storage location.
        !            52: This allows the same files to be accessed with the same name regardless of where
        !            53: in the network the name is used.  This is very different from placing
        !            54: {\tt /n/hostname} in front of the pathname since that includes location
        !            55: dependent information which may change if files are moved to another
        !            56: machine.
        !            57: By placing the required mappings in a centrally administered database,
        !            58: filesystems can be re-organised without requiring changes to password
        !            59: files, shell scripts and so on.
        !            60: 
        !            61: \Section{Filesystems and Volumes}
        !            62: \Amd\ views the world as a set of fileservers, each containing one or more filesystems
        !            63: where each filesystem contains one or more {\em volumes}.
        !            64: Here the term volume is used to refer to a coherent set of files such as a user's home directory or
        !            65: a \TeX\ distribution.
        !            66: 
        !            67: In order to access the contents of a volume, \amd\ must be told in which filesystem
        !            68: the volume resides and which host owns the filesystem.
        !            69: By default the host is assumed to be local and the volume is
        !            70: assumed to be the entire filesystem.
        !            71: If a filesystem contains more than one volume, then a {\em sublink} is used to
        !            72: refer to the sub-directory within the filesystem where the volume can be found.
        !            73: 
        !            74: \Section{Volume Naming}
        !            75: 
        !            76: Volume names are assumed to be unique across the entire network.
        !            77: A volume name is the pathname to the volume's root as known by the
        !            78: users of that volume.  Since this name uniquely identifies the volume contents,
        !            79: all volumes can be named and accessed from each host, subject to
        !            80: administrative controls.
        !            81: 
        !            82: Volumes may be replicated or duplicated.  Replicated volumes contain identical
        !            83: copies of the same data and reside at two or more locations in the network.
        !            84: Each of the replicated volumes can be used interchangeably.
        !            85: Duplicated volumes each have the same name but contain different, though
        !            86: functionally identical, data.  For example, {\tt /vol/tex} might be the
        !            87: name of a \TeX\ distribution which varied for each machine architecture.
        !            88: 
        !            89: \Amd\ provides facilities to take advantage of both replicated and
        !            90: duplicated volumes.  Configuration options allow a single set of configuration
        !            91: data to be shared across an entire network by taking advantage of replicated
        !            92: and duplicated volumes.
        !            93: 
        !            94: \Amd\ can take advantage of replacement volumes by mounting
        !            95: them as required should an active fileserver become unavailable.
        !            96: 
        !            97: \Section{Volume Binding}
        !            98: 
        !            99: \Unix\ implements a namespace of hierarchically mounted filesystems.
        !           100: Two forms of binding between names and files are provided.
        !           101: A {\em hard link} completes the binding when the name is added to the filesystem.
        !           102: A {\em soft link} delays the binding until the name is accessed.
        !           103: An {\em automounter} adds a further form in which the binding of name to
        !           104: filesystem is delayed until the name is accessed.
        !           105: 
        !           106: The target volume, in its general form, is a tuple (host, filesystem, sublink)
        !           107: which can be used to name the physical location of any volume in
        !           108: the network.
        !           109: 
        !           110: When a target is referenced, \amd\ ignores the sublink element and determines
        !           111: whether the required filesystem is already mounted.  This is done by computing
        !           112: the local mount point for the filesystem and checking for an existing filesystem
        !           113: mounted at the same place.  If such a filesystem already exists then it is
        !           114: assumed to be functionally identical to the target filesystem.  By default
        !           115: there is a one-to-one mapping between the pair (host, filesystem) and the local
        !           116: mount point so this assumption is valid.
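
The mount-point check described above can be sketched as follows.  This is an illustrative Python sketch rather than \amd's own code: the {\tt Target} type, the {\tt /a/host/filesystem} mount-point layout and the {\tt mounted} table are assumptions made for the example.

```python
from typing import NamedTuple, Optional


class Target(NamedTuple):
    """A (host, filesystem, sublink) tuple naming a volume's location."""
    host: str
    filesystem: str
    sublink: Optional[str] = None


def mount_point(target, autodir="/a"):
    # The sublink is deliberately ignored: only (host, filesystem)
    # determines the local mount point.  The layout is an assumption.
    return f"{autodir}/{target.host}{target.filesystem}"


def resolve(target, mounted):
    """Return (path to the volume root, whether the filesystem was already mounted)."""
    mp = mount_point(target)
    already = mp in mounted
    if not already:
        # Stand-in for a real mount; here we only record it.
        mounted[mp] = (target.host, target.filesystem)
    path = mp if target.sublink is None else f"{mp}/{target.sublink}"
    return path, already
```

Two volumes held in the same filesystem share one mount: resolving a second sublink on the same (host, filesystem) pair finds the existing entry and mounts nothing new.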
        !           117: 
        !           118: \Section{Operational Principles}
        !           119: 
        !           120: \Amd\ operates by introducing new mount points into the namespace.
        !           121: The kernel sees these mount points as \NFS\ \cite{sun:nfs} filesystems being served by \amd.
        !           122: Having attached itself to the namespace, \amd\ is now able to control
        !           123: the view the rest of the system has of those mount points.
        !           124: RPC \cite{sun:rpc} calls are received from the kernel one at a time.
        !           125: 
        !           126: When a {\em lookup} call is received \amd\ checks whether the
        !           127: name is already known.  If it is not, the required volume is mounted.
        !           128: A symbolic link pointing to the volume root is then returned.
        !           129: Once the symbolic link is returned, the kernel will send all
        !           130: other requests direct to the mounted filesystem.
        !           131: 
        !           132: If a volume is not yet mounted, \amd\ consults a configuration
        !           133: {\em mount-map} corresponding to the automount point.
        !           134: \Amd\ then makes a runtime decision on what and where to mount
        !           135: a filesystem based on the information obtained from the map.
        !           136: 
        !           137: \Amd\ does not implement all the \NFS\ requests; only those
        !           138: relevant to name binding such as {\em lookup}, {\em readlink}
        !           139: and {\em readdir}.  Some other calls are also implemented
        !           140: but most simply return an error code; for example {\em mkdir}
        !           141: always returns ``Read-only filesystem''.
        !           142: 
        !           143: \Section{Mounting a Volume}
        !           144: 
        !           145: Each automount point has a mount map.  The mount map contains
        !           146: a list of key--value pairs.  The key is the name of the volume to
        !           147: be mounted.  The value is a list of locations describing where the
        !           148: filesystem is stored in the network.
        !           149: In the source for the map the value would look like
        !           150: \begin{quote}
        !           151: ${\em location}_1\ \ {\em location}_2\ \ \ldots\ \ {\em location}_n$
        !           152: \end{quote}
        !           153: 
        !           154: \Amd\ examines each location in turn.  Each location may contain {\em selectors}
        !           155: which control whether \amd\ can use that location.  For example, the location
        !           156: may be restricted to use by certain hosts.  Those locations which cannot be used
        !           157: are ignored.
        !           158: 
        !           159: \Amd\ attempts to mount the filesystem described by each remaining location
        !           160: until a mount succeeds or \amd\ can no longer proceed.
        !           161: The latter can occur in three ways:
        !           162: \begin{itemize}
        !           163: \item
        !           164: If none of
        !           165: the locations could be used, or if all of the locations caused an error,
        !           166: then the last error is returned.
        !           167: 
        !           168: \item
        !           169: If a location could be used but was being mounted in the background then \amd\ marks
        !           170: that mount as being ``in progress'' and continues with the next request; no reply
        !           171: is sent to the kernel.
        !           172: 
        !           173: \item
        !           174: Lastly, one or more of the mounts may have been {\em deferred}.
        !           175: A mount is deferred if extra information is required before the mount
        !           176: can proceed.  When the information becomes available the mount will
        !           177: take place, but in the meantime no reply is sent to the kernel.
        !           178: If the mount is deferred, \amd\ continues to try any remaining locations.
        !           179: \end{itemize}
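
The selection procedure above might be sketched like this.  It is a hedged illustration in Python, not \amd's implementation: the {\tt Outcome} names, the {\tt selector\_ok} and {\tt attempt} callbacks and the placeholder error text are all hypothetical.  The handling of a deferred mount followed by errors is one reading of the text above.

```python
from enum import Enum


class Outcome(Enum):
    MOUNTED = "mounted"
    ERROR = "error"
    IN_PROGRESS = "in progress"   # being mounted in the background
    DEFERRED = "deferred"         # waiting for extra information


def try_locations(locations, selector_ok, attempt):
    """Try each usable location in turn.

    Returns (outcome, detail).  For IN_PROGRESS and DEFERRED the caller
    sends no reply to the kernel.
    """
    last_error = "no usable location"   # placeholder error, an assumption
    deferred = False
    for loc in locations:
        if not selector_ok(loc):
            continue                    # selectors rule this location out
        outcome, detail = attempt(loc)
        if outcome in (Outcome.MOUNTED, Outcome.IN_PROGRESS):
            return outcome, detail      # success, or move on to the next request
        if outcome is Outcome.DEFERRED:
            deferred = True             # keep trying the remaining locations
            continue
        last_error = detail
    if deferred:
        return Outcome.DEFERRED, None
    return Outcome.ERROR, last_error
```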
        !           180: 
        !           181: %\Section{Task Scheduling}\label{task scheduler}
        !           182: %
        !           183: %\Amd\ provides a task scheduler to support its non-blocking semantics.
        !           184: %The basic operation of the scheduler is to call a procedure when
        !           185: %a particular event occurs.  A general sleep/wakeup mechanism is used
        !           186: %and sub-process support is built on that.  The scheduler maintains
        !           187: %two queues: one of blocked calls and one of callbacks waiting to
        !           188: %be made.
        !           189: %When a child process exits, its exit status is picked up by a signal
        !           190: %handler and a wakeup is issued on the internal job descriptor for that sub-process.
        !           191: %A timeout/untimeout mechanism provides for time dependent processing.
        !           192: 
        !           193: \Section{Automatic Unmounting}
        !           194: 
        !           195: To avoid an ever increasing number of filesystem mounts, \amd\ removes
        !           196: volume mappings which have not been used recently.  A time-to-live interval
        !           197: is associated with each mapping and when that expires the mapping is removed.
        !           198: When the last reference to a filesystem is removed, that filesystem is unmounted.
        !           199: If the unmount fails, for example because the filesystem is still busy, the mapping
        !           200: is re-instated and its time-to-live interval is extended.
        !           201: The global default for this grace period is controlled by the ``-w'' command-line
        !           202: option (\see \Ref{opt:wait}).  It is also possible to set this value on a per-mount basis
        !           203: (\see \Ref{opt:utimeout}).
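
A minimal sketch of the expiry logic, assuming a simple in-memory table of mappings.  The {\tt Mapping} class, the {\tt unmount} callback and the {\tt grace} parameter are hypothetical names chosen for illustration; they are not taken from \amd's source.

```python
class Mapping:
    """A volume mapping with a time-to-live, referring to one filesystem."""

    def __init__(self, name, filesystem, ttl, now):
        self.name = name
        self.filesystem = filesystem
        self.expires = now + ttl


def expire(mappings, now, unmount, grace):
    """Remove expired mappings; unmount filesystems left without references.

    `unmount(filesystem)` returns True on success.  If it fails (for
    example because the filesystem is busy) the mapping is re-instated
    and its time-to-live is extended by `grace` seconds.
    """
    expired = [m for m in mappings.values() if m.expires <= now]
    for m in expired:
        del mappings[m.name]
        still_used = any(o.filesystem == m.filesystem for o in mappings.values())
        if not still_used and not unmount(m.filesystem):
            m.expires = now + grace
            mappings[m.name] = m
```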
        !           204: 
        !           205: \Section{Keep-alives}\label{keepalives}
        !           206: 
        !           207: Use of some filesystem types requires the presence of a server on another machine.
        !           208: If a machine crashes then it is of no concern to processes on that machine
        !           209: that the filesystem is unavailable.  However, to processes on a remote host using
        !           210: that machine as a fileserver this event is important.  This situation is
        !           211: most widely recognised when an \NFS\ server crashes and the behaviour observed
        !           212: on client machines is that more and more processes hang.
        !           213: In order to provide the possibility of recovery, \amd\ implements a {\em keep-alive}
        !           214: interval timer for some filesystem types.
        !           215: Currently only \NFS\ makes use of this service.
        !           216: 
        !           217: The basis of the \NFS\ keep-alive implementation is the observation that
        !           218: most sites maintain replicated copies of common system data such as manual
        !           219: pages, most or all programs, system source code and so on.
        !           220: If one of those servers goes down it would be reasonable to mount one of
        !           221: the others as a replacement.
        !           222: 
        !           223: The first part of the process is to keep track of which fileservers are up and
        !           224: which are down.  \Amd\ does this by sending RPC requests to the servers'
        !           225: \NFS\ {\sc NullProc} and checking whether a reply is returned.
        !           226: While the server state is uncertain the requests are re-transmitted
        !           227: at three second intervals and if no reply is received after four attempts
        !           228: the server is marked down.  If a reply is received the fileserver is marked
        !           229: up and stays in that state for 30 seconds at which time another \NFS\ ping is sent.
        !           230: 
        !           231: Once a fileserver is marked down, requests continue to be sent every 30 seconds
        !           232: in order to determine when the fileserver comes back up.  During this time
        !           233: any reference through \amd\ to the filesystems on that server fails with the
        !           234: error ``Operation would block''.
        !           235: If a replacement volume is available then it will be mounted, otherwise
        !           236: the error is returned to the user.
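
The timing rules above amount to a small state machine, sketched here in Python.  The intervals come from the text; the class and method names are hypothetical, and each method returns the delay in seconds before the next ping should be sent.

```python
PING_RETRY = 3       # seconds between retransmissions while the state is uncertain
MAX_ATTEMPTS = 4     # unanswered pings before the server is marked down
PING_INTERVAL = 30   # seconds between pings once the state is known


class ServerState:
    def __init__(self):
        self.status = "unknown"
        self.unanswered = 0

    def on_reply(self):
        """A reply arrived: mark the server up and ping again later."""
        self.status = "up"
        self.unanswered = 0
        return PING_INTERVAL

    def on_timeout(self):
        """No reply arrived within the retry interval."""
        if self.status == "down":
            return PING_INTERVAL        # keep probing a down server slowly
        self.unanswered += 1
        if self.unanswered >= MAX_ATTEMPTS:
            self.status = "down"
            return PING_INTERVAL
        return PING_RETRY               # state uncertain: retransmit quickly
```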
        !           237: 
        !           238: %\Amd\ keeps track of which servers are up and which are down.
        !           239: %It does this by sending RPC requests to the servers' \NFS\ {\sc NullProc} and
        !           240: %checking whether a reply is returned.  If no replies are received after a
        !           241: %short period, \amd\ marks the fileserver {\em down}.
        !           242: %RPC requests continue to be sent so that it will notice when a fileserver
        !           243: %comes back up.
        !           244: %ICMP echo packets \cite{rfc:icmp} are not used because it is the availability
        !           245: %of the \NFS\ service that is important, not the existence of a base kernel.
        !           246: 
        !           247: %Whenever a reference to a fileserver which is down is made via \amd\, an alternate
        !           248: %filesystem is mounted if one is available.
        !           249: Although this action does not protect
        !           250: user files, which are unique on the network, or processes which do not access files
        !           251: via \amd\ or already have open files on the hung filesystem, it can prevent most new
        !           252: processes from hanging.
        !           253: 
        !           254: %With a suitable combination of filesystem management and mount-maps,
        !           255: %machines can be protected against most server downtime.  This can be
        !           256: %enhanced by allocating boot-servers dynamically which allows a diskless
        !           257: %workstation to be quickly restarted if necessary.  Once the root filesystem
        !           258: %is mounted, \amd\ can be started and allowed to mount the remainder of
        !           259: %the filesystem from whichever fileservers are available.
        !           260: 
        !           261: \Section{Non-blocking Operation}
        !           262: 
        !           263: Since there is only one instance of \amd\ for each automount point,
        !           264: and usually only one instance on each machine, it is important
        !           265: that it is always available to service kernel calls.
        !           266: \Amd\ goes to great lengths to ensure that it does not block in a system call.
        !           267: As a last resort \amd\ will fork before it attempts a system call that may block
        !           268: indefinitely, such as mounting an \NFS\ filesystem.
        !           269: Other tasks, such as obtaining filehandle information for an \NFS\ filesystem,
        !           270: are done using a purpose-built non-blocking RPC library which is integrated
        !           271: with \amd's task scheduler.% (\see \Ref{task scheduler}).
        !           272: This library is also used to implement \NFS\ keep-alives (\see \Ref{keepalives}).
        !           273: 
        !           274: Whenever a mount is deferred or backgrounded, \amd\ must wait for it to complete
        !           275: before replying to the kernel.  However, this would cause \amd\ to block waiting
        !           276: for a reply to be constructed.  Rather than do this, \amd\ simply {\em drops}
        !           277: the call under the assumption that the kernel RPC mechanism will automatically
        !           278: retry the request.
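
The drop-and-retry behaviour can be illustrated with a short Python sketch.  It assumes that a return value of {\tt None} stands for ``send no reply''; the function name, the {\tt mounts} and {\tt pending} tables and the {\tt start\_mount} callback are hypothetical.

```python
def handle_lookup(name, mounts, pending, start_mount):
    """Answer a lookup, or drop it while the mount is still outstanding.

    Returns ("symlink", path) once the volume is mounted.  Returns None
    (no reply) otherwise, relying on the kernel to retry the request.
    """
    if name in mounts:
        return ("symlink", mounts[name])
    if name not in pending:
        pending.add(name)
        start_mount(name)   # must not block; completion updates `mounts`
    return None             # drop the call instead of waiting
```

A retried request for the same name finds the mount either still pending, in which case it is dropped again without starting a second mount, or complete, in which case the symbolic link is returned.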
