% $Id: overview.tex,v 5.2 90/06/23 22:21:50 jsp Rel $
%
% Copyright (c) 1989 Jan-Simon Pendry
% Copyright (c) 1989 Imperial College of Science, Technology & Medicine
% Copyright (c) 1989 The Regents of the University of California.
% All rights reserved.
%
% This code is derived from software contributed to Berkeley by
% Jan-Simon Pendry at Imperial College, London.
%
% Redistribution and use in source and binary forms are permitted provided
% that: (1) source distributions retain this entire copyright notice and
% comment, and (2) distributions including binaries display the following
% acknowledgement: ``This product includes software developed by the
% University of California, Berkeley and its contributors'' in the
% documentation or other materials provided with the distribution and in
% all advertising materials mentioning features or use of this software.
% Neither the name of the University nor the names of its contributors may
% be used to endorse or promote products derived from this software without
% specific prior written permission.
% THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED
% WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
% MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
%
% @(#)overview.tex 5.1 (Berkeley) 7/19/90


\Chapter{Overview}
\pagenumbering{arabic}

\Amd\ maintains a cache of mounted filesystems. Filesystems are {\em demand-mounted}
when they are first referenced, and unmounted after a period of inactivity.
\Amd\ may be used as a replacement for Sun's {\bf automount}(8)
\cite{usenix:automounter,sun:automount} program.
It contains no proprietary source code and has been ported
to numerous flavours of \Unix\ (see table \ref{table:os},~p\pageref{table:os}).

\Amd\ was designed as the basis for experimenting with filesystem
layout and management. Although \amd\ has many direct applications it
is loaded with additional features which have little practical use.
At some point the infrequently used components may be removed to
streamline the production system.

%\Amd\ supports the notion of {\em replicated} filesystems by evaluating
%each member of a list of possible filesystem locations in parallel.
%\Amd\ checks that each cached mapping remains valid. Should a mapping be
%lost -- such as happens when a fileserver goes down -- \amd\ automatically
%selects a replacement should one be available.

The fundamental concept behind \amd\ is the ability to separate the name used to refer to
a file from the name used to refer to its physical storage location.
This allows the same files to be accessed with the same name regardless of where
in the network the name is used. This is very different from placing
{\tt /n/hostname} in front of the pathname since that includes location
dependent information which may change if files are moved to another
machine.
By placing the required mappings in a centrally administered database,
filesystems can be re-organised without requiring changes to password
files, shell scripts and so on.

\Section{Filesystems and Volumes}
\Amd\ views the world as a set of fileservers, each containing one or more filesystems
where each filesystem contains one or more {\em volumes}.
Here the term volume is used to refer to a coherent set of files such as a user's home directory or
a \TeX\ distribution.

In order to access the contents of a volume, \amd\ must be told in which filesystem
the volume resides and which host owns the filesystem.
By default the host is assumed to be local and the volume is
assumed to be the entire filesystem.
If a filesystem contains more than one volume, then a {\em sublink} is used to
refer to the sub-directory within the filesystem where the volume can be found.

\Section{Volume Naming}

Volume names are assumed to be unique across the entire network.
A volume name is the pathname to the volume's root as known by the
users of that volume. Since this name uniquely identifies the volume contents,
all volumes can be named and accessed from each host, subject to
administrative controls.

Volumes may be replicated or duplicated. Replicated volumes contain identical
copies of the same data and reside at two or more locations in the network.
Each of the replicated volumes can be used interchangeably.
Duplicated volumes each have the same name but contain different, though
functionally identical, data. For example, {\tt /vol/tex} might be the
name of a \TeX\ distribution which varied for each machine architecture.

\Amd\ provides facilities to take advantage of both replicated and
duplicated volumes. Configuration options allow a single set of configuration
data to be shared across an entire network by taking advantage of replicated
and duplicated volumes.

\Amd\ can take advantage of replacement volumes by mounting
them as required should an active fileserver become unavailable.

\Section{Volume Binding}

\Unix\ implements a namespace of hierarchically mounted filesystems.
Two forms of binding between names and files are provided.
A {\em hard link} completes the binding when the name is added to the filesystem.
A {\em soft link} delays the binding until the name is accessed.
An {\em automounter} adds a further form in which the binding of name to
filesystem is delayed until the name is accessed.

The target volume, in its general form, is a tuple (host, filesystem, sublink)
which can be used to name the physical location of any volume in
the network.

When a target is referenced, \amd\ ignores the sublink element and determines
whether the required filesystem is already mounted. This is done by computing
the local mount point for the filesystem and checking for an existing filesystem
mounted at the same place. If such a filesystem already exists then it is
assumed to be functionally identical to the target filesystem. By default
there is a one-to-one mapping between the pair (host, filesystem) and the local
mount point so this assumption is valid.

\Section{Operational Principles}

\Amd\ operates by introducing new mount points into the namespace.
The kernel sees these mount points as \NFS\ \cite{sun:nfs} filesystems being served by \amd.
Having attached itself to the namespace, \amd\ is now able to control
the view the rest of the system has of those mount points.
RPC \cite{sun:rpc} calls are received from the kernel one at a time.

When a {\em lookup} call is received \amd\ checks whether the
name is already known. If it is not, the required volume is mounted.
A symbolic link pointing to the volume root is then returned.
Once the symbolic link is returned, the kernel will send all
other requests direct to the mounted filesystem.

If a volume is not yet mounted, \amd\ consults a configuration
{\em mount-map} corresponding to the automount point.
\Amd\ then makes a runtime decision on what and where to mount
a filesystem based on the information obtained from the map.

\Amd\ does not implement all the \NFS\ requests; only those
relevant to name binding such as {\em lookup}, {\em readlink}
and {\em readdir}.
Some other calls are also implemented
but most simply return an error code; for example {\em mkdir}
always returns ``Read-only filesystem''.

\Section{Mounting a Volume}

Each automount point has a mount map. The mount map contains
a list of key--value pairs. The key is the name of the volume to
be mounted. The value is a list of locations describing where the
filesystem is stored in the network.
In the source for the map the value would look like
\begin{quote}
${\em location}_1\ \ {\em location}_2\ \ \ldots\ \ {\em location}_n$
\end{quote}

\Amd\ examines each location in turn. Each location may contain {\em selectors}
which control whether \amd\ can use that location. For example, the location
may be restricted to use by certain hosts. Those locations which cannot be used
are ignored.

\Amd\ attempts to mount the filesystem described by each remaining location
until a mount succeeds or \amd\ can no longer proceed.
The latter can occur in three ways:
\begin{itemize}
\item
If none of
the locations could be used, or if all of the locations caused an error,
then the last error is returned.

\item
If a location could be used but was being mounted in the background then \amd\ marks
that mount as being ``in progress'' and continues with the next request; no reply
is sent to the kernel.

\item
Lastly, one or more of the mounts may have been {\em deferred}.
A mount is deferred if extra information is required before the mount
can proceed. When the information becomes available the mount will
take place, but in the meantime no reply is sent to the kernel.
If the mount is deferred, \amd\ continues to try any remaining locations.
\end{itemize}

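The location scan described above can be sketched in a few lines of Python.
This is only an illustration of the control flow, not \amd's actual code: the names
{\tt Outcome}, {\tt Location} and {\tt try\_locations} are hypothetical, as is the
placeholder error used when no location is usable, and each mount attempt is modelled
as a callable that reports its outcome immediately.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable, List, Optional, Tuple


class Outcome(Enum):
    MOUNTED = auto()      # the mount succeeded
    ERROR = auto()        # the mount failed with an error
    IN_PROGRESS = auto()  # the mount is running in the background
    DEFERRED = auto()     # more information is needed before mounting


@dataclass
class Location:
    name: str
    # Hypothetical selector: True if this host may use the location.
    selector: Callable[[str], bool]
    # Hypothetical mount attempt returning (outcome, error text or None).
    attempt: Callable[[], Tuple[Outcome, Optional[str]]]


def try_locations(host: str, locations: List[Location]) -> Tuple[str, Optional[str]]:
    """Return ("reply", name), ("error", text) or ("no-reply", None)."""
    last_error = "no usable location"  # assumed placeholder, not amd's real error
    deferred = False
    for loc in locations:
        if not loc.selector(host):
            continue  # locations which cannot be used are ignored
        outcome, err = loc.attempt()
        if outcome is Outcome.MOUNTED:
            return ("reply", loc.name)
        if outcome is Outcome.IN_PROGRESS:
            return ("no-reply", None)  # move on to the next request
        if outcome is Outcome.DEFERRED:
            deferred = True  # keep trying the remaining locations
            continue
        last_error = err
    if deferred:
        return ("no-reply", None)
    return ("error", last_error)
```

A caller would treat {\tt "no-reply"} as an instruction to send nothing to the kernel
for this request and to wait for the pending mount to finish.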
%\Section{Task Scheduling}\label{task scheduler}
%
%\Amd\ provides a task scheduler to support its non-blocking semantics.
%The basic operation of the scheduler is to call a procedure when
%a particular event occurs. A general sleep/wakeup mechanism is used
%and sub-process support is built on that. The scheduler maintains
%two queues: one of blocked calls and one of callbacks waiting to
%be made.
%When a child process exits, its exit status is picked up by a signal
%handler and a wakeup is issued on the internal job descriptor for that sub-process.
%A timeout/untimeout mechanism provides for time dependent processing.

\Section{Automatic Unmounting}

To avoid an ever increasing number of filesystem mounts, \amd\ removes
volume mappings which have not been used recently. A time-to-live interval
is associated with each mapping and when that expires the mapping is removed.
When the last reference to a filesystem is removed, that filesystem is unmounted.
If the unmount fails, for example because the filesystem is still busy, the mapping
is re-instated and its time-to-live interval is extended.
The global default for this grace period is controlled by the ``-w'' command-line
option (\see \Ref{opt:wait}). It is also possible to set this value on a per-mount basis
(\see \Ref{opt:utimeout}).

\Section{Keep-alives}\label{keepalives}

Use of some filesystem types requires the presence of a server on another machine.
If a machine crashes then it is of no concern to processes on that machine
that the filesystem is unavailable. However, to processes on a remote host using
that machine as a fileserver this event is important. This situation is
most widely recognised when an \NFS\ server crashes and the behaviour observed
on client machines is that more and more processes hang.
In order to provide the possibility of recovery, \amd\ implements a {\em keep-alive}
interval timer for some filesystem types.
Currently only \NFS\ makes use of this service.

The basis of the \NFS\ keep-alive implementation is the observation that
most sites maintain replicated copies of common system data such as manual
pages, most or all programs, system source code and so on.
If one of those servers goes down it would be reasonable to mount one of
the others as a replacement.

The first part of the process is to keep track of which fileservers are up and
which are down. \Amd\ does this by sending RPC requests to the servers'
\NFS\ {\sc NullProc} and checking whether a reply is returned.
While the server state is uncertain the requests are re-transmitted
at three second intervals and if no reply is received after four attempts
the server is marked down. If a reply is received the fileserver is marked
up and stays in that state for 30 seconds at which time another \NFS\ ping is sent.

Once a fileserver is marked down, requests continue to be sent every 30 seconds
in order to determine when the fileserver comes back up. During this time
any reference through \amd\ to the filesystems on that server fails with the
error ``Operation would block''.
If a replacement volume is available then it will be mounted, otherwise
the error is returned to the user.

%\Amd\ keeps track of which servers are up and which are down.
%It does this by sending RPC requests to the servers' \NFS\ {\sc NullProc} and
%checking whether a reply is returned. If no replies are received after a
%short period, \amd\ marks the fileserver {\em down}.
%RPC requests continue to be sent so that it will notice when a fileserver
%comes back up.
%ICMP echo packets \cite{rfc:icmp} are not used because it is the availability
%of the \NFS\ service that is important, not the existence of a base kernel.

%Whenever a reference to a fileserver which is down is made via \amd\, an alternate
%filesystem is mounted if one is available.
Although this action does not protect
user files, which are unique on the network, or processes which do not access files
via \amd\ or already have open files on the hung filesystem, it can prevent most new
processes from hanging.

%With a suitable combination of filesystem management and mount-maps,
%machines can be protected against most server downtime. This can be
%enhanced by allocating boot-servers dynamically which allows a diskless
%workstation to be quickly restarted if necessary. Once the root filesystem
%is mounted, \amd\ can be started and allowed to mount the remainder of
%the filesystem from whichever fileservers are available.

\Section{Non-blocking Operation}

Since there is only one instance of \amd\ for each automount point,
and usually only one instance on each machine, it is important
that it is always available to service kernel calls.
\Amd\ goes to great lengths to ensure that it does not block in a system call.
As a last resort \amd\ will fork before it attempts a system call that may block
indefinitely, such as mounting an \NFS\ filesystem.
Other tasks such as obtaining filehandle information for an \NFS\ filesystem,
are done using a purpose built non-blocking RPC library which is integrated
with \amd's task scheduler.% (\see \Ref{task scheduler}).
This library is also used to implement \NFS\ keep-alives (\see \Ref{keepalives}).

Whenever a mount is deferred or backgrounded, \amd\ must wait for it to complete
before replying to the kernel. However, this would cause \amd\ to block waiting
for a reply to be constructed. Rather than do this, \amd\ simply {\em drops}
the call under the assumption that the kernel RPC mechanism will automatically
retry the request.
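
The drop-and-retry behaviour can be illustrated with a small simulation.
This is a sketch under stated assumptions rather than a model of any real kernel:
{\tt DroppingServer} and {\tt call\_with\_retry} are hypothetical names, the number
of dropped calls stands in for the time a background mount takes, and the returned
path is an invented example of a link target.

```python
from typing import Optional, Tuple


class DroppingServer:
    """Stands in for amd: drops lookups until the pending mount completes."""

    def __init__(self, ready_after: int) -> None:
        self.calls = 0
        # Hypothetical: the mount is assumed complete after this many calls.
        self.ready_after = ready_after

    def lookup(self, name: str) -> Optional[str]:
        self.calls += 1
        if self.calls <= self.ready_after:
            return None  # drop the call: no reply is constructed or sent
        return "/a/" + name  # invented link target for illustration


def call_with_retry(server: DroppingServer, name: str,
                    max_tries: int) -> Tuple[Optional[str], int]:
    """Stands in for the kernel RPC layer: resend until a reply arrives."""
    for attempt in range(1, max_tries + 1):
        reply = server.lookup(name)
        if reply is not None:
            return reply, attempt
    return None, max_tries  # the caller gave up without ever seeing a reply
```

The server never waits inside {\tt lookup}, so it remains free to service other
calls; the cost of the pending mount is paid by the retrying caller.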