|
|
1.1 ! root 1: .\" @(#)p2 6.1 (Berkeley) 4/24/86 ! 2: .\" ! 3: .SH ! 4: 3.6 I/O calls ! 5: .PP ! 6: The system calls to do I/O are designed to eliminate ! 7: the differences between the various devices and styles of ! 8: access. ! 9: There is no distinction between ``random'' ! 10: and ``sequential'' I/O, nor is any logical record size imposed ! 11: by the system. ! 12: The size of an ordinary file is determined ! 13: by the number of bytes written on it; ! 14: no predetermination of the size of a file is necessary ! 15: or possible. ! 16: .PP ! 17: To illustrate the essentials of I/O, ! 18: some of the basic calls are ! 19: summarized below ! 20: in an anonymous language that will ! 21: indicate the required parameters without getting into the ! 22: underlying ! 23: complexities. ! 24: Each call to the system may potentially result in an error ! 25: return, which for simplicity is not represented ! 26: in the calling sequence. ! 27: .PP ! 28: To read or write a file assumed to exist already, it must ! 29: be opened by the following call: ! 30: .P1 ! 31: filep = open\|(\|name, flag\|) ! 32: .P2 ! 33: where ! 34: .UL name ! 35: indicates the name of the file. ! 36: An arbitrary path name may be given. ! 37: The ! 38: .UL flag ! 39: argument indicates whether the file is to be read, written, ! 40: or ``updated,'' that is, read and written simultaneously. ! 41: .PP ! 42: The returned value ! 43: .UL filep ! 44: is called a ! 45: .IT "file descriptor" . ! 46: It is a small integer used to identify the file ! 47: in subsequent calls to read, write, ! 48: or otherwise manipulate the file. ! 49: .PP ! 50: To create a new file or completely rewrite an old one, ! 51: there is a ! 52: .UL create ! 53: system call that ! 54: creates the given file if it does not exist, ! 55: or truncates it to zero length ! 56: if it does exist; ! 57: .UL create ! 58: also opens the new file for writing ! 59: and, like ! 60: .UL open , ! 61: returns a file descriptor. ! 62: .PP ! 63: The file system maintains no locks visible to the user, nor is there any ! 64: restriction on the number of users who may have a file ! 65: open for reading or writing. ! 66: Although it is possible for the contents of a file ! 67: to become scrambled when two users write on it simultaneously, ! 68: in practice difficulties do not arise. ! 69: We take the view that locks are neither ! 70: necessary nor sufficient, in our environment, ! 71: to prevent interference between users of the same file. ! 72: They are unnecessary because we are not ! 73: faced with large, single-file data bases ! 74: maintained by independent processes. ! 75: They are insufficient because ! 76: locks in the ordinary sense, whereby ! 77: one user is prevented from writing on a file that another ! 78: user is reading, ! 79: cannot prevent confusion ! 80: when, for example, both users are editing ! 81: a file with an editor that makes ! 82: a copy of the file being edited. ! 83: .PP ! 84: There are, however, ! 85: sufficient internal interlocks to maintain ! 86: the logical consistency of the file system ! 87: when two users engage simultaneously in ! 88: activities such as writing on ! 89: the same file, ! 90: creating files in the same directory, ! 91: or deleting each other's open files. ! 92: .PP ! 93: Except as indicated below, reading and writing ! 94: are sequential. ! 95: This means that if a particular ! 96: byte in the file was the last byte written (or read), ! 97: the next I/O call implicitly refers to the ! 98: immediately following byte. ! 99: For each open file there is a pointer, maintained ! 100: inside the system, ! 101: that indicates the next byte to be read ! 102: or written. ! 103: If ! 104: .IT n ! 105: bytes are read or written, the pointer advances ! 106: by ! 107: .IT n ! 108: bytes. ! 109: .PP ! 110: Once a file is open, the following calls ! 111: may be used: ! 112: .P1 ! 113: n = read\|(\|filep, buffer, count\|) ! 114: n = write\|(\|filep, buffer, count\|) ! 115: .P2 ! 116: Up to ! 117: .UL count ! 118: bytes are transmitted between the file specified ! 119: by ! 120: .UL filep ! 121: and the byte array ! 122: specified by ! 123: .UL buffer . ! 124: The returned value ! 125: .UL n ! 126: is the number of bytes actually transmitted. ! 127: In the ! 128: .UL write ! 129: case, ! 130: .UL n ! 131: is the same as ! 132: .UL count ! 133: except under exceptional conditions, such as I/O errors or ! 134: end of physical medium on special files; ! 135: in a ! 136: .UL read , ! 137: however, ! 138: .UL n ! 139: may without error be less than ! 140: .UL count . ! 141: If the read pointer is so near the end of the ! 142: file that reading ! 143: .UL count ! 144: characters ! 145: would cause reading beyond the end, only sufficient ! 146: bytes are transmitted to reach the end of the ! 147: file; ! 148: also, typewriter-like terminals ! 149: never return more than one line of input. ! 150: When a ! 151: .UL read ! 152: call returns with ! 153: .UL n ! 154: equal ! 155: to zero, the end of the file has been reached. ! 156: For disk files this occurs when the read pointer ! 157: becomes equal to the current ! 158: size of the file. ! 159: It is possible to generate an end-of-file ! 160: from a terminal by use of an escape ! 161: sequence that depends on the device used. ! 162: .PP ! 163: Bytes written affect only those parts of a file implied by ! 164: the position of the write pointer and the ! 165: count; no other part of the file ! 166: is changed. ! 167: If the last byte lies beyond the end of the file, the ! 168: file is made to grow as needed. ! 169: .PP ! 170: To do random (direct-access) I/O ! 171: it is only necessary to move the read or write pointer ! 172: to the appropriate location in the file. ! 173: .P1 ! 174: location = lseek\|(\|filep, offset, base\|) ! 175: .P2 ! 176: The pointer ! 177: associated with ! 178: .UL filep ! 179: is moved to a position ! 180: .UL offset ! 181: bytes from the beginning of the file, from the current position ! 182: of the pointer, or from the end of the file, ! 183: depending on ! 184: .UL base. ! 185: .UL \&offset ! 186: may be negative. ! 187: For some devices (e.g., paper ! 188: tape and ! 189: terminals) seek calls are ! 190: ignored. ! 191: The actual offset from the beginning of the file ! 192: to which the pointer was moved is returned ! 193: in ! 194: .UL location . ! 195: .PP ! 196: There are several additional system entries ! 197: having to do with I/O and with the file ! 198: system that will not be discussed. ! 199: For example: ! 200: close a file, ! 201: get the status of a file, ! 202: change the protection mode or the owner ! 203: of a file, ! 204: create a directory, ! 205: make a link to an existing file, ! 206: delete a file. ! 207: .SH ! 208: IV. IMPLEMENTATION OF THE FILE SYSTEM ! 209: .PP ! 210: As mentioned in Section 3.2 above, a directory entry contains ! 211: only a name for the associated file and a pointer to the ! 212: file itself. ! 213: This pointer is an integer called the ! 214: .IT i-number ! 215: (for index number) ! 216: of the file. ! 217: When the file is accessed, ! 218: its i-number is used as an index into ! 219: a system table (the ! 220: .IT i-list \|) ! 221: stored in a known ! 222: part of the device on which ! 223: the directory resides. ! 224: The entry found thereby (the file's ! 225: .IT i-node \|) ! 226: contains ! 227: the description of the file: ! 228: .IP i ! 229: the user and group-\*sID\*n of its owner ! 230: .IP ii ! 231: its protection bits ! 232: .IP iii ! 233: the physical disk or tape addresses for the file contents ! 234: .IP iv ! 235: its size ! 236: .IP v ! 237: time of creation, last use, and last modification ! 238: .IP vi ! 239: the number of links to the file, that is, the number of times it appears in a directory ! 240: .IP vii ! 241: a code indicating whether the file is a directory, an ordinary file, or a special file. ! 242: .LP ! 243: The purpose of an ! 244: .UL open ! 245: or ! 246: .UL create ! 247: system call is to turn the path name given by the user ! 248: into an i-number ! 249: by searching the explicitly or implicitly named directories. ! 250: Once a file is open, ! 251: its device, i-number, and read/write pointer are stored in a system table ! 252: indexed by the file descriptor returned by the ! 253: .UL open ! 254: or ! 255: .UL create . ! 256: Thus, during a subsequent ! 257: call to read or write the ! 258: file, ! 259: the descriptor ! 260: may be easily related to the information necessary to access the file. ! 261: .PP ! 262: When a new file is created, ! 263: an i-node is allocated for it and a directory entry is made ! 264: that contains the name of the file and the i-node ! 265: number. ! 266: Making a link to an existing file involves ! 267: creating a directory entry with the new name, ! 268: copying the i-number from the original file entry, ! 269: and incrementing the link-count field of the i-node. ! 270: Removing (deleting) a file is done by ! 271: decrementing the ! 272: link-count of the i-node specified by its directory entry ! 273: and erasing the directory entry. ! 274: If the link-count drops to 0, ! 275: any disk blocks in the file ! 276: are freed and the i-node is de-allocated. ! 277: .PP ! 278: The space on all disks that ! 279: contain a file system is divided into a number of ! 280: 512-byte ! 281: blocks logically addressed from 0 up to a limit that ! 282: depends on the device. ! 283: There is space in the i-node of each file for 13 device addresses. ! 284: For nonspecial files, ! 285: the first 10 device addresses point at the first ! 286: 10 blocks of the file. ! 287: If the file is larger than 10 blocks, ! 288: the 11 device address points to an ! 289: indirect block containing up to 128 addresses ! 290: of additional blocks in the file. ! 291: Still larger files use the twelfth device address ! 292: of the i-node to point to ! 293: a double-indirect block naming ! 294: 128 indirect blocks, ! 295: each ! 296: pointing to 128 blocks of the file. ! 297: If required, ! 298: the thirteenth device address is ! 299: a triple-indirect block. ! 300: Thus files may conceptually grow to ! 301: [\|(10+128+128\u\s62\s0\d+128\u\s63\s0\d)\*m512\|] bytes. ! 302: Once opened, ! 303: bytes numbered below 5120 can be read with a single ! 304: disk access; ! 305: bytes in the range 5120 to 70,656 ! 306: require two accesses; ! 307: bytes in the range 70,656 ! 308: to 8,459,264 ! 309: require three accesses; ! 310: bytes from there to the ! 311: largest file ! 312: (1,082,201,088) ! 313: require four accesses. ! 314: In practice, ! 315: a device cache mechanism ! 316: (see below) ! 317: proves effective in eliminating ! 318: most of the indirect fetches. ! 319: .PP ! 320: The foregoing discussion applies to ordinary files. ! 321: When an I/O request is made to a file whose i-node indicates that it ! 322: is special, ! 323: the last 12 device address words are immaterial, ! 324: and the first specifies ! 325: an internal ! 326: .IT "device name" , ! 327: which is interpreted as a pair of numbers ! 328: representing, ! 329: respectively, a device type ! 330: and subdevice number. ! 331: The device type indicates which ! 332: system routine will deal with I/O on that device; ! 333: the subdevice number selects, for example, a disk drive ! 334: attached to a particular controller or one of several ! 335: similar terminal interfaces. ! 336: .PP ! 337: In this environment, the implementation of the ! 338: .UL mount ! 339: system call (Section 3.4) is quite straightforward. ! 340: .UL \&mount ! 341: maintains a system table whose ! 342: argument is the i-number and device name of the ! 343: ordinary file specified ! 344: during the ! 345: .UL mount , ! 346: and whose corresponding value is the ! 347: device name of the indicated special file. ! 348: This table is searched for each i-number/device pair ! 349: that turns up while a path name is being scanned ! 350: during an ! 351: .UL open ! 352: or ! 353: .UL create ; ! 354: if a match is found, ! 355: the i-number is replaced by the i-number of the root ! 356: directory ! 357: and the device name is replaced by the table value. ! 358: .PP ! 359: To the user, both reading and writing of files appear to ! 360: be synchronous and unbuffered. ! 361: That is, immediately after ! 362: return from a ! 363: .UL read ! 364: call the data are available; conversely, ! 365: after a ! 366: .UL write ! 367: the user's workspace may be reused. ! 368: In fact, the system maintains a rather complicated ! 369: buffering mechanism that reduces greatly the number ! 370: of I/O operations required to access a file. ! 371: Suppose a ! 372: .UL write ! 373: call is made specifying transmission ! 374: of a single byte. ! 375: The system ! 376: will search its buffers to see ! 377: whether the affected disk block currently resides in main memory; ! 378: if not, it will be read in from the device. ! 379: Then the affected byte is replaced in the buffer and an ! 380: entry is made in a list of blocks to be written. ! 381: The return from the ! 382: .UL write ! 383: call may then take place, ! 384: although the actual I/O may not be completed until a later time. ! 385: Conversely, if a single byte is read, the system determines ! 386: whether the secondary storage block in which the byte is located is already ! 387: in one of the system's buffers; if so, the byte can be returned immediately. ! 388: If not, the block is read into a buffer and the byte picked out. ! 389: .PP ! 390: The system recognizes when ! 391: a program has ! 392: made accesses to ! 393: sequential blocks of a file, ! 394: and asynchronously ! 395: pre-reads the next block. ! 396: This significantly reduces ! 397: the running time of most programs ! 398: while adding little to ! 399: system overhead. ! 400: .PP ! 401: A program that reads or writes files in units of 512 bytes ! 402: has an advantage over a program that reads or writes ! 403: a single byte at a time, but the gain is not immense; ! 404: it comes mainly from the avoidance of system overhead. ! 405: If a program is used rarely or does ! 406: no great volume of I/O, it may quite reasonably ! 407: read and write in units as small as it wishes. ! 408: .PP ! 409: The notion of the i-list is an unusual feature ! 410: of ! 411: .UX . ! 412: In practice, this method of organizing the file system ! 413: has proved quite reliable and easy to deal with. ! 414: To the system itself, one of its strengths is ! 415: the fact that each file has a short, unambiguous name ! 416: related in a simple way to the protection, addressing, ! 417: and other information needed to access the file. ! 418: It also permits a quite simple and rapid ! 419: algorithm for checking the consistency of a file system, ! 420: for example, verification ! 421: that the portions of each device containing useful information ! 422: and those free to be allocated are disjoint and together ! 423: exhaust the space on the device. ! 424: This algorithm is independent ! 425: of the directory hierarchy, because it need only scan ! 426: the linearly organized i-list. ! 427: At the same time the notion of the i-list induces certain ! 428: peculiarities not found in other file system organizations. ! 429: For example, there is the question of who is to be charged ! 430: for the space a file occupies, ! 431: because all directory entries for a file have equal status. ! 432: Charging the owner of a file is unfair in general, ! 433: for one user may create a file, another may link to ! 434: it, and the first user may delete the file. ! 435: The first user is still the owner of the ! 436: file, but it should be charged ! 437: to the second user. ! 438: The simplest reasonably fair algorithm ! 439: seems to be to spread the charges ! 440: equally among users who have links to a file. ! 441: Many installations ! 442: avoid the ! 443: issue by not charging any fees at all.
This archive runs on limited infrastructure. Preserving old code on modern bandwidth. Automated agents are requested to crawl responsibly.