Annotation of 43BSDReno/share/doc/ps2/03.uprog/p4, revision 1.1

1.1     ! root        1: .\"    @(#)p4  6.2 (Berkeley) 5/9/86
        !             2: .\"
        !             3: .NH
        !             4: LOW-LEVEL I/O
        !             5: .PP
        !             6: This section describes the 
        !             7: bottom level of I/O on the
        !             8: .UC UNIX
        !             9: system.
        !            10: The lowest level of I/O in
        !            11: .UC UNIX
        !            12: provides no buffering or any other services;
        !            13: it is in fact a direct entry into the operating system.
        !            14: You are entirely on your own,
        !            15: but on the other hand,
        !            16: you have the most control over what happens.
        !            17: And since the calls and usage are quite simple,
        !            18: this isn't as bad as it sounds.
        !            19: .NH 2
        !            20: File Descriptors
        !            21: .PP
        !            22: In the
        !            23: .UC UNIX
        !            24: operating system,
        !            25: all input and output is done
        !            26: by reading or writing files,
        !            27: because all peripheral devices, even the user's terminal,
        !            28: are files in the file system.
        !            29: This means that a single, homogeneous interface
        !            30: handles all communication between a program and peripheral devices.
        !            31: .PP
        !            32: In the most general case,
        !            33: before reading or writing a file,
        !            34: it is necessary to inform the system
        !            35: of your intent to do so,
        !            36: a process called
        !            37: ``opening'' the file.
        !            38: If you are going to write on a file,
        !            39: it may also be necessary to create it.
        !            40: The system checks your right to do so
        !            41: (Does the file exist?
        !            42: Do you have permission to access it?),
        !            43: and if all is well,
        !            44: returns a small non-negative integer
        !            45: called a
        !            46: .ul
        !            47: file descriptor.
        !            48: Whenever I/O is to be done on the file,
        !            49: the file descriptor is used instead of the name to identify the file.
        !            50: (This is roughly analogous to the use of
        !            51: .UC READ(5,...)
        !            52: and
        !            53: .UC WRITE(6,...)
        !            54: in Fortran.)
        !            55: All
        !            56: information about an open file is maintained by the system;
        !            57: the user program refers to the file
        !            58: only
        !            59: by the file descriptor.
        !            60: .PP
        !            61: The file pointers discussed in section 3
        !            62: are similar in spirit to file descriptors,
        !            63: but file descriptors are more fundamental.
        !            64: A file pointer is a pointer to a structure that contains,
        !            65: among other things, the file descriptor for the file in question.
        !            66: .PP
        !            67: Since input and output involving the user's terminal
        !            68: are so common,
        !            69: special arrangements exist to make this convenient.
        !            70: When the command interpreter (the
        !            71: ``shell'')
        !            72: runs a program,
        !            73: it opens
        !            74: three files, with file descriptors 0, 1, and 2,
        !            75: called the standard input,
        !            76: the standard output, and the standard error output.
        !            77: All of these are normally connected to the terminal,
        !            78: so if a program reads file descriptor 0
        !            79: and writes file descriptors 1 and 2,
        !            80: it can do terminal I/O
        !            81: without worrying about opening the files.
        !            82: .PP
        !            83: If I/O is redirected 
        !            84: to and from files with
        !            85: .UL < 
        !            86: and
        !            87: .UL > ,
        !            88: as in
        !            89: .P1
        !            90: prog <infile >outfile
        !            91: .P2
        !            92: the shell changes the default assignments for file descriptors
        !            93: 0 and 1
        !            94: from the terminal to the named files.
        !            95: Similar observations hold if the input or output is associated with a pipe.
        !            96: Normally file descriptor 2 remains attached to the terminal,
        !            97: so error messages can go there.
        !            98: In all cases,
        !            99: the file assignments are changed by the shell,
        !           100: not by the program.
        !           101: The program does not need to know where its input
        !           102: comes from nor where its output goes,
        !           103: so long as it uses file 0 for input and 1 and 2 for output.
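As a small illustration of the convention just described, here is a sketch that sends ordinary output to file descriptor 1 and a diagnostic to file descriptor 2 without opening anything.
The names put_string and report are made up for this example; only write itself and the descriptor numbers come from the text.

```c
#include <string.h>
#include <unistd.h>

/* Write a NUL-terminated string to descriptor fd.
   Returns whatever write() returns: a byte count, or -1 on error.
   (put_string is a hypothetical helper, not a system routine.) */
long put_string(int fd, const char *s)
{
    return (long)write(fd, s, strlen(s));
}

/* Ordinary output goes to descriptor 1, diagnostics to descriptor 2. */
void report(void)
{
    put_string(1, "result ready\n");         /* standard output */
    put_string(2, "warning: just a demo\n"); /* standard error output */
}
```

If a program that calls report is run as prog >outfile, the first line should land in outfile while the warning still appears on the terminal, because the shell has reassigned only descriptor 1.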
        !           104: .NH 2
        !           105: Read and Write
        !           106: .PP
        !           107: All input and output is done by
        !           108: two functions called
        !           109: .UL read
        !           110: and
        !           111: .UL write .
        !           112: For both, the first argument is a file descriptor.
        !           113: The second argument is a buffer in your program where the data is to
        !           114: come from or go to.
        !           115: The third argument is the number of bytes to be transferred.
        !           116: The calls are
        !           117: .P1
        !           118: n_read = read(fd, buf, n);
        !           119: 
        !           120: n_written = write(fd, buf, n);
        !           121: .P2
        !           122: Each call returns a byte count
        !           123: which is the number of bytes actually transferred.
        !           124: On reading,
        !           125: the number of bytes returned may be less than
        !           126: the number asked for,
        !           127: because fewer than
        !           128: .UL n
        !           129: bytes remained to be read.
        !           130: (When the file is a terminal,
        !           131: .UL read
        !           132: normally reads only up to the next newline,
        !           133: which is generally less than what was requested.)
        !           134: A return value of zero bytes implies end of file,
        !           135: and
        !           136: .UL -1
        !           137: indicates an error of some sort.
        !           138: For writing, the returned value is the number of bytes
        !           139: actually written;
        !           140: it is generally an error if this isn't equal
        !           141: to the number supposed to be written.
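The return-value rules above can be wrapped in a small helper.
The following is only a sketch: write_all is a made-up name, and retrying after a short count is an assumption that suits pipes and terminals on current systems; the text itself simply treats a short count as an error.

```c
#include <unistd.h>

/* Write exactly n bytes from buf to fd.
   Returns 0 if everything was written, -1 on error.
   (write_all is a hypothetical helper; it does not retry
   a write that was interrupted by a signal.) */
int write_all(int fd, const char *buf, size_t n)
{
    while (n > 0) {
        ssize_t w = write(fd, buf, n);

        if (w <= 0)
            return -1;      /* error, or no progress: give up */
        buf += w;           /* skip the bytes already written */
        n -= (size_t)w;
    }
    return 0;
}
```

A caller then has a single test to make, as in if (write_all(1, buf, n) == -1), instead of comparing counts after every call.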
        !           142: .PP
        !           143: The number of bytes to be read or written is quite arbitrary.
        !           144: The two most common values are 
        !           145: 1,
        !           146: which means one character at a time
        !           147: (``unbuffered''),
        !           148: and
        !           149: 512,
        !           150: which corresponds to a physical blocksize on many peripheral devices.
        !           151: This latter size will be most efficient,
        !           152: but even character-at-a-time I/O
        !           153: is not inordinately expensive.
        !           154: .PP
        !           155: Putting these facts together,
        !           156: we can write a simple program to copy
        !           157: its input to its output.
        !           158: This program will copy anything to anything,
        !           159: since the input and output can be redirected to any file or device.
        !           160: .P1
        !           161: #define        BUFSIZE 512     /* best size for PDP-11 UNIX */
        !           162: 
        !           163: main() /* copy input to output */
        !           164: {
        !           165:        char    buf[BUFSIZE];
        !           166:        int     n;
        !           167: 
        !           168:        while ((n = read(0, buf, BUFSIZE)) > 0)
        !           169:                write(1, buf, n);
        !           170:        exit(0);
        !           171: }
        !           172: .P2
        !           173: If the file size is not a multiple of
        !           174: .UL BUFSIZE ,
        !           175: some 
        !           176: .UL read
        !           177: will return a smaller number of bytes
        !           178: to be written by
        !           179: .UL write ;
        !           180: the next call to 
        !           181: .UL read
        !           182: after that
        !           183: will return zero.
        !           184: .PP
        !           185: It is instructive to see how
        !           186: .UL read
        !           187: and
        !           188: .UL write
        !           189: can be used to construct
        !           190: higher level routines like
        !           191: .UL getchar ,
        !           192: .UL putchar ,
        !           193: etc.
        !           194: For example,
        !           195: here is a version of
        !           196: .UL getchar
        !           197: which does unbuffered input.
        !           198: .P1
        !           199: #define        CMASK   0377    /* for making char's > 0 */
        !           200: 
        !           201: getchar()      /* unbuffered single character input */
        !           202: {
        !           203:        char c;
        !           204: 
        !           205:        return((read(0, &c, 1) > 0) ? c & CMASK : EOF);
        !           206: }
        !           207: .P2
        !           208: .UL c
        !           209: .ul
        !           210: must
        !           211: be declared
        !           212: .UL char ,
        !           213: because
        !           214: .UL read
        !           215: accepts a character pointer.
        !           216: The character being returned must be masked with
        !           217: .UL 0377
        !           218: to ensure that it is positive;
        !           219: otherwise sign extension may make it negative.
        !           220: (The constant
        !           221: .UL 0377
        !           222: is appropriate for the
        !           223: .UC PDP -11
        !           224: but not necessarily for other machines.)
        !           225: .PP
        !           226: The second version of
        !           227: .UL getchar
        !           228: does input in big chunks,
        !           229: and hands out the characters one at a time.
        !           230: .P1
        !           231: #define        CMASK   0377    /* for making char's > 0 */
        !           232: #define        BUFSIZE 512
        !           233: 
        !           234: getchar()      /* buffered version */
        !           235: {
        !           236:        static char     buf[BUFSIZE];
        !           237:        static char     *bufp = buf;
        !           238:        static int      n = 0;
        !           239: 
        !           240:        if (n == 0) {   /* buffer is empty */
        !           241:                n = read(0, buf, BUFSIZE);
        !           242:                bufp = buf;
        !           243:        }
        !           244:        return((--n >= 0) ? *bufp++ & CMASK : EOF);
        !           245: }
        !           246: .P2
        !           247: .NH 2
        !           248: Open, Creat, Close, Unlink
        !           249: .PP
        !           250: Other than the default
        !           251: standard input, output and error files,
        !           252: you must explicitly open files in order to
        !           253: read or write them.
        !           254: There are two system entry points for this,
        !           255: .UL open
        !           256: and
        !           257: .UL creat 
        !           258: [sic].
        !           259: .PP
        !           260: .UL open
        !           261: is rather like the
        !           262: .UL  fopen
        !           263: discussed in the previous section,
        !           264: except that instead of returning a file pointer,
        !           265: it returns a file descriptor,
        !           266: which is just an
        !           267: .UL int .
        !           268: .P1
        !           269: int fd;
        !           270: 
        !           271: fd = open(name, rwmode);
        !           272: .P2
        !           273: As with
        !           274: .UL fopen ,
        !           275: the
        !           276: .UL name
        !           277: argument
        !           278: is a character string corresponding to the external file name.
        !           279: The access mode argument
        !           280: is different, however:
        !           281: .UL rwmode
        !           282: is 0 for read, 1 for write, and 2 for read and write access.
        !           283: .UL open
        !           284: returns
        !           285: .UL -1
        !           286: if any error occurs;
        !           287: otherwise it returns a valid file descriptor.
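Here is a sketch of open in use, checking for the -1 return before reading.
It assumes a system that supplies <fcntl.h>, where the symbolic names O_RDONLY, O_WRONLY and O_RDWR stand for the read, write, and read-write modes described above; count_bytes is a name invented for the example.

```c
#include <fcntl.h>
#include <unistd.h>

/* Count the bytes in the named file.
   Returns -1 if the file cannot be opened or a read fails.
   (count_bytes is a hypothetical helper.) */
long count_bytes(const char *name)
{
    char buf[512];
    long total = 0;
    ssize_t n;
    int fd = open(name, O_RDONLY);   /* read access, rwmode 0 in the text */

    if (fd == -1)
        return -1;                   /* no such file, no permission, ... */
    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += n;
    close(fd);
    return (n < 0) ? -1 : total;
}
```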
        !           288: .PP
        !           289: It is an error to 
        !           290: try to
        !           291: .UL open
        !           292: a file that does not exist.
        !           293: The entry point
        !           294: .UL creat
        !           295: is provided to create new files,
        !           296: or to re-write old ones.
        !           297: .P1
        !           298: fd = creat(name, pmode);
        !           299: .P2
        !           300: returns a file descriptor
        !           301: if it was able to create the file
        !           302: called
        !           303: .UL name ,
        !           304: and
        !           305: .UL -1
        !           306: if not.
        !           307: If the file
        !           308: already exists,
        !           309: .UL creat
        !           310: will truncate it to zero length;
        !           311: it is not an error to
        !           312: .UL creat
        !           313: a file that already exists.
        !           314: .PP
        !           315: If the file is brand new,
        !           316: .UL creat
        !           317: creates it with the
        !           318: .ul
        !           319: protection mode 
        !           320: specified by
        !           321: the
        !           322: .UL pmode
        !           323: argument.
        !           324: In the
        !           325: .UC UNIX
        !           326: file system,
        !           327: there are nine bits of protection information
        !           328: associated with a file,
        !           329: controlling read, write and execute permission for
        !           330: the owner of the file,
        !           331: for the owner's group,
        !           332: and for all others.
        !           333: Thus a three-digit octal number
        !           334: is most convenient for specifying the permissions.
        !           335: For example,
        !           336: 0755
        !           337: specifies read, write and execute permission for the owner,
        !           338: and read and execute permission for the group and everyone else.
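To make the permission bits concrete, here is a sketch that creates a file with mode 0644 and then reads the nine bits back.
The helper names save_text and protection_bits are invented, and stat is used only to inspect the result; note also that on current systems the process's file creation mask (umask) may clear some of the requested bits, a detail beyond what the text covers.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Create (or truncate) name with mode 0644 and write n bytes into it.
   Returns 0 on success, -1 on failure.  (Hypothetical helper.) */
int save_text(const char *name, const char *text, size_t n)
{
    int ok;
    int fd = creat(name, 0644);   /* rw- owner, r-- group, r-- others */

    if (fd == -1)
        return -1;
    ok = (write(fd, text, n) == (ssize_t)n);
    close(fd);
    return ok ? 0 : -1;
}

/* Return the nine protection bits of name, or -1 if stat fails. */
int protection_bits(const char *name)
{
    struct stat st;

    if (stat(name, &st) == -1)
        return -1;
    return (int)(st.st_mode & 0777);
}
```

Calling save_text twice on the same name leaves only the second contents behind, since the second creat truncates the file before anything is written.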
        !           339: .PP
        !           340: To illustrate,
        !           341: here is a simplified version of
        !           342: the
        !           343: .UC UNIX
        !           344: utility
        !           345: .IT cp ,
        !           346: a program which copies one file to another.
        !           347: (The main simplification is that our version
        !           348: copies only one file,
        !           349: and does not permit the second argument
        !           350: to be a directory.)
        !           351: .P1
        !           352: #define NULL 0
        !           353: #define BUFSIZE 512
        !           354: #define PMODE 0644 /* RW for owner, R for group, others */
        !           355: 
        !           356: main(argc, argv)       /* cp: copy f1 to f2 */
        !           357: int argc;
        !           358: char *argv[];
        !           359: {
        !           360:        int     f1, f2, n;
        !           361:        char    buf[BUFSIZE];
        !           362: 
        !           363:        if (argc != 3)
        !           364:                error("Usage: cp from to", NULL);
        !           365:        if ((f1 = open(argv[1], 0)) == -1)
        !           366:                error("cp: can't open %s", argv[1]);
        !           367:        if ((f2 = creat(argv[2], PMODE)) == -1)
        !           368:                error("cp: can't create %s", argv[2]);
        !           369: 
        !           370:        while ((n = read(f1, buf, BUFSIZE)) > 0)
        !           371:                if (write(f2, buf, n) != n)
        !           372:                        error("cp: write error", NULL);
        !           373:        exit(0);
        !           374: }
        !           375: .P2
        !           376: .P1
        !           377: error(s1, s2)  /* print error message and die */
        !           378: char *s1, *s2;
        !           379: {
        !           380:        printf(s1, s2);
        !           381:        printf("\en");
        !           382:        exit(1);
        !           383: }
        !           384: .P2
        !           385: .PP
        !           386: As we said earlier,
        !           387: there is a limit (typically 15-25)
        !           388: on the number of files which a program
        !           389: may have open simultaneously.
        !           390: Accordingly, any program which intends to process
        !           391: many files must be prepared to re-use
        !           392: file descriptors.
        !           393: The routine
        !           394: .UL close
        !           395: breaks the connection between a file descriptor
        !           396: and an open file,
        !           397: and frees the
        !           398: file descriptor for use with some other file.
        !           399: Termination of a program
        !           400: via
        !           401: .UL exit
        !           402: or return from the main program closes all open files.
        !           403: .PP
        !           404: The function
        !           405: .UL unlink(filename)
        !           406: removes the file
        !           407: .UL filename
        !           408: from the file system.
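The life cycle of a scratch file ties these calls together: create it, close the descriptor, then remove the name.
This is a sketch only; scratch_cycle is a made-up name, and the check against ENOENT assumes the usual error number for a missing file.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Create a scratch file, close it, remove it, and confirm it is gone.
   Returns 0 if every step behaved as expected, -1 otherwise.
   (scratch_cycle is a hypothetical helper.) */
int scratch_cycle(const char *name)
{
    int fd = creat(name, 0600);

    if (fd == -1)
        return -1;
    if (close(fd) == -1)          /* the descriptor is free for re-use */
        return -1;
    if (unlink(name) == -1)       /* remove the name from the file system */
        return -1;
    fd = open(name, O_RDONLY);    /* should now fail: the file is gone */
    if (fd != -1) {
        close(fd);
        return -1;
    }
    return (errno == ENOENT) ? 0 : -1;
}
```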
        !           409: .NH 2
        !           410: Random Access \(em Seek and Lseek
        !           411: .PP
        !           412: File I/O is normally sequential:
        !           413: each
        !           414: .UL read
        !           415: or
        !           416: .UL write
        !           417: takes place at a position in the file
        !           418: right after the previous one.
        !           419: When necessary, however,
        !           420: a file can be read or written in any arbitrary order.
        !           421: The
        !           422: system call
        !           423: .UL lseek
        !           424: provides a way to move around in
        !           425: a file without actually reading
        !           426: or writing:
        !           427: .P1
        !           428: lseek(fd, offset, origin);
        !           429: .P2
        !           430: forces the current position in the file
        !           431: whose descriptor is
        !           432: .UL fd
        !           433: to move to position
        !           434: .UL offset ,
        !           435: which is taken relative to the location
        !           436: specified by
        !           437: .UL origin .
        !           438: Subsequent reading or writing will begin at that position.
        !           439: .UL offset
        !           440: is
        !           441: a
        !           442: .UL long ;
        !           443: .UL fd
        !           444: and
        !           445: .UL origin
        !           446: are
        !           447: .UL int 's.
        !           448: .UL origin
        !           449: can be 0, 1, or 2 to specify that 
        !           450: .UL offset
        !           451: is to be
        !           452: measured from
        !           453: the beginning, from the current position, or from the
        !           454: end of the file respectively.
        !           455: For example,
        !           456: to append to a file,
        !           457: seek to the end before writing:
        !           458: .P1
        !           459: lseek(fd, 0L, 2);
        !           460: .P2
        !           461: To get back to the beginning (``rewind''),
        !           462: .P1
        !           463: lseek(fd, 0L, 0);
        !           464: .P2
        !           465: Notice the
        !           466: .UL 0L
        !           467: argument;
        !           468: it could also be written as
        !           469: .UL (long)\ 0 .
        !           470: .PP
        !           471: With 
        !           472: .UL lseek ,
        !           473: it is possible to treat files more or less like large arrays,
        !           474: at the price of slower access.
        !           475: For example, the following simple function reads any number of bytes
        !           476: from any arbitrary place in a file.
        !           477: .P1
        !           478: get(fd, pos, buf, n) /* read n bytes from position pos */
        !           479: int fd, n;
        !           480: long pos;
        !           481: char *buf;
        !           482: {
        !           483:        lseek(fd, pos, 0);      /* get to pos */
        !           484:        return(read(fd, buf, n));
        !           485: }
        !           486: .P2
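One more use of lseek is worth a sketch: finding the size of a file without reading it.
This relies on the fact that on current systems lseek returns the resulting offset (or -1 on error), and it uses the symbolic names SEEK_SET, SEEK_CUR and SEEK_END from <unistd.h> for the origins 0, 1 and 2.
The function names are invented for the example.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Return the size of the file open on fd, leaving the current
   position where it was; return -1 on error (for example, a pipe).
   (file_size is a hypothetical helper.) */
long file_size(int fd)
{
    off_t here, end;

    here = lseek(fd, 0, SEEK_CUR);      /* origin 1: where are we now? */
    if (here == (off_t)-1)
        return -1;
    end = lseek(fd, 0, SEEK_END);       /* origin 2: offset of the end */
    if (end == (off_t)-1)
        return -1;
    if (lseek(fd, here, SEEK_SET) == (off_t)-1)   /* origin 0: go back */
        return -1;
    return (long)end;
}

/* Convenience wrapper: size of a file given by name, or -1. */
long named_file_size(const char *name)
{
    long size;
    int fd = open(name, O_RDONLY);

    if (fd == -1)
        return -1;
    size = file_size(fd);
    close(fd);
    return size;
}
```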
        !           487: .PP
        !           488: In pre-version 7
        !           489: .UC UNIX ,
        !           490: the basic entry point to the I/O system
        !           491: is called
        !           492: .UL seek .
        !           493: .UL seek
        !           494: is identical to
        !           495: .UL lseek ,
        !           496: except that its
        !           497: .UL  offset 
        !           498: argument is an
        !           499: .UL int
        !           500: rather than a
        !           501: .UL long .
        !           502: Accordingly,
        !           503: since
        !           504: .UC PDP -11
        !           505: integers have only 16 bits,
        !           506: the
        !           507: .UL offset
        !           508: specified
        !           509: for
        !           510: .UL seek
        !           511: is limited to 65,535;
        !           512: for this reason,
        !           513: .UL origin
        !           514: values of 3, 4, 5 cause
        !           515: .UL seek
        !           516: to multiply the given offset by 512
        !           517: (the number of bytes in one physical block)
        !           518: and then interpret
        !           519: .UL origin
        !           520: as if it were 0, 1, or 2 respectively.
        !           521: Thus to get to an arbitrary place in a large file
        !           522: requires two seeks, first one which selects
        !           523: the block, then one which
        !           524: has
        !           525: .UL origin
        !           526: equal to 1 and moves to the desired byte within the block.
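The arithmetic behind the two-step seek is easy to sketch.
Since seek itself is not available on current systems, the code below only computes the two offsets; split_position and struct seek_pair are names invented for the illustration, and the two calls appear in a comment as the text describes them.

```c
#define BLOCK 512   /* bytes in one physical block */

/* The two offsets needed to reach byte position pos with the old seek. */
struct seek_pair {
    int block;      /* offset for the first seek, origin 3 */
    int byte;       /* offset for the second seek, origin 1 */
};

/* Split a byte position into a block number and a byte within it.
   (split_position is a hypothetical helper.)
   The intended use, following the text, would be:
       seek(fd, p.block, 3);    move to the start of the block
       seek(fd, p.byte, 1);     then forward to the byte */
struct seek_pair split_position(long pos)
{
    struct seek_pair p;

    p.block = (int)(pos / BLOCK);
    p.byte = (int)(pos % BLOCK);
    return p;
}
```

For position 100000, for instance, this gives block 195 and byte 160, since 195 times 512 is 99840.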
        !           527: .NH 2
        !           528: Error Processing
        !           529: .PP
        !           530: The routines discussed in this section,
        !           531: and in fact all the routines which are direct entries into the system,
        !           532: can incur errors.
        !           533: Usually they indicate an error by returning a value of \-1.
        !           534: Sometimes it is nice to know what sort of error occurred;
        !           535: for this purpose all these routines, when appropriate,
        !           536: leave an error number in the external cell
        !           537: .UL errno .
        !           538: The meanings of the various error numbers are
        !           539: listed
        !           540: in the introduction to Section II
        !           541: of the
        !           542: .I
        !           543: .UC UNIX
        !           544: Programmer's Manual,
        !           545: .R
        !           546: so your program can, for example, determine if
        !           547: an attempt to open a file failed because it did not exist
        !           548: or because the user lacked permission to read it.
        !           549: Perhaps more commonly,
        !           550: you may want to print out the
        !           551: reason for failure.
        !           552: The routine
        !           553: .UL perror
        !           554: will print a message associated with the value
        !           555: of
        !           556: .UL errno ;
        !           557: more generally,
        !           558: .UL sys\_errlist
        !           559: is an array of character strings which can be indexed
        !           560: by
        !           561: .UL errno
        !           562: and printed by your program.
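Here is a sketch of error reporting along these lines.
It uses strerror, the standard C library routine that returns the message for an error number, as an assumption in place of indexing the message array directly; explain_open is a name invented for the example.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Try to open name for reading.
   On failure, print the reason on the standard error output and
   return the error number; on success, close the file and return 0.
   (explain_open is a hypothetical helper.) */
int explain_open(const char *name)
{
    int saved;
    int fd = open(name, O_RDONLY);

    if (fd == -1) {
        saved = errno;      /* save it: later calls may change errno */
        fprintf(stderr, "%s: %s\n", name, strerror(saved));
        return saved;
    }
    close(fd);
    return 0;
}
```

A caller can compare the result with the symbolic names in <errno.h>, such as ENOENT for a missing file or EACCES for a permission failure; calling perror(name) right after the failed open would print much the same message.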
