Annotation of 43BSDReno/share/doc/ps1/08.ipc/5.t, revision 1.1.1.1

1.1       root        1: .\" Copyright (c) 1986 The Regents of the University of California.
                      2: .\" All rights reserved.
                      3: .\"
                      4: .\" Redistribution and use in source and binary forms are permitted
                      5: .\" provided that the above copyright notice and this paragraph are
                      6: .\" duplicated in all such forms and that any documentation,
                      7: .\" advertising materials, and other materials related to such
                      8: .\" distribution and use acknowledge that the software was developed
                      9: .\" by the University of California, Berkeley.  The name of the
                     10: .\" University may not be used to endorse or promote products derived
                     11: .\" from this software without specific prior written permission.
                     12: .\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
                     13: .\" IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
                     14: .\" WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
                     15: .\"
                     16: .\"    @(#)5.t 1.6 (Berkeley) 3/7/89
                     17: .\"
                     18: .\".ds RH "Advanced Topics
                     19: .bp
                     20: .nr H1 5
                     21: .nr H2 0
                     22: .LG
                     23: .B
                     24: .ce
                     25: 5. ADVANCED TOPICS
                     26: .sp 2
                     27: .R
                     28: .NL
                     29: .PP
                     30: A number of facilities have yet to be discussed.  For most users
                     31: of the IPC the mechanisms already
                     32: described will suffice in constructing distributed
                     33: applications.  However, others will find the need to utilize some
                     34: of the features which we consider in this section.
                     35: .NH 2
                     36: Out of band data
                     37: .PP
                     38: The stream socket abstraction includes the notion of \*(lqout
                     39: of band\*(rq data.  Out of band data is a logically independent 
                     40: transmission channel associated with each pair of connected
                     41: stream sockets.  Out of band data is delivered to the user
                     42: independently of normal data.
                     43: The abstraction defines that the out of band data facilities
                     44: must support the reliable delivery of at least one
out of band message at a time.  This message must be able to carry at least one
byte of data, and at least one message may be pending delivery
to the user at any one time.  For communications protocols which
                     48: support only in-band signaling (i.e. the urgent data is
                     49: delivered in sequence with the normal data), the system normally extracts
                     50: the data from the normal data stream and stores it separately.
                     51: This allows users to choose between receiving the urgent data
                     52: in order and receiving it out of sequence without having to
                     53: buffer all the intervening data.  It is possible
                     54: to ``peek'' (via MSG_PEEK) at out of band data.
If the socket has a process group, a SIGURG signal is generated
when the protocol is notified of the existence of out of band data.
                     57: A process can set the process group
                     58: or process id to be informed by the SIGURG signal via the
                     59: appropriate \fIfcntl\fP call, as described below for
                     60: SIGIO.
                     61: If multiple sockets may have out of band data awaiting
                     62: delivery, a \fIselect\fP call for exceptional conditions
                     63: may be used to determine those sockets with such data pending.
Neither the signal nor the select indicates the actual arrival
of the out-of-band data, but only notification that it is pending.
                     66: .PP
                     67: In addition to the information passed, a logical mark is placed in
                     68: the data stream to indicate the point at which the out
                     69: of band data was sent.  The remote login and remote shell
                     70: applications use this facility to propagate signals between
                     71: client and server processes.  When a signal
flushes any pending output from the remote process(es), all
                     73: data up to the mark in the data stream is discarded.
                     74: .PP
                     75: To send an out of band message the MSG_OOB flag is supplied to
a \fIsend\fP or \fIsendto\fP call,
                     77: while to receive out of band data MSG_OOB should be indicated
                     78: when performing a \fIrecvfrom\fP or \fIrecv\fP call.
                     79: To find out if the read pointer is currently pointing at
                     80: the mark in the data stream, the SIOCATMARK ioctl is provided:
                     81: .DS
                     82: ioctl(s, SIOCATMARK, &yes);
                     83: .DE
                     84: If \fIyes\fP is a 1 on return, the next read will return data
                     85: after the mark.  Otherwise (assuming out of band data has arrived), 
                     86: the next read will provide data sent by the client prior
                     87: to transmission of the out of band signal.  The routine used
                     88: in the remote login process to flush output on receipt of an
                     89: interrupt or quit signal is shown in Figure 5.
                     90: It reads the normal data up to the mark (to discard it),
                     91: then reads the out-of-band byte.
                     92: .KF
                     93: .DS
                     94: #include <sys/ioctl.h>
                     95: #include <sys/file.h>
                     96:  ...
                     97: oob()
                     98: {
                     99:        int out = FWRITE, mark;
                    100:        char waste[BUFSIZ];
                    101: 
                    102:        /* flush local terminal output */
                    103:        ioctl(1, TIOCFLUSH, (char *)&out);
                    104:        for (;;) {
                    105:                if (ioctl(rem, SIOCATMARK, &mark) < 0) {
                    106:                        perror("ioctl");
                    107:                        break;
                    108:                }
                    109:                if (mark)
                    110:                        break;
                    111:                (void) read(rem, waste, sizeof (waste));
                    112:        }
                    113:        if (recv(rem, &mark, 1, MSG_OOB) < 0) {
                    114:                perror("recv");
                    115:                ...
                    116:        }
                    117:        ...
                    118: }
                    119: .DE
                    120: .ce
                    121: Figure 5.  Flushing terminal I/O on receipt of out of band data.
                    122: .sp
                    123: .KE
                    124: .PP
                    125: A process may also read or peek at the out-of-band data
                    126: without first reading up to the mark.
                    127: This is more difficult when the underlying protocol delivers
                    128: the urgent data in-band with the normal data, and only sends
                    129: notification of its presence ahead of time (e.g., the TCP protocol
                    130: used to implement streams in the Internet domain).
                    131: With such protocols, the out-of-band byte may not yet have arrived
                    132: when a \fIrecv\fP is done with the MSG_OOB flag.
                    133: In that case, the call will return an error of EWOULDBLOCK.
                    134: Worse, there may be enough in-band data in the input buffer
                    135: that normal flow control prevents the peer from sending the urgent data
                    136: until the buffer is cleared.
                    137: The process must then read enough of the queued data
                    138: that the urgent data may be delivered.
                    139: .PP
                    140: Certain programs that use multiple bytes of urgent data and must
                    141: handle multiple urgent signals (e.g., \fItelnet\fP\|(1C))
                    142: need to retain the position of urgent data within the stream.
                    143: This treatment is available as a socket-level option, SO_OOBINLINE;
                    144: see \fIsetsockopt\fP\|(2) for usage.
                    145: With this option, the position of urgent data (the \*(lqmark\*(rq)
                    146: is retained, but the urgent data immediately follows the mark
                    147: within the normal data stream returned without the MSG_OOB flag.
                    148: Reception of multiple urgent indications causes the mark to move,
                    149: but no out-of-band data are lost.
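.PP
As a minimal sketch of how the SO_OOBINLINE option described above might be
set and checked, consider the following.
The helper names \fIoob_inline_on\fP and \fIoob_inline_enabled\fP are
hypothetical and do not come from the original text, and the use of
\fIsocklen_t\fP assumes a modern POSIX system rather than 4.3BSD.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not part of the original text): ask the system to
 * leave urgent data in the normal data stream of socket s.
 * Returns 0 on success, -1 on failure with errno set by setsockopt.
 */
int
oob_inline_on(int s)
{
	int on = 1;

	return setsockopt(s, SOL_SOCKET, SO_OOBINLINE, &on, sizeof (on));
}

/*
 * Hypothetical helper: report whether SO_OOBINLINE is set on socket s.
 * Returns 1 if set, 0 if clear, -1 on failure.
 */
int
oob_inline_enabled(int s)
{
	int val = 0;
	socklen_t len = sizeof (val);

	if (getsockopt(s, SOL_SOCKET, SO_OOBINLINE, &val, &len) < 0)
		return -1;
	return val != 0;
}
```

.PP
Once the option is on, a reader would normally consume data with plain
\fIread\fP or \fIrecv\fP calls and use the SIOCATMARK ioctl to learn when
the next byte is the urgent one.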
                    150: .NH 2
                    151: Non-Blocking Sockets
                    152: .PP
                    153: It is occasionally convenient to make use of sockets
                    154: which do not block; that is, I/O requests which
                    155: cannot complete immediately and
                    156: would therefore cause the process to be suspended awaiting completion are
                    157: not executed, and an error code is returned.
                    158: Once a socket has been created via
                    159: the \fIsocket\fP call, it may be marked as non-blocking
                    160: by \fIfcntl\fP as follows:
                    161: .DS
#include <fcntl.h>
 ...
int    s;
 ...
s = socket(AF_INET, SOCK_STREAM, 0);
 ...
if (fcntl(s, F_SETFL, FNDELAY) < 0) {
       perror("fcntl F_SETFL, FNDELAY");
       exit(1);
}
 ...
                    173: .DE
                    174: .PP
                    175: When performing non-blocking I/O on sockets, one must be
                    176: careful to check for the error EWOULDBLOCK (stored in the
                    177: global variable \fIerrno\fP), which occurs when
                    178: an operation would normally block, but the socket it
                    179: was performed on is marked as non-blocking.
                    180: In particular, \fIaccept\fP, \fIconnect\fP, \fIsend\fP, \fIrecv\fP,
                    181: \fIread\fP, and \fIwrite\fP can
                    182: all return EWOULDBLOCK, and processes should be prepared
                    183: to deal with such return codes.
                    184: If an operation such as a \fIsend\fP cannot be done in its entirety,
                    185: but partial writes are sensible (for example, when using a stream socket),
                    186: the data that can be sent immediately will be processed,
                    187: and the return value will indicate the amount actually sent.
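.PP
The handling of EWOULDBLOCK described above can be sketched roughly as follows.
This is an illustration under stated assumptions, not code from the original
text: the helper names \fIset_nonblocking\fP and \fIrecv_nowait\fP and the
sentinel value \-2 are hypothetical, and O_NONBLOCK is used as the POSIX
spelling of the historical FNDELAY flag.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: mark descriptor fd as non-blocking while keeping
 * its other status flags.  O_NONBLOCK is the POSIX name for what the
 * text calls FNDELAY (an assumption about the target system).
 * Returns 0 on success, -1 on failure.
 */
int
set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL, 0);

	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/*
 * Hypothetical helper: try to receive without suspending the process.
 * Returns the byte count (0 means the peer closed the connection),
 * -2 if no data is available yet, or -1 on any other error.
 */
ssize_t
recv_nowait(int s, void *buf, size_t len)
{
	ssize_t n = recv(s, buf, len, 0);

	if (n < 0 && (errno == EWOULDBLOCK || errno == EAGAIN))
		return -2;	/* would have blocked; retry later */
	return n;
}
```

.PP
Reading the flags with F_GETFL before setting them is a design choice made
here so that other flags, such as FASYNC, are not cleared by accident.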
                    188: .NH 2
                    189: Interrupt driven socket I/O
                    190: .PP
                    191: The SIGIO signal allows a process to be notified
                    192: via a signal when a socket (or more generally, a file
                    193: descriptor) has data waiting to be read.  Use of
                    194: the SIGIO facility requires three steps:  First,
                    195: the process must set up a SIGIO signal handler
                    196: by use of the \fIsignal\fP or \fIsigvec\fP calls.  Second,
                    197: it must set the process id or process group id which is to receive
                    198: notification of pending input to its own process id,
                    199: or the process group id of its process group (note that
                    200: the default process group of a socket is group zero).
                    201: This is accomplished by use of an \fIfcntl\fP call.
                    202: Third, it must enable asynchronous notification of pending I/O requests
                    203: with another \fIfcntl\fP call.  Sample code to
                    204: allow a given process to receive information on
                    205: pending I/O requests as they occur for a socket \fIs\fP
                    206: is given in Figure 6.  With the addition of a handler for SIGURG,
                    207: this code can also be used to prepare for receipt of SIGURG signals.
                    208: .KF
                    209: .DS
                    210: #include <fcntl.h>
                    211:  ...
                    212: int    io_handler();
                    213:  ...
                    214: signal(SIGIO, io_handler);
                    215: 
                    216: /* Set the process receiving SIGIO/SIGURG signals to us */
                    217: 
                    218: if (fcntl(s, F_SETOWN, getpid()) < 0) {
                    219:        perror("fcntl F_SETOWN");
                    220:        exit(1);
                    221: }
                    222: 
                    223: /* Allow receipt of asynchronous I/O signals */
                    224: 
                    225: if (fcntl(s, F_SETFL, FASYNC) < 0) {
                    226:        perror("fcntl F_SETFL, FASYNC");
                    227:        exit(1);
                    228: }
                    229: .DE
                    230: .ce
                    231: Figure 6.  Use of asynchronous notification of I/O requests.
                    232: .sp
                    233: .KE
                    234: .NH 2
                    235: Signals and process groups
                    236: .PP
                    237: Due to the existence of the SIGURG and SIGIO signals each socket has an
                    238: associated process number, just as is done for terminals.
                    239: This value is initialized to zero,
                    240: but may be redefined at a later time with the F_SETOWN
                    241: \fIfcntl\fP, such as was done in the code above for SIGIO.
                    242: To set the socket's process id for signals, positive arguments
                    243: should be given to the \fIfcntl\fP call.  To set the socket's
                    244: process group for signals, negative arguments should be 
                    245: passed to \fIfcntl\fP.  Note that the process number indicates
                    246: either the associated process id or the associated process
                    247: group; it is impossible to specify both at the same time.
                    248: A similar \fIfcntl\fP, F_GETOWN, is available for determining the
                    249: current process number of a socket.
                    250: .PP
                    251: Another signal which is useful when constructing server processes
                    252: is SIGCHLD.  This signal is delivered to a process when any
child processes have changed state.  Normally servers use
the signal to \*(lqreap\*(rq child processes that have exited,
without explicitly awaiting their termination
or periodically polling for exit status.
                    257: For example, the remote login server loop shown in Figure 2
                    258: may be augmented as shown in Figure 7.
                    259: .KF
                    260: .DS
int reaper();
 ...
signal(SIGCHLD, reaper);
listen(f, 5);
for (;;) {
       int g, len = sizeof (from);

       g = accept(f, (struct sockaddr *)&from, &len);
       if (g < 0) {
               if (errno != EINTR)
                       syslog(LOG_ERR, "rlogind: accept: %m");
               continue;
       }
       ...
}
 ...
#include <sys/wait.h>
reaper()
{
       union wait status;

       while (wait3(&status, WNOHANG, 0) > 0)
               ;
}
                    285: .DE
                    286: .sp
                    287: .ce
                    288: Figure 7.  Use of the SIGCHLD signal.
                    289: .sp
                    290: .KE
                    291: .PP
                    292: If the parent server process fails to reap its children,
                    293: a large number of \*(lqzombie\*(rq processes may be created.
                    294: .NH 2
                    295: Pseudo terminals
                    296: .PP
                    297: Many programs will not function properly without a terminal
                    298: for standard input and output.  Since sockets do not provide
                    299: the semantics of terminals,
                    300: it is often necessary to have a process communicating over
the network do so through a \fIpseudo-terminal\fP.  A pseudo-terminal
is actually a pair of devices, master and slave,
                    303: which allow a process to serve as an active agent in communication
                    304: between processes and users.  Data written on the slave side
                    305: of a pseudo-terminal is supplied as input to a process reading
                    306: from the master side, while data written on the master side are
                    307: processed as terminal input for the slave.
                    308: In this way, the process manipulating
                    309: the master side of the pseudo-terminal has control over the
                    310: information read and written on the slave side
                    311: as if it were manipulating the keyboard and reading the screen
                    312: on a real terminal.
                    313: The purpose of this abstraction is to
                    314: preserve terminal semantics over a network connection\(em
                    315: that is, the slave side appears as a normal terminal to
                    316: any process reading from or writing to it.
                    317: .PP
                    318: For example, the remote
                    319: login server uses pseudo-terminals for remote login sessions.
                    320: A user logging in to a machine across the network is provided
                    321: a shell with a slave pseudo-terminal as standard input, output,
                    322: and error.  The server process then handles the communication
                    323: between the programs invoked by the remote shell and the user's
                    324: local client process.
                    325: When a user sends a character that generates an interrupt
                    326: on the remote machine that flushes terminal output,
                    327: the pseudo-terminal generates a control message for the server process.
The server then sends an out of band message
to the client process to signal a flush of data at the real terminal
and of the intervening data buffered in the network.
                    331: .PP
                    332: Under 4.3BSD, the name of the slave side of a pseudo-terminal is of the form
                    333: \fI/dev/ttyxy\fP, where \fIx\fP is a single letter
                    334: starting at `p' and continuing to `t'.
                    335: \fIy\fP is a hexadecimal digit (i.e., a single
                    336: character in the range 0 through 9 or `a' through `f').
                    337: The master side of a pseudo-terminal is \fI/dev/ptyxy\fP,
                    338: where \fIx\fP and \fIy\fP correspond to the
                    339: slave side of the pseudo-terminal.
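.PP
The naming rule above can be illustrated with a small sketch that builds both
device names from the letter \fIx\fP and the hexadecimal digit \fIy\fP.
The helper name \fIpty_names\fP is hypothetical and is not part of any system
library.

```c
#include <string.h>

/*
 * Hypothetical helper: build the 4.3BSD device names for the
 * pseudo-terminal pair identified by letter x ('p' through 't') and
 * index y (0 through 15).  master and slave must each have room for
 * at least 11 bytes.  Returns 0 on success, -1 if x or y is out of range.
 */
int
pty_names(char x, int y, char *master, char *slave)
{
	static const char hex[] = "0123456789abcdef";

	if (x < 'p' || x > 't' || y < 0 || y > 15)
		return -1;
	memcpy(master, "/dev/ptyXX", 11);	/* 11 includes the trailing NUL */
	memcpy(slave, "/dev/ttyXX", 11);
	master[8] = slave[8] = x;		/* replace the first X */
	master[9] = slave[9] = hex[y];		/* replace the second X */
	return 0;
}
```

.PP
For example, \fIpty_names('p', 10, m, s)\fP would leave \fI/dev/ptypa\fP in
\fIm\fP and \fI/dev/ttypa\fP in \fIs\fP.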
                    340: .PP
                    341: In general, the method of obtaining a pair of master and
                    342: slave pseudo-terminals is to
                    343: find a pseudo-terminal which
                    344: is not currently in use.
                    345: The master half of a pseudo-terminal is a single-open device;
                    346: thus, each master may be opened in turn until an open succeeds.
                    347: The slave side of the pseudo-terminal is then opened,
                    348: and is set to the proper terminal modes if necessary.
                    349: The process then \fIfork\fPs; the child closes
                    350: the master side of the pseudo-terminal, and \fIexec\fPs the
                    351: appropriate program.  Meanwhile, the parent closes the
                    352: slave side of the pseudo-terminal and begins reading and
                    353: writing from the master side.  Sample code making use of
                    354: pseudo-terminals is given in Figure 8; this code assumes
                    355: that a connection on a socket \fIs\fP exists, connected
                    356: to a peer who wants a service of some kind, and that the
                    357: process has disassociated itself from any previous controlling terminal.
                    358: .KF
                    359: .DS
                    360: gotpty = 0;
                    361: for (c = 'p'; !gotpty && c <= 's'; c++) {
                    362:        line = "/dev/ptyXX";
                    363:        line[sizeof("/dev/pty")-1] = c;
                    364:        line[sizeof("/dev/ptyp")-1] = '0';
                    365:        if (stat(line, &statbuf) < 0)
                    366:                break;
                    367:        for (i = 0; i < 16; i++) {
                    368:                line[sizeof("/dev/ptyp")-1] = "0123456789abcdef"[i];
                    369:                master = open(line, O_RDWR);
               if (master >= 0) {
                    371:                        gotpty = 1;
                    372:                        break;
                    373:                }
                    374:        }
                    375: }
                    376: if (!gotpty) {
                    377:        syslog(LOG_ERR, "All network ports in use");
                    378:        exit(1);
                    379: }
                    380: 
                    381: line[sizeof("/dev/")-1] = 't';
                    382: slave = open(line, O_RDWR);    /* \fIslave\fP is now slave side */
                    383: if (slave < 0) {
                    384:        syslog(LOG_ERR, "Cannot open slave pty %s", line);
                    385:        exit(1);
                    386: }
                    387: 
                    388: ioctl(slave, TIOCGETP, &b);    /* Set slave tty modes */
                    389: b.sg_flags = CRMOD|XTABS|ANYP;
                    390: ioctl(slave, TIOCSETP, &b);
                    391: 
                    392: i = fork();
                    393: if (i < 0) {
                    394:        syslog(LOG_ERR, "fork: %m");
                    395:        exit(1);
                    396: } else if (i) {                /* Parent */
                    397:        close(slave);
                    398:        ...
                    399: } else {                /* Child */
                    400:        (void) close(s);
                    401:        (void) close(master);
                    402:        dup2(slave, 0);
                    403:        dup2(slave, 1);
                    404:        dup2(slave, 2);
                    405:        if (slave > 2)
                    406:                (void) close(slave);
                    407:        ...
                    408: }
                    409: .DE
                    410: .ce
                    411: Figure 8.  Creation and use of a pseudo terminal
                    412: .sp
                    413: .KE
                    414: .NH 2
                    415: Selecting specific protocols
                    416: .PP
                    417: If the third argument to the \fIsocket\fP call is 0,
                    418: \fIsocket\fP will select a default protocol to use with
                    419: the returned socket of the type requested.
                    420: The default protocol is usually correct, and alternate choices are not
                    421: usually available.
                    422: However, when using ``raw'' sockets to communicate directly with
                    423: lower-level protocols or hardware interfaces,
                    424: the protocol argument may be important for setting up demultiplexing.
                    425: For example, raw sockets in the Internet family may be used to implement
                    426: a new protocol above IP, and the socket will receive packets
                    427: only for the protocol specified.
                    428: To obtain a particular protocol one determines the protocol number
                    429: as defined within the communication domain.  For the Internet
                    430: domain one may use one of the library routines
                    431: discussed in section 3, such as \fIgetprotobyname\fP:
                    432: .DS
                    433: #include <sys/types.h>
                    434: #include <sys/socket.h>
                    435: #include <netinet/in.h>
                    436: #include <netdb.h>
                    437:  ...
                    438: pp = getprotobyname("newtcp");
                    439: s = socket(AF_INET, SOCK_STREAM, pp->p_proto);
                    440: .DE
                    441: This would result in a socket \fIs\fP using a stream
                    442: based connection, but with protocol type of ``newtcp''
                    443: instead of the default ``tcp.''
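.PP
Since \fIgetprotobyname\fP returns a null pointer when the name is unknown,
a more defensive version of the lookup might look like the sketch below.
The helper names \fIprotocol_number\fP and \fIstream_socket_for\fP are
hypothetical; whether a given name such as ``tcp'' is found depends on the
protocol database of the local system.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>

/*
 * Hypothetical helper: look up a protocol by name and return its number,
 * or -1 if the name is not in the protocol database.
 */
int
protocol_number(const char *name)
{
	struct protoent *pp = getprotobyname(name);

	return pp == NULL ? -1 : pp->p_proto;
}

/*
 * Hypothetical helper: create an Internet stream socket for the named
 * protocol.  Returns the descriptor, or -1 if the lookup or the socket
 * call fails.
 */
int
stream_socket_for(const char *name)
{
	int proto = protocol_number(name);

	if (proto < 0)
		return -1;
	return socket(AF_INET, SOCK_STREAM, proto);
}
```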
                    444: .PP
                    445: In the NS domain, the available socket protocols are defined in
                    446: <\fInetns/ns.h\fP>.  To create a raw socket for Xerox Error Protocol
                    447: messages, one might use:
                    448: .DS
                    449: #include <sys/types.h>
                    450: #include <sys/socket.h>
                    451: #include <netns/ns.h>
                    452:  ...
                    453: s = socket(AF_NS, SOCK_RAW, NSPROTO_ERROR);
                    454: .DE
                    455: .NH 2
                    456: Address binding
                    457: .PP
                    458: As was mentioned in section 2, 
                    459: binding addresses to sockets in the Internet and NS domains can be
                    460: fairly complex.  As a brief reminder, these associations
                    461: are composed of local and foreign
                    462: addresses, and local and foreign ports.  Port numbers are
                    463: allocated out of separate spaces, one for each system and one
                    464: for each domain on that system.
                    465: Through the \fIbind\fP system call, a
                    466: process may specify half of an association, the
                    467: <local address, local port> part, while the
                    468: \fIconnect\fP
                    469: and \fIaccept\fP
                    470: primitives are used to complete a socket's association by
                    471: specifying the <foreign address, foreign port> part.
                    472: Since the association is created in two steps the association
                    473: uniqueness requirement indicated previously could be violated unless
                    474: care is taken.  Further, it is unrealistic to expect user
                    475: programs to always know proper values to use for the local address
                    476: and local port since a host may reside on multiple networks and
                    477: the set of allocated port numbers is not directly accessible
                    478: to a user.
                    479: .PP
                    480: To simplify local address binding in the Internet domain the notion of a
                    481: \*(lqwildcard\*(rq address has been provided.  When an address
                    482: is specified as INADDR_ANY (a manifest constant defined in
                    483: <netinet/in.h>), the system interprets the address as 
                    484: \*(lqany valid address\*(rq.  For example, to bind a specific
                    485: port number to a socket, but leave the local address unspecified,
                    486: the following code might be used:
                    487: .DS
                    488: #include <sys/types.h>
                    489: #include <netinet/in.h>
                    490:  ...
                    491: struct sockaddr_in sin;
                    492:  ...
                    493: s = socket(AF_INET, SOCK_STREAM, 0);
                    494: sin.sin_family = AF_INET;
                    495: sin.sin_addr.s_addr = htonl(INADDR_ANY);
                    496: sin.sin_port = htons(MYPORT);
                    497: bind(s, (struct sockaddr *) &sin, sizeof (sin));
                    498: .DE
                    499: Sockets with wildcarded local addresses may receive messages
                    500: directed to the specified port number, and sent to any
                    501: of the possible addresses assigned to a host.  For example,
                    502: if a host has addresses 128.32.0.4 and 10.0.0.78, and a socket is bound as
                    503: above, the process will be
                    504: able to accept connection requests which are addressed to
                    505: 128.32.0.4 or 10.0.0.78.
If a server process wished to allow only hosts on a
given network to connect to it, it would bind
the address of the host on the appropriate network.
                    509: .PP
                    510: In a similar fashion, a local port may be left unspecified
                    511: (specified as zero), in which case the system will select an
                    512: appropriate port number for it.  This shortcut will work
                    513: both in the Internet and NS domains.  For example, to
                    514: bind a specific local address to a socket, but to leave the
                    515: local port number unspecified:
                    516: .DS
                    517: hp = gethostbyname(hostname);
                    518: if (hp == NULL) {
                    519:        ...
                    520: }
                    521: bcopy(hp->h_addr, (char *) &sin.sin_addr, hp->h_length);
                    522: sin.sin_port = htons(0);
                    523: bind(s, (struct sockaddr *) &sin, sizeof (sin));
                    524: .DE
                    525: The system selects the local port number based on two criteria.
                    526: The first is that on 4BSD systems,
                    527: Internet ports below IPPORT_RESERVED (1024) (for the Xerox domain,
                    528: 0 through 3000) are reserved
                    529: for privileged users (i.e., the super user);
                    530: Internet ports above IPPORT_USERRESERVED (50000) are reserved
                    531: for non-privileged servers.  The second is
                    532: that the port number is not currently bound to some other
                    533: socket.  In order to find a free Internet port number in the privileged
                    534: range the \fIrresvport\fP library routine may be used as follows
                    535: to return a stream socket with a privileged port number:
                    536: .DS
                    537: int lport = IPPORT_RESERVED \- 1;
                    538: int s;
                    539: ...
                    540: s = rresvport(&lport);
                    541: if (s < 0) {
                    542:        if (errno == EAGAIN)
                    543:                fprintf(stderr, "socket: all ports in use\en");
                    544:        else
                    545:                perror("rresvport: socket");
                    546:        ...
                    547: }
                    548: .DE
                    549: The restriction on allocating ports was imposed to allow processes
                    550: executing in a \*(lqsecure\*(rq environment to perform authentication
                    551: based on the originating address and port number.  For example,
                    552: the \fIrlogin\fP(1) command allows users to log in across a network
                    553: without being asked for a password, if two conditions hold:
                    554: First, the name of the system the user
                    555: is logging in from is in the file
                    556: \fI/etc/hosts.equiv\fP on the system he is logging
                    557: in to (or the system name and the user name are in
                    558: the user's \fI.rhosts\fP file in the user's home
                    559: directory), and second, that the user's rlogin
                    560: process is coming from a privileged port on the machine from which he is
                    561: logging in.  The port number and network address of the
                    562: machine from which the user is logging in can be determined either
                    563: by the \fIfrom\fP result of the \fIaccept\fP call, or
                    564: from the \fIgetpeername\fP call.
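.PP
The port test described above might be sketched as follows.
Both function names are hypothetical, and the check shown is only the
comparison against IPPORT_RESERVED; a real server performs further
checks (such as consulting \fI/etc/hosts.equiv\fP) that are not shown here.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Hypothetical helper: nonzero if the Internet address "from"
 * carries a port number in the privileged range.
 */
int
from_privileged_port(const struct sockaddr_in *from)
{
	unsigned short port = ntohs(from->sin_port);

	return (from->sin_family == AF_INET && port < IPPORT_RESERVED);
}

/*
 * Hypothetical helper: apply the same test to the peer of a connected
 * socket.  Returns 1 or 0, or -1 if the peer address is unavailable.
 */
int
peer_is_privileged(int s)
{
	struct sockaddr_in from;
	socklen_t len = sizeof (from);

	if (getpeername(s, (struct sockaddr *) &from, &len) < 0)
		return (-1);
	return (from_privileged_port(&from));
}
```

A server that already holds the \fIfrom\fP result of \fIaccept\fP can call
from_privileged_port directly and skip the \fIgetpeername\fP call.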
                    565: .PP
                    566: In certain cases the algorithm used by the system in selecting
                    567: port numbers is unsuitable for an application.  This is because
                    568: associations are created in a two-step process.  For example,
                    569: the Internet file transfer protocol, FTP, specifies that data
                    570: connections must always originate from the same local port.  However,
                    571: duplicate associations are avoided by connecting to different foreign
                    572: ports.  In this situation the system would disallow binding the
                    573: same local address and port number to a socket if a previous data
                    574: connection's socket still existed.  To override the default port
                    575: selection algorithm, an option call must be performed prior
                    576: to address binding:
                    577: .DS
                    578:  ...
                    579: int    on = 1;
                    580:  ...
                    581: setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
                    582: bind(s, (struct sockaddr *) &sin, sizeof (sin));
                    583: .DE
                    584: With the above call, local addresses may be bound which
                    585: are already in use.  This does not violate the uniqueness
                    586: requirement as the system still checks at connect time to
                    587: be sure any other sockets with the same local address and
                    588: port do not have the same foreign address and port.
                    589: If the association already exists, the error EADDRINUSE is returned.
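.PP
The option call and the bind are usually performed together, with each
return value checked.
A minimal sketch, assuming the caller has already filled in the address;
the helper name \fIbind_reusable\fP is hypothetical:

```c
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Hypothetical helper: create a stream socket, mark it SO_REUSEADDR,
 * and bind it to *sin.  Returns the socket, or -1 on any failure.
 */
int
bind_reusable(const struct sockaddr_in *sin)
{
	int s, on = 1;

	s = socket(AF_INET, SOCK_STREAM, 0);
	if (s < 0)
		return (-1);
	/* The option must be set before the address is bound. */
	if (setsockopt(s, SOL_SOCKET, SO_REUSEADDR, (char *) &on, sizeof (on)) < 0 ||
	    bind(s, (const struct sockaddr *) sin, sizeof (*sin)) < 0) {
		close(s);
		return (-1);
	}
	return (s);
}
```

A later \fIconnect\fP on the returned socket can still fail with EADDRINUSE
if the resulting association already exists, so that call needs its own check.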
                    590: .NH 2
                    591: Broadcasting and determining network configuration
                    592: .PP
                    593: By using a datagram socket, it is possible to send broadcast
                    594: packets on many networks supported by the system.
                    595: The network itself must support broadcast; the system
                    596: provides no simulation of broadcast in software.
                    597: Broadcast messages can place a high load on a network since they force
                    598: every host on the network to service them.  Consequently,
                    599: the ability to send broadcast packets has been limited
                    600: to sockets which are explicitly marked as allowing broadcasting.
                    601: Broadcast is typically used for one of two reasons:
                    602: it is desired to find a resource on a local network without prior
                    603: knowledge of its address,
                    604: or important functions such as routing require that information
                    605: be sent to all accessible neighbors.
                    606: .PP
                    607: To send a broadcast message, a datagram socket 
                    608: should be created:
                    609: .DS
                    610: s = socket(AF_INET, SOCK_DGRAM, 0);
                    611: .DE
                    612: or
                    613: .DS
                    614: s = socket(AF_NS, SOCK_DGRAM, 0);
                    615: .DE
                    616: The socket is marked as allowing broadcasting,
                    617: .DS
                    618: int    on = 1;
                    619: 
                    620: setsockopt(s, SOL_SOCKET, SO_BROADCAST, &on, sizeof (on));
                    621: .DE
                    622: and at least a port number should be bound to the socket:
                    623: .DS
                    624: sin.sin_family = AF_INET;
                    625: sin.sin_addr.s_addr = htonl(INADDR_ANY);
                    626: sin.sin_port = htons(MYPORT);
                    627: bind(s, (struct sockaddr *) &sin, sizeof (sin));
                    628: .DE
                    629: or, for the NS domain,
                    630: .DS
                    631: sns.sns_family = AF_NS;
                    632: netnum = htonl(net);
                    633: sns.sns_addr.x_net = *(union ns_net *) &netnum; /* insert net number */
                    634: sns.sns_addr.x_port = htons(MYPORT);
                    635: bind(s, (struct sockaddr *) &sns, sizeof (sns));
                    636: .DE
                    637: The destination address of the message to be broadcast
                    638: depends on the network(s) on which the message is to be broadcast.
                    639: The Internet domain supports a shorthand notation for broadcast
                    640: on the local network, the address INADDR_BROADCAST (defined in
                    641: <\fInetinet/in.h\fP>).
                    642: To determine the list of addresses for all reachable neighbors
                    643: requires knowledge of the networks to which the host is connected.
                    644: Since this information should
                    645: be obtained in a host-independent fashion and may be impossible
                    646: to derive, 4.3BSD provides a method of
                    647: retrieving this information from the system data structures.
                    648: The SIOCGIFCONF \fIioctl\fP call returns the interface
                    649: configuration of a host in the form of a
                    650: single \fIifconf\fP structure; this structure contains
                    651: a ``data area'' which is made up of an array of
                    652: \fIifreq\fP structures, one for each network interface
                    653: to which the host is connected.
                    654: These structures are defined in
                    655: \fI<net/if.h>\fP as follows:
                    656: .DS
                    657: .if t .ta .5i 1.0i 1.5i 3.5i
                    658: .if n .ta .7i 1.4i 2.1i 3.4i
                    659: struct ifconf {
                    660:        int     ifc_len;                /* size of associated buffer */
                    661:        union {
                    662:                caddr_t ifcu_buf;
                    663:                struct  ifreq *ifcu_req;
                    664:        } ifc_ifcu;
                    665: };
                    666: 
                    667: #define        ifc_buf ifc_ifcu.ifcu_buf               /* buffer address */
                    668: #define        ifc_req ifc_ifcu.ifcu_req               /* array of structures returned */
                    669: 
                    670: #define        IFNAMSIZ        16
                    671: 
                    672: struct ifreq {
                    673:        char    ifr_name[IFNAMSIZ];             /* if name, e.g. "en0" */
                    674:        union {
                    675:                struct  sockaddr ifru_addr;
                    676:                struct  sockaddr ifru_dstaddr;
                    677:                struct  sockaddr ifru_broadaddr;
                    678:                short   ifru_flags;
                    679:                caddr_t ifru_data;
                    680:        } ifr_ifru;
                    681: };
                    682: 
                    683: .if t .ta \w'  #define'u +\w'  ifr_broadaddr'u +\w'  ifr_ifru.ifru_broadaddr'u
                    684: #define        ifr_addr        ifr_ifru.ifru_addr      /* address */
                    685: #define        ifr_dstaddr     ifr_ifru.ifru_dstaddr   /* other end of p-to-p link */
                    686: #define        ifr_broadaddr   ifr_ifru.ifru_broadaddr /* broadcast address */
                    687: #define        ifr_flags       ifr_ifru.ifru_flags     /* flags */
                    688: #define        ifr_data        ifr_ifru.ifru_data      /* for use by interface */
                    689: .DE
                    690: The actual call which obtains the
                    691: interface configuration is
                    692: .DS
                    693: struct ifconf ifc;
                    694: char buf[BUFSIZ];
                    695: 
                    696: ifc.ifc_len = sizeof (buf);
                    697: ifc.ifc_buf = buf;
                    698: if (ioctl(s, SIOCGIFCONF, (char *) &ifc) < 0) {
                    699:        ...
                    700: }
                    701: .DE
                    702: After this call \fIbuf\fP will contain one \fIifreq\fP structure for
                    703: each network to which the host is connected, and
                    704: \fIifc.ifc_len\fP will have been modified to reflect the number
                    705: of bytes used by the \fIifreq\fP structures.
                    706: .PP
                    707: For each structure
                    708: there exists a set of ``interface flags'' which tell
                    709: whether the network corresponding to that interface is
                    710: up or down, point to point or broadcast, etc.  The
                    711: SIOCGIFFLAGS \fIioctl\fP retrieves these
                    712: flags for an interface specified by an \fIifreq\fP
                    713: structure as follows:
                    714: .DS
                    715: struct ifreq *ifr;
                    716: 
                    717: ifr = ifc.ifc_req;
                    718: 
                    719: for (n = ifc.ifc_len / sizeof (struct ifreq); --n >= 0; ifr++) {
                    720:        /*
                    721:         * We must be careful that we don't use an interface
                    722:         * devoted to an address family other than those intended;
                    723:         * if we were interested in NS interfaces, the
                    724:         * AF_INET would be AF_NS.
                    725:         */
                    726:        if (ifr->ifr_addr.sa_family != AF_INET)
                    727:                continue;
                    728:        if (ioctl(s, SIOCGIFFLAGS, (char *) ifr) < 0) {
                    729:                ...
                    730:        }
                    731:        /*
                    732:         * Skip boring cases.
                    733:         */
                    734:        if ((ifr->ifr_flags & IFF_UP) == 0 ||
                    735:            (ifr->ifr_flags & IFF_LOOPBACK) ||
                    736:            (ifr->ifr_flags & (IFF_BROADCAST | IFF_POINTTOPOINT)) == 0)
                    737:                continue;
                    738: .DE
                    739: .PP
                    740: Once the flags have been obtained, the broadcast address 
                    741: must be obtained.  In the case of broadcast networks this is
                    742: done via the SIOCGIFBRDADDR \fIioctl\fP, while for point-to-point networks
                    743: the address of the destination host is obtained with SIOCGIFDSTADDR.
                    744: .DS
                    745: struct sockaddr dst;
                    746: 
                    747: if (ifr->ifr_flags & IFF_POINTTOPOINT) {
                    748:        if (ioctl(s, SIOCGIFDSTADDR, (char *) ifr) < 0) {
                    749:                ...
                    750:        }
                    751:        bcopy((char *) &ifr->ifr_dstaddr, (char *) &dst, sizeof (ifr->ifr_dstaddr));
                    752: } else if (ifr->ifr_flags & IFF_BROADCAST) {
                    753:        if (ioctl(s, SIOCGIFBRDADDR, (char *) ifr) < 0) {
                    754:                ...
                    755:        }
                    756:        bcopy((char *) &ifr->ifr_broadaddr, (char *) &dst, sizeof (ifr->ifr_broadaddr));
                    757: }
                    758: .DE
                    759: .PP
                    760: After the appropriate \fIioctl\fP's have obtained the broadcast
                    761: or destination address (now in \fIdst\fP), the \fIsendto\fP call may be
                    762: used:
                    763: .DS
                    764:        sendto(s, buf, buflen, 0, (struct sockaddr *)&dst, sizeof (dst));
                    765: }
                    766: .DE
                    767: In the above loop one \fIsendto\fP occurs for every
                    768: interface to which the host is connected that supports the notion of
                    769: broadcast or point-to-point addressing.
                    770: If a process only wished to send broadcast
                    771: messages on a given network, code similar to that outlined above
                    772: would be used, but the loop would need to find the
                    773: correct destination address.
                    774: .PP
                    775: Received broadcast messages contain the sender's address
                    776: and port, as datagram sockets are bound before
                    777: a message is allowed to go out.
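.PP
When only the local network is of interest, the INADDR_BROADCAST
shorthand avoids the interface scan altogether.
A minimal sketch, in which both helper names are hypothetical and the
port is whatever the application has chosen:

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Hypothetical helper: create a datagram socket that is marked as
 * allowing broadcasting.  Returns the socket, or -1 on failure.
 */
int
open_broadcast_socket(void)
{
	int s, on = 1;

	s = socket(AF_INET, SOCK_DGRAM, 0);
	if (s < 0)
		return (-1);
	if (setsockopt(s, SOL_SOCKET, SO_BROADCAST, (char *) &on, sizeof (on)) < 0) {
		close(s);
		return (-1);
	}
	return (s);
}

/*
 * Hypothetical helper: fill in *dst with the local-network
 * broadcast address and the given port.
 */
void
local_broadcast_dest(struct sockaddr_in *dst, unsigned short port)
{
	memset(dst, 0, sizeof (*dst));
	dst->sin_family = AF_INET;
	dst->sin_addr.s_addr = htonl(INADDR_BROADCAST);
	dst->sin_port = htons(port);
}
```

The message is then sent with
sendto(s, buf, buflen, 0, (struct sockaddr *) &dst, sizeof (dst)),
as in the loop shown earlier.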
                    778: .NH 2
                    779: Socket Options
                    780: .PP
                    781: It is possible to set and get a number of options on sockets
                    782: via the \fIsetsockopt\fP and \fIgetsockopt\fP system calls.
                    783: These options include such things as marking a socket for
                    784: broadcasting, not to route, to linger on close, etc.
                    785: The general forms of the calls are:
                    786: .DS
                    787: setsockopt(s, level, optname, optval, optlen);
                    788: .DE
                    789: and
                    790: .DS
                    791: getsockopt(s, level, optname, optval, optlen);
                    792: .DE
                    793: .PP
                    794: The parameters to the calls are as follows: \fIs\fP
                    795: is the socket on which the option is to be applied.
                    796: \fILevel\fP specifies the protocol layer on which the
                    797: option is to be applied; in most cases this is
                    798: the ``socket level'', indicated by the symbolic constant
                    799: SOL_SOCKET, defined in \fI<sys/socket.h>.\fP
                    800: The actual option is specified in \fIoptname\fP, and is
                    801: a symbolic constant also defined in \fI<sys/socket.h>\fP.
                    802: \fIOptval\fP points to the value of the
                    803: option (in most cases, whether the option is to be turned
                    804: on or off), and \fIoptlen\fP gives the length of that
                    805: value.
                    806: For \fIgetsockopt\fP, \fIoptlen\fP is
                    807: a value-result parameter, initially set to the size of
                    808: the storage area pointed to by \fIoptval\fP, and modified
                    809: upon return to indicate the actual amount of storage used.
                    810: .PP
                    811: An example should help clarify things.  It is sometimes
                    812: useful to determine the type (e.g., stream, datagram, etc.)
                    813: of an existing socket; programs
                    814: under \fIinetd\fP (described below) may need to perform this
                    815: task.  This can be accomplished as follows via the
                    816: SO_TYPE socket option and the \fIgetsockopt\fP call:
                    817: .DS
                    818: #include <sys/types.h>
                    819: #include <sys/socket.h>
                    820: 
                    821: int type, size;
                    822: 
                    823: size = sizeof (int);
                    824: 
                    825: if (getsockopt(s, SOL_SOCKET, SO_TYPE, (char *) &type, &size) < 0) {
                    826:        ...
                    827: }
                    828: .DE
                    829: After the \fIgetsockopt\fP call, \fItype\fP will be set
                    830: to the value of the socket type, as defined in
                    831: \fI<sys/socket.h>\fP.  If, for example, the socket were
                    832: a datagram socket, \fItype\fP would have the value
                    833: corresponding to SOCK_DGRAM.
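.PP
The same conventions apply when the option value is a structure rather
than an integer.
The sketch below sets and reads back the linger-on-close option using
\fIstruct linger\fP from \fI<sys/socket.h>\fP.
The helper names are hypothetical, and the length is declared as
\fIsocklen_t\fP, which is an assumption about the host system; the
example above uses a plain \fIint\fP.

```c
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: ask the system to linger on close for the
 * given interval.  Returns the result of setsockopt.
 */
int
set_linger(int s, int interval)
{
	struct linger l;

	l.l_onoff = 1;			/* turn the option on */
	l.l_linger = interval;		/* how long to linger */
	return (setsockopt(s, SOL_SOCKET, SO_LINGER, (char *) &l, sizeof (l)));
}

/*
 * Hypothetical helper: read the option back into *l.  Returns the
 * number of bytes the system stored (the value-result length),
 * or -1 on failure.
 */
int
get_linger(int s, struct linger *l)
{
	socklen_t size = sizeof (*l);	/* in: space available */

	if (getsockopt(s, SOL_SOCKET, SO_LINGER, (char *) l, &size) < 0)
		return (-1);
	return ((int) size);		/* out: space actually used */
}
```

After a successful get_linger call the \fIl_onoff\fP member is nonzero
if the option is set.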
                    834: .NH 2
                    835: NS Packet Sequences
                    836: .PP
                    837: The semantics of NS connections demand that
                    838: the user both be able to look inside the network header associated
                    839: with any incoming packet and be able to specify what should go
                    840: in certain fields of an outgoing packet.
                    841: Using different calls to \fIsetsockopt\fP, it is possible
                    842: to indicate whether prototype headers will be associated by
                    843: the user with each outgoing packet (SO_HEADERS_ON_OUTPUT),
                    844: to indicate whether the headers received by the system should be
                    845: delivered to the user (SO_HEADERS_ON_INPUT), or to indicate
                    846: default information that should be associated with all
                    847: outgoing packets on a given socket (SO_DEFAULT_HEADERS).
                    848: .PP
                    849: The contents of a SPP header (minus the IDP header) are:
                    850: .DS
                    851: .if t .ta \w"  #define"u +\w"  u_short"u +2.0i
                    852: struct sphdr {
                    853:        u_char  sp_cc;          /* connection control */
                    854: #define        SP_SP   0x80            /* system packet */
                    855: #define        SP_SA   0x40            /* send acknowledgement */
                    856: #define        SP_OB   0x20            /* attention (out of band data) */
                    857: #define        SP_EM   0x10            /* end of message */
                    858:        u_char  sp_dt;          /* datastream type */
                    859:        u_short sp_sid;         /* source connection identifier */
                    860:        u_short sp_did;         /* destination connection identifier */
                    861:        u_short sp_seq;         /* sequence number */
                    862:        u_short sp_ack;         /* acknowledge number */
                    863:        u_short sp_alo;         /* allocation number */
                    864: };
                    865: .DE
                    866: Here, the items of interest are the \fIdatastream type\fP and
                    867: the \fIconnection control\fP fields.  The semantics of the
                    868: datastream type are defined by the application(s) in question;
                    869: the value of this field is, by default, zero, but it can be
                    870: used to indicate things such as Xerox's Bulk Data Transfer
                    871: Protocol (in which case it is set to one).  The connection control
                    872: field is a mask of the flags defined just below it.  The user may
                    873: set or clear the end-of-message bit to indicate
                    874: that a given message is the last of a given substream type,
                    875: or may set/clear the attention bit as an alternate way to
                    876: indicate that a packet should be sent out-of-band.
                    877: As an example, to associate prototype headers with outgoing
                    878: SPP packets, consider:
                    879: .DS
                    880: #include <sys/types.h>
                    881: #include <sys/socket.h>
                    882: #include <netns/ns.h>
                    883: #include <netns/sp.h>
                    884:  ...
                    885: struct sockaddr_ns sns, to;
                    886: int s, on = 1;
                    887: struct databuf {
                    888:        struct sphdr proto_spp; /* prototype header */
                    889:        char buf[534];          /* max. possible data by Xerox std. */
                    890: } buf;
                    891:  ...
                    892: s = socket(AF_NS, SOCK_SEQPACKET, 0);
                    893:  ...
                    894: bind(s, (struct sockaddr *) &sns, sizeof (sns));
                    895: setsockopt(s, NSPROTO_SPP, SO_HEADERS_ON_OUTPUT, &on, sizeof(on));
                    896:  ...
                    897: buf.proto_spp.sp_dt = 1;       /* bulk data */
                    898: buf.proto_spp.sp_cc = SP_EM;   /* end-of-message */
                    899: strcpy(buf.buf, "hello world\en");
                    900: sendto(s, (char *) &buf, sizeof(struct sphdr) + strlen("hello world\en"),
                    901:     0, (struct sockaddr *) &to, sizeof(to));
                    902:  ...
                    903: .DE
                    904: Note that one must be careful when writing headers; if the prototype
                    905: header is not written with the data with which it is to be associated,
                    906: the kernel will treat the first few bytes of the data as the
                    907: header, with unpredictable results.
                    908: To turn off the above association, and to indicate that packet
                    909: headers received by the system should be passed up to the user,
                    910: one might use:
                    911: .DS
                    912: #include <sys/types.h>
                    913: #include <sys/socket.h>
                    914: #include <netns/ns.h>
                    915: #include <netns/sp.h>
                    916:  ...
                    917: struct sockaddr_ns sns;
                    918: int s, on = 1, off = 0;
                    919:  ...
                    920: s = socket(AF_NS, SOCK_SEQPACKET, 0);
                    921:  ...
                    922: bind(s, (struct sockaddr *) &sns, sizeof (sns));
                    923: setsockopt(s, NSPROTO_SPP, SO_HEADERS_ON_OUTPUT, &off, sizeof(off));
                    924: setsockopt(s, NSPROTO_SPP, SO_HEADERS_ON_INPUT, &on, sizeof(on));
                    925:  ...
                    926: .DE
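.PP
The earlier warning about writing the prototype header together with its
data can be made harder to violate by building both in one buffer with a
single routine.
The sketch below is illustrative only: it declares a local stand-in for
\fIstruct sphdr\fP (copied from the declaration shown above) so that it
can be compiled where \fI<netns/sp.h>\fP is not available, and the helper
name \fIbuild_message\fP is hypothetical.

```c
#include <string.h>

/*
 * Local stand-in for the sphdr declaration shown earlier; on a system
 * that provides <netns/sp.h>, include that header instead.
 */
struct sphdr {
	unsigned char	sp_cc;		/* connection control */
	unsigned char	sp_dt;		/* datastream type */
	unsigned short	sp_sid;		/* source connection identifier */
	unsigned short	sp_did;		/* destination connection identifier */
	unsigned short	sp_seq;		/* sequence number */
	unsigned short	sp_ack;		/* acknowledge number */
	unsigned short	sp_alo;		/* allocation number */
};
#define	SP_EM		0x10		/* end of message */
#define	SPP_MAXDATA	534		/* max. data, as in the example above */

struct databuf {
	struct sphdr proto_spp;		/* prototype header */
	char buf[SPP_MAXDATA];		/* data follows the header directly */
};

/*
 * Hypothetical helper: fill in the prototype header and copy the data
 * behind it.  Returns the number of bytes to hand to sendto or write,
 * or -1 if the data will not fit.
 */
int
build_message(struct databuf *db, int dt, int cc, const char *data, int len)
{
	if (len < 0 || len > SPP_MAXDATA)
		return (-1);
	memset(&db->proto_spp, 0, sizeof (db->proto_spp));
	db->proto_spp.sp_dt = dt;
	db->proto_spp.sp_cc = cc;
	memcpy(db->buf, data, len);
	return ((int) sizeof (struct sphdr) + len);
}
```

The value returned is the length to pass along with the address of the
buffer, so the header and its data always travel in the same call.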
                    927: .PP
                    928: Output is handled somewhat differently in the IDP world.
                    929: The header of an IDP-level packet looks like:
                    930: .DS
                    931: .if t .ta \w'struct  'u +\w"  struct ns_addr"u +2.0i
                    932: struct idp {
                    933:        u_short idp_sum;        /* Checksum */
                    934:        u_short idp_len;        /* Length, in bytes, including header */
                    935:        u_char  idp_tc;         /* Transport Control (i.e., hop count) */
                    936:        u_char  idp_pt;         /* Packet Type (i.e., level 2 protocol) */
                    937:        struct ns_addr  idp_dna;        /* Destination Network Address */
                    938:        struct ns_addr  idp_sna;        /* Source Network Address */
                    939: };
                    940: .DE
                    941: The primary field of interest in an IDP header is the \fIpacket type\fP
                    942: field.  The standard values for this field are (as defined
                    943: in <\fInetns/ns.h\fP>):
                    944: .DS
                    945: .if t .ta \w"  #define"u +\w"  NSPROTO_ERROR"u +1.0i
                    946: #define NSPROTO_RI     1               /* Routing Information */
                    947: #define NSPROTO_ECHO   2               /* Echo Protocol */
                    948: #define NSPROTO_ERROR  3               /* Error Protocol */
                    949: #define NSPROTO_PE     4               /* Packet Exchange */
                    950: #define NSPROTO_SPP    5               /* Sequenced Packet */
                    951: .DE
                    952: For SPP connections, the contents of this field are
                    953: automatically set to NSPROTO_SPP; for IDP packets,
                    954: this value defaults to zero, which means ``unknown''.
                    955: .PP
                    956: Setting the value of that field with SO_DEFAULT_HEADERS is
                    957: easy:
                    958: .DS
                    959: #include <sys/types.h>
                    960: #include <sys/socket.h>
                    961: #include <netns/ns.h>
                    962: #include <netns/idp.h>
                    963:  ...
                    964: struct sockaddr_ns sns;
                    965: struct idp proto_idp;          /* prototype header */
                    966: int s, on = 1;
                    967:  ...
                    968: s = socket(AF_NS, SOCK_DGRAM, 0);
                    969:  ...
                    970: bind(s, (struct sockaddr *) &sns, sizeof (sns));
                    971: proto_idp.idp_pt = NSPROTO_PE; /* packet exchange */
                    972: setsockopt(s, NSPROTO_IDP, SO_DEFAULT_HEADERS, (char *) &proto_idp,
                    973:     sizeof(proto_idp));
                    974:  ...
                    975: .DE
                    976: .PP
                    977: Using SO_HEADERS_ON_OUTPUT is somewhat more difficult.  When
                    978: SO_HEADERS_ON_OUTPUT is turned on for an IDP socket, the socket
                    979: becomes (for all intents and purposes) a raw socket.  In this
                    980: case, all the fields of the prototype header (except the 
                    981: length and checksum fields, which are computed by the kernel)
                    982: must be filled in correctly in order for the socket to send and
                    983: receive data in a sensible manner.  To be more specific, the
                    984: source address must be set to that of the host sending the
                    985: data; the destination address must be set to that of the
                    986: host for whom the data is intended; the packet type must be
                    987: set to whatever value is desired; and the hopcount must be
                    988: set to some reasonable value (almost always zero).  It should
                    989: also be noted that simply sending data using \fIwrite\fP
                    990: will not work unless a \fIconnect\fP or \fIsendto\fP call
                    991: is used, in spite of the fact that it is the destination
                    992: address in the prototype header that is used, not the one
                    993: given in either of those calls.  For almost
                    994: all IDP applications, using SO_DEFAULT_HEADERS is easier and
                    995: more desirable than writing headers.
                    996: .NH 2
                    997: Three-way Handshake
                    998: .PP
                    999: The semantics of SPP connections indicate that a three-way
                   1000: handshake, involving changes in the datastream type, should \(em
                   1001: but is not absolutely required to \(em take place before a SPP
                   1002: connection is closed.  Almost all SPP connections are
                   1003: ``well-behaved'' in this manner; when communicating with
                   1004: any process, it is best to assume that the three-way handshake
                   1005: is required unless it is known for certain that it is not
                   1006: required.  In a three-way close, the closing process
                   1007: indicates that it wishes to close the connection by sending
                   1008: a zero-length packet with end-of-message set and with
                   1009: datastream type 254.  The other side of the connection
                   1010: indicates that it is OK to close by sending a zero-length
                   1011: packet with end-of-message set and datastream type 255.  Finally,
                   1012: the closing process replies with a zero-length packet with 
                   1013: datastream type 255; at this point, the connection is considered
                   1014: closed.  The following code fragments are simplified examples
                   1015: of how one might handle this three-way handshake at the user
                   1016: level; in the future, support for this type of close will
                   1017: probably be provided as part of the C library or as part of
                   1018: the kernel.  The first code fragment below illustrates how a process
                   1019: might handle the three-way handshake if it sees that the process it
                   1020: is communicating with wants to close the connection:
                   1021: .DS
                   1022: #include <sys/types.h>
                   1023: #include <sys/socket.h>
                   1024: #include <netns/ns.h>
                   1025: #include <netns/sp.h>
                   1026:  ...
                   1027: #ifndef SPPSST_END
                   1028: #define SPPSST_END 254
                   1029: #define SPPSST_ENDREPLY 255
                   1030: #endif
                   1031: struct sphdr proto_sp;
                   1032: int s;
                   1033:  ...
                   1034: read(s, buf, BUFSIZE);
                   1035: if (((struct sphdr *)buf)->sp_dt == SPPSST_END) {
                   1036:        /*
                   1037:         * SPPSST_END indicates that the other side wants to
                   1038:         * close.
                   1039:         */
                   1040:        proto_sp.sp_dt = SPPSST_ENDREPLY;
                   1041:        proto_sp.sp_cc = SP_EM;
                   1042:        setsockopt(s, NSPROTO_SPP, SO_DEFAULT_HEADERS, (char *)&proto_sp,
                   1043:            sizeof(proto_sp));
                   1044:        write(s, buf, 0);
                   1045:        /*
                   1046:         * Write a zero-length packet with datastream type = SPPSST_ENDREPLY
                   1047:         * to indicate that the close is OK with us.  The packet that we
                   1048:         * don't see (because we don't look for it) is another packet
                   1049:         * from the other side of the connection, with SPPSST_ENDREPLY
                   1050:         * on it, too.  Once that packet is sent, the connection is
                   1051:         * considered closed; note that we really ought to retransmit
                   1052:         * the close for some time if we do not get a reply.
                   1053:         */
                   1054:        close(s);
                   1055: }
                   1056:  ...
                   1057: .DE
                   1058: To indicate to another process that we would like to close the
                   1059: connection, the following code would suffice:
                   1060: .DS
                   1061: #include <sys/types.h>
                   1062: #include <sys/socket.h>
                   1063: #include <netns/ns.h>
                   1064: #include <netns/sp.h>
                   1065:  ...
                   1066: #ifndef SPPSST_END
                   1067: #define SPPSST_END 254
                   1068: #define SPPSST_ENDREPLY 255
                   1069: #endif
                   1070: struct sphdr proto_sp;
                   1071: int s;
                   1072:  ...
                   1073: proto_sp.sp_dt = SPPSST_END;
                   1074: proto_sp.sp_cc = SP_EM;
                   1075: setsockopt(s, NSPROTO_SPP, SO_DEFAULT_HEADERS, (char *)&proto_sp,
                   1076:     sizeof(proto_sp));
                   1077: write(s, buf, 0);      /* send the end request */
                   1078: proto_sp.sp_dt = SPPSST_ENDREPLY;
                   1079: setsockopt(s, NSPROTO_SPP, SO_DEFAULT_HEADERS, (char *)&proto_sp,
                   1080:     sizeof(proto_sp));
                   1081: /*
                   1082:  * We assume (perhaps unwisely)
                   1083:  * that the other side will send the
                   1084:  * ENDREPLY, so we'll just send our final ENDREPLY
                   1085:  * as if we'd seen theirs already.
                   1086:  */
                   1087: write(s, buf, 0);
                   1088: close(s);
                   1089:  ...
                   1090: .DE
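.PP
Both fragments above take shortcuts: neither waits for the packet
it expects from the other side.  As a rough sketch of the full
exchange, the handshake can be written as a small state machine
that is driven by the datastream type of each zero-length packet
received.  The state names and the functions \fIclose_begin\fP and
\fIclose_input\fP are hypothetical and are not part of any system
include file; only the values 254 and 255 come from the text.

```c
#include <assert.h>
#include <stddef.h>

/* Datastream types used by the three-way close, as in the text. */
#define SPPSST_END      254
#define SPPSST_ENDREPLY 255

/* Hypothetical states for this sketch; not defined by any system header. */
enum close_state {
	CS_OPEN,		/* no close in progress */
	CS_END_SENT,		/* we sent END, awaiting ENDREPLY */
	CS_ENDREPLY_SENT,	/* we answered END, awaiting final ENDREPLY */
	CS_CLOSED		/* handshake complete */
};

/* Start a close.  Returns the datastream type to send. */
int
close_begin(enum close_state *st)
{
	assert(st != NULL && *st == CS_OPEN);
	*st = CS_END_SENT;
	return SPPSST_END;
}

/*
 * Process the datastream type of a received zero-length packet.
 * Returns the datastream type to send in reply, or -1 if nothing
 * should be sent.
 */
int
close_input(enum close_state *st, int dt)
{
	assert(st != NULL);
	switch (*st) {
	case CS_OPEN:
		if (dt == SPPSST_END) {
			*st = CS_ENDREPLY_SENT;
			return SPPSST_ENDREPLY;
		}
		break;
	case CS_END_SENT:
		if (dt == SPPSST_ENDREPLY) {
			*st = CS_CLOSED;
			return SPPSST_ENDREPLY;	/* the final packet */
		}
		break;
	case CS_ENDREPLY_SENT:
		if (dt == SPPSST_ENDREPLY)
			*st = CS_CLOSED;
		break;
	case CS_CLOSED:
		break;
	}
	return -1;
}
```

A caller would place each returned value in \fIsp_dt\fP, set
SO_DEFAULT_HEADERS, and write a zero-length packet, as in the
fragments above; the socket would be closed only once the state
reaches CS_CLOSED.  Retransmission of an unanswered request is
not modeled here.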
                   1091: .NH 2
                   1092: Packet Exchange
                   1093: .PP
                   1094: The Xerox standard protocols include a protocol that is both
                   1095: reliable and datagram-oriented.  This protocol is known as
                   1096: Packet Exchange (PEX or PE) and, like SPP, is layered on top
                   1097: of IDP.  PEX is important for a number of things: Courier
                   1098: remote procedure calls may be expedited through the use
                   1099: of PEX, and many Xerox servers are located by doing a PEX
                   1100: ``BroadcastForServers'' operation.  Although there is no
                   1101: implementation of PEX in the kernel,
                   1102: it may be simulated at the user level with some clever coding
                   1103: and the use of one peculiar \fIgetsockopt\fP.  A PEX packet
                   1104: looks like:
                   1105: .DS
                   1106: .if t .ta \w'struct  'u +\w"  struct idp"u +2.0i
                   1107: /*
                   1108:  * The packet-exchange header shown here is not defined
                   1109:  * as part of any of the system include files.
                   1110:  */
                   1111: struct pex {
                   1112:        struct idp      p_idp;  /* idp header */
                   1113:        u_short ph_id[2];       /* unique transaction ID for pex */
                   1114:        u_short ph_client;      /* client type field for pex */
                   1115: };
                   1116: .DE
                   1117: The \fIph_id\fP field is used to hold a ``unique id'' that
                   1118: is used in duplicate suppression; the \fIph_client\fP
                   1119: field indicates the PEX client type (similar to the packet
                   1120: type field in the IDP header).  PEX reliability stems from the
                   1121: fact that it is an idempotent (``I send a packet to you, you
                   1122: send a packet to me'') protocol.  Processes on each side of
                   1123: the connection may use the unique id to determine if they have
                   1124: seen a given packet before (the unique id field differs on each
                   1125: packet sent) so that duplicates may be detected, and to indicate
                   1126: which message a given packet is in response to.  If a packet with
                   1127: a given unique id is sent and no response is received in a given
                   1128: amount of time, the packet is retransmitted until it is decided
                   1129: that no response will ever be received.  To simulate PEX, one
                   1130: must be able to generate unique ids -- something that is hard to
                   1131: do at the user level with any real guarantee that the id is really
                   1132: unique.  Therefore, a means (via \fIgetsockopt\fP) has been provided
                   1133: for getting unique ids from the kernel.  The following code fragment
                   1134: indicates how to get a unique id:
                   1135: .DS
                   1136: long uniqueid;
                   1137: int s, idsize = sizeof(uniqueid);
                   1138:  ...
                   1139: s = socket(AF_NS, SOCK_DGRAM, 0);
                   1140:  ...
                   1141: /* get id from the kernel -- only on IDP sockets */
                   1142: getsockopt(s, NSPROTO_PE, SO_SEQNO, (char *)&uniqueid, &idsize);
                   1143:  ...
                   1144: .DE
                   1145: The retransmission and duplicate suppression code required to
                   1146: simulate PEX fully is left as an exercise for the reader.
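.PP
As a starting point for the duplicate suppression half of that
exercise, the following is a minimal sketch that remembers the
most recently seen ids in a small ring.  It assumes that the two
halves of \fIph_id\fP have been combined into a single 32-bit
value; the structure \fIpex_seen\fP, the function
\fIpex_is_duplicate\fP, and the window size of 8 are all
hypothetical choices made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PEX_SEEN_MAX 8		/* hypothetical window size */

/* Ring of recently seen transaction ids (hypothetical structure). */
struct pex_seen {
	uint32_t ids[PEX_SEEN_MAX];
	size_t count;		/* number of valid entries in ids[] */
	size_t next;		/* slot that will be overwritten next */
};

/*
 * Returns 1 if id is among the remembered ids (a duplicate).
 * Otherwise records id, evicting the oldest entry when the ring
 * is full, and returns 0.
 */
int
pex_is_duplicate(struct pex_seen *t, uint32_t id)
{
	size_t i;

	assert(t != NULL);
	for (i = 0; i < t->count; i++)
		if (t->ids[i] == id)
			return 1;
	t->ids[t->next] = id;
	t->next = (t->next + 1) % PEX_SEEN_MAX;
	if (t->count < PEX_SEEN_MAX)
		t->count++;
	return 0;
}
```

A fixed window trades memory for accuracy: an id older than the
last PEX_SEEN_MAX packets is forgotten, so a very late duplicate
would be accepted as new.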
                   1147: .NH 2
                   1148: Inetd
                   1149: .PP
                   1150: One of the daemons provided with 4.3BSD is \fIinetd\fP, the
                   1151: so-called ``internet super-server.''  \fIInetd\fP is invoked at boot
                   1152: time, and determines from the file \fI/etc/inetd.conf\fP the
                   1153: servers for which it is to listen.  Once this information has been
                   1154: read and a pristine environment created, \fIinetd\fP proceeds
                   1155: to create one socket for each service it is to listen for,
                   1156: binding the appropriate port number to each socket.
                   1157: .PP
                   1158: \fIInetd\fP then performs a \fIselect\fP on all these
                   1159: sockets for read availability, waiting for somebody wishing
                   1160: a connection to the service corresponding to
                   1161: that socket.  \fIInetd\fP then performs an \fIaccept\fP on
                   1162: the socket in question, \fIfork\fPs, \fIdup\fPs the new
                   1163: socket to file descriptors 0 and 1 (stdin and
                   1164: stdout), closes other open file
                   1165: descriptors, and \fIexec\fPs the appropriate server.
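.PP
The sequence just described can be sketched roughly as follows.
This is not the \fIinetd\fP source; it is an illustration written
against the POSIX interfaces, it omits the \fIselect\fP loop, and
the function names \fIspawn_server\fP and \fIserve_one\fP are
hypothetical.  For brevity the child closes only the original
connection descriptor rather than every other open descriptor.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Fork a child with conn as its standard input and output and
 * exec the server at path.  Returns the child's pid to the parent,
 * or -1 if the fork failed.  The parent's copy of conn is closed.
 */
pid_t
spawn_server(int conn, const char *path, char *const argv[])
{
	pid_t pid;

	assert(conn >= 0 && path != NULL && argv != NULL);
	pid = fork();
	if (pid == 0) {
		/* Child: make the connection stdin and stdout. */
		if (dup2(conn, 0) < 0 || dup2(conn, 1) < 0)
			_exit(127);
		if (conn > 1)
			close(conn);
		execv(path, argv);
		_exit(127);	/* reached only if execv failed */
	}
	/* Parent (or failed fork): the connection is no longer needed. */
	close(conn);
	return pid;
}

/* Accept one connection on the listening socket and hand it off. */
pid_t
serve_one(int lsock, const char *path, char *const argv[])
{
	int conn = accept(lsock, NULL, NULL);

	if (conn < 0)
		return -1;
	return spawn_server(conn, path, argv);
}
```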
                   1166: .PP
                   1167: Servers making use of \fIinetd\fP are considerably simplified,
                   1168: as \fIinetd\fP takes care of the majority of the IPC work
                   1169: required in establishing a connection.  The server invoked
                   1170: by \fIinetd\fP expects the socket connected to its client
                   1171: on file descriptors 0 and 1, and may immediately perform
                   1172: any operations such as \fIread\fP, \fIwrite\fP, \fIsend\fP,
                   1173: or \fIrecv\fP.  Indeed, servers may use
                   1174: buffered I/O as provided by the ``stdio'' conventions, as
                   1175: long as they remember to use \fIfflush\fP when appropriate.
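.PP
To illustrate how little such a server needs to contain, the
following sketch implements a made-up service that returns each
line it receives in upper case.  The service and the function
name \fIserve\fP are hypothetical; the point is only that the
server uses ordinary buffered I/O and calls \fIfflush\fP after
each reply so that the client is not left waiting.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/*
 * Read lines from in and write them to out in upper case,
 * flushing after each one.  Returns 0 at end of input, or
 * -1 if a write fails.
 */
int
serve(FILE *in, FILE *out)
{
	char line[512];
	char *p;

	assert(in != NULL && out != NULL);
	while (fgets(line, sizeof(line), in) != NULL) {
		for (p = line; *p != '\0'; p++)
			*p = (char)toupper((unsigned char)*p);
		/* Flush so the reply is sent now, not when the buffer fills. */
		if (fputs(line, out) == EOF || fflush(out) == EOF)
			return -1;
	}
	return 0;
}
```

When run under \fIinetd\fP, the main routine of such a server
would simply call serve(stdin, stdout) and exit.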
                   1176: .PP
                   1177: One call which may be of interest to individuals writing
                   1178: servers under \fIinetd\fP is the \fIgetpeername\fP call,
                   1179: which returns the address of the peer (process) connected
                   1180: on the other end of the socket.  For example, to log the
                   1181: Internet address in ``dot notation'' (e.g., ``128.32.0.4'')
                   1182: of a client connected to a server under
                   1183: \fIinetd\fP, the following code might be used:
                   1184: .DS
                   1185: struct sockaddr_in name;
                   1186: int namelen = sizeof (name);
                   1187:  ...
                   1188: if (getpeername(0, (struct sockaddr *)&name, &namelen) < 0) {
                   1189:        syslog(LOG_ERR, "getpeername: %m");
                   1190:        exit(1);
                   1191: } else
                   1192:        syslog(LOG_INFO, "Connection from %s", inet_ntoa(name.sin_addr));
                   1193:  ...
                   1194: .DE
                   1195: While the \fIgetpeername\fP call is especially useful when
                   1196: writing programs to run with \fIinetd\fP, it can be used
                   1197: under other circumstances.  Be warned, however, that \fIgetpeername\fP will
                   1198: fail on UNIX domain sockets.
