Annotation of 43BSDReno/share/doc/ps1/07.ipctut/tutor.me, revision 1.1.1.1

1.1       root        1: .\" Copyright (c) 1986 The Regents of the University of California.
                      2: .\" All rights reserved.
                      3: .\"
                      4: .\" Redistribution and use in source and binary forms are permitted
                      5: .\" provided that the above copyright notice and this paragraph are
                      6: .\" duplicated in all such forms and that any documentation,
                      7: .\" advertising materials, and other materials related to such
                      8: .\" distribution and use acknowledge that the software was developed
                      9: .\" by the University of California, Berkeley.  The name of the
                     10: .\" University may not be used to endorse or promote products derived
                     11: .\" from this software without specific prior written permission.
                     12: .\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
                     13: .\" IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
                     14: .\" WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
                     15: .\"
                     16: .\"    @(#)tutor.me    6.6 (Berkeley) 3/7/89
                     17: .\"
                     18: .oh 'Introductory 4.3BSD IPC''PS1:7-%'
                     19: .eh 'PS1:7-%''Introductory 4.3BSD IPC'
                     20: .rs
                     21: .sp 2
                     22: .sz 14
                     23: .ft B
                     24: .ce 2
                     25: An Introductory 4.3BSD
                     26: Interprocess Communication Tutorial
                     27: .sz 10
                     28: .sp 2
                     29: .ce
                     30: .i "Stuart Sechrest"
                     31: .ft
                     32: .sp
                     33: .ce 4
                     34: Computer Science Research Group
                     35: Computer Science Division
                     36: Department of Electrical Engineering and Computer Science
                     37: University of California, Berkeley
                     38: .sp 2
                     39: .ce
                     40: .i ABSTRACT
                     41: .sp
                     42: .(c
                     43: .pp
                     44: Berkeley UNIX\(dg 4.3BSD offers several choices for interprocess communication.
                     45: To aid the programmer in developing programs that are composed of
                     46: cooperating
                     47: processes, the different choices are discussed and a series of example
                     48: programs is presented.  These programs
                     49: demonstrate in a simple way the use of pipes, socketpairs, and sockets,
                     50: and the use of datagram and stream communication.  The intent of this
                     51: document is to present a few simple example programs, not to describe the
                     52: networking system in full.
                     53: .)c
                     54: .sp 2
                     55: .(f
                     56: \(dg\|UNIX is a trademark of AT&T Bell Laboratories.
                     57: .)f
                     58: .b
                     59: .sh 1 "Goals"
                     60: .r
                     61: .pp
                     62: Facilities for interprocess communication (IPC) and networking
                     63: were a major addition to UNIX in the Berkeley UNIX 4.2BSD release.
                     64: These facilities required major additions and some changes
                     65: to the system interface.
                     66: The basic idea of this interface is to make IPC similar to file I/O.
                     67: In UNIX a process has a set of I/O descriptors, from which one reads
                     68: and to which one writes.
                     69: Descriptors may refer to normal files, to devices (including terminals),
                     70: or to communication channels.
                     71: The use of a descriptor has three phases: its creation,
                     72: its use for reading and writing, and its destruction.  By using descriptors
                     73: to write files, rather than simply naming the target file in the write
                     74: call, one gains a surprising amount of flexibility.  Often, the program that
                     75: creates a descriptor will be different from the program that uses the
                     76: descriptor.  For example, the shell can create a descriptor for the output
                     77: of the `ls'
                     78: command that will cause the listing to appear in a file rather than
                     79: on a terminal.
                     80: Pipes are another form of descriptor that has been used in UNIX
                     81: for some time.
                     82: Pipes allow one-way data transmission from one process
                     83: to another; the two processes and the pipe must be set up by a common
                     84: ancestor.
                     85: .pp
                     86: The use of descriptors is not the only communication interface
                     87: provided by UNIX.
                     88: The signal mechanism sends a tiny amount of information from one 
                     89: process to another.
                     90: The signaled process receives only the signal type,
                     91: not the identity of the sender,
                     92: and the number of possible signals is small.
                     93: The signal semantics limit the flexibility of the signaling mechanism
                     94: as a means of interprocess communication.
                     95: .pp
                     96: The identification of IPC with I/O is quite longstanding in UNIX and
                     97: has proved quite successful.  At first, however, IPC was limited to
                     98: processes communicating within a single machine.  With Berkeley UNIX
                     99: 4.2BSD this expanded to include IPC between machines.  This expansion
                    100: has necessitated some change in the way that descriptors are created.
                    101: Additionally, new possibilities for the meaning of read and write have
                    102: been admitted.  Originally the meanings, or semantics, of these terms
                    103: were fairly simple.  When you wrote something it was delivered.  When
                    104: you read something, you were blocked until the data arrived.
                    105: Other possibilities exist,
                    106: however.  One can write without full assurance of delivery if one can
                    107: check later to catch occasional failures.  Messages can be kept as
                    108: discrete units or merged into a stream. 
                    109: One can ask to read, but insist on not waiting if nothing is immediately
                    110: available.  These new possibilities are allowed in the Berkeley UNIX IPC
                    111: interface.  
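.pp
As a minimal sketch of the last possibility, reading without waiting,
the following C fragment marks the read end of an empty pipe as
non-blocking and then attempts a read.
The helper names set_nonblocking() and empty_read_errno() are hypothetical,
and the use of the POSIX flag O_NONBLOCK is an assumption about the
target system.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Put descriptor fd into non-blocking mode; returns 0, or -1 on error. */
int set_nonblocking(int fd)
{
    int flags;

    assert(fd >= 0);
    flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/*
 * Read from an empty pipe without waiting.  Returns the errno left by
 * the failed read (0 if the read did not fail), or -1 if setup failed.
 */
int empty_read_errno(void)
{
    int fds[2], err = 0;
    char c;

    if (pipe(fds) == -1)
        return -1;
    if (set_nonblocking(fds[0]) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (read(fds[0], &c, 1) == -1)
        err = errno;            /* nothing available, and no waiting */
    close(fds[0]);
    close(fds[1]);
    return err;
}
```
.pp
On systems that follow POSIX the value returned should be EAGAIN or
EWOULDBLOCK; a blocking read on the same pipe would instead wait for data.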
                    112: .pp
                    113: Thus Berkeley UNIX 4.3BSD offers several choices for IPC.
                    114: This paper presents simple examples that illustrate some of
                    115: the choices.
                    116: The reader is presumed to be familiar with the C programming language
                    117: [Kernighan & Ritchie 1978],
                    118: but not necessarily with the system calls of the UNIX system or with
                    119: processes and interprocess communication.
                    120: The paper reviews the notion of a process and the types of
                    121: communication that are supported by Berkeley UNIX 4.3BSD.
                    122: A series of examples is presented that create processes that communicate
                    123: with one another.  The programs show different ways of establishing
                    124: channels of communication.
                    125: Finally, the calls that actually transfer data are reviewed.
                    126: To clearly present how communication can take place,
                    127: the example programs have been cleared of anything that
                    128: might be construed as useful work.
                    129: They can, therefore, serve as models
                    130: for the programmer trying to construct programs that are composed of
                    131: cooperating processes.
                    132: .b
                    133: .sh 1 "Processes"
                    134: .pp
                    135: A \fIprogram\fP is both a sequence of statements and a rough way of referring 
                    136: to the computation that occurs when the compiled statements are run.
                    137: A \fIprocess\fP can be thought of as a single line of control in a program.
                    138: Most programs execute some statements, go through a few loops, branch in
                    139: various directions and then end.  These are single process programs.
                    140: Programs can also have a point where control splits into two independent lines,
                    141: an action called \fIforking.\fP
                    142: In UNIX these lines can never join again.  A call to the system routine 
                    143: \fIfork()\fP causes a process to split in this way.
                    144: The result of this call is that two independent processes will be
                    145: running, executing exactly the same code.
                    146: Memory values will be the same for all values set before the fork, but,
                    147: subsequently, each version will be able to change only the 
                    148: value of its own copy of each variable.
                    149: Initially, the only difference between the two will be the value returned by
                    150: \fIfork().\fP  The parent will receive the process id of the child;
                    151: the child will receive a zero.
                    152: Calls to \fIfork(),\fP
                    153: therefore, typically precede, or are included in, an if-statement.
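.pp
The if-statement that usually surrounds a fork can be sketched as follows.
This is an illustration only, not one of the tutorial's figures; the helper
name spawn_and_wait() is hypothetical, and waitpid() is used here as one
common way for a parent to collect its child.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork a child that exits at once with `code`.  The parent waits for
 * the child and returns the exit status it collected, or -1 on failure.
 */
int spawn_and_wait(int code)
{
    pid_t child;
    int status;

    assert(code >= 0 && code <= 255);   /* exit statuses are 8 bits wide */

    child = fork();
    if (child == -1)
        return -1;                      /* fork failed: no child exists */
    if (child == 0)
        _exit(code);                    /* child: fork() returned zero */

    /* parent: fork() returned the process id of the child */
    if (waitpid(child, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```
.pp
Both processes return from the same call to fork(); only the returned
value tells each one which branch to take.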
                    154: .pp
                    155: A process views the rest of the system through a private table of descriptors.
                    156: The descriptors can represent open files or sockets (sockets are communication
                    157: objects that will be discussed below).  Descriptors are referred to
                    158: by their index numbers in the table.  The first three descriptors are often
                    159: known by special names, \fIstdin, stdout\fP and \fIstderr\fP.
                    160: These are the standard input, output and error.
                    161: When a process forks, its descriptor table is copied to the child.
                    162: Thus, if the parent's standard input is being taken from a terminal
                    163: (devices are also treated as files in UNIX), the child's input will 
                    164: be taken from the
                    165: same terminal.  Whoever reads first will get the input.  If, before forking,
                    166: the parent changes its standard input so that it is reading from a
                    167: new file, the child will take its input from the new file.  It is
                    168: also possible to take input from a socket, rather than from a file.
                    169: .b
                    170: .sh 1 "Pipes"
                    171: .r
                    172: .pp
                    173: Most users of UNIX know that they can pipe the output of a 
                    174: program ``prog1'' to the input of another, ``prog2,'' by typing the command
                    175: \fI``prog1 | prog2.''\fP
                    176: This is called ``piping'' the output of one program
                    177: to another because the mechanism used to transfer the output is called a
                    178: pipe.
                    179: When the user types a command, the command is read by the shell, which
                    180: decides how to execute it.  If the command is simple, for example,
                    181: .i "``prog1,''"
                    182: the shell forks a process, which executes the program, prog1, and then dies.
                    183: The shell waits for this termination and then prompts for the next
                    184: command.
                    185: If the command is a compound command,
                    186: .i "``prog1 | prog2,''"
                    187: the shell creates two processes connected by a pipe. One process
                    188: runs the program, prog1, the other runs prog2.  The pipe is an I/O
                    189: mechanism with two ends, or sockets.  Data that is written into one socket
                    190: can be read from the other.  
                    191: .(z
                    192: .ft CW
                    193: .so pipe.c
                    194: .ft
                    195: .ce 1
                    196: Figure 1\ \ Use of a pipe
                    197: .)z
                    198: .pp
                    199: Since a program specifies its input and output only by the descriptor table
                    200: indices, which appear as variables or constants,
                    201: the input source and output destination can be changed without
                    202: changing the text of the program.
                    203: It is in this way that the shell is able to set up pipes.  Before executing
                    204: prog1, the process can close whatever is at \fIstdout\fP
                    205: and replace it with one
                    206: end of a pipe.  Similarly, the process that will execute prog2 can substitute
                    207: the opposite end of the pipe for 
                    208: \fIstdin.\fP
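.pp
The substitution described above might be sketched as follows, assuming
the POSIX call dup2() is available to copy one descriptor onto another.
The helper name capture_child_stdout() is hypothetical.
A real shell would go on to execute prog1 in the child; here the child
simply writes a message to descriptor 1.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run a child whose standard output has been replaced by the write end
 * of a pipe, and collect what it writes into buf (NUL-terminated).
 * Returns the number of bytes collected, or -1 on error.
 */
ssize_t capture_child_stdout(const char *msg, char *buf, size_t size)
{
    int fds[2];
    pid_t child;
    ssize_t n, total = 0;

    assert(msg != NULL && buf != NULL && size > 0);

    if (pipe(fds) == -1)
        return -1;
    child = fork();
    if (child == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (child == 0) {
        /* child: make descriptor 1 refer to the pipe */
        close(fds[0]);
        if (dup2(fds[1], STDOUT_FILENO) == -1)
            _exit(1);
        close(fds[1]);
        /* this code names only descriptor 1, not where it leads */
        if (write(STDOUT_FILENO, msg, strlen(msg)) == -1)
            _exit(1);
        _exit(0);
    }
    /* parent: read until the child closes its end or buf is full */
    close(fds[1]);
    while ((size_t)total < size - 1 &&
           (n = read(fds[0], buf + total, size - 1 - total)) > 0)
        total += n;
    buf[total] = '\0';
    close(fds[0]);
    waitpid(child, NULL, 0);
    return total;
}
```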
                    209: .pp
                    210: Let us now examine a program that creates a pipe for communication between
                    211: its child and itself (Figure 1).
                    212: A pipe is created by a parent process, which then forks.
                    213: When a process forks, the parent's descriptor table is copied into 
                    214: the child's.  
                    215: .pp
                    216: In Figure 1, the parent process makes a call to the system routine 
                    217: \fIpipe().\fP
                    218: This routine creates a pipe and places descriptors for the sockets
                    219: for the two ends of the pipe in the process's descriptor table.
                    220: \fIPipe()\fP
                    221: is passed an array into which it places the index numbers of the 
                    222: sockets it created.
                    223: The two ends are not equivalent.  The socket whose index is
                    224: returned in the low word of the array is opened for reading only,
                    225: while the socket in the high end is opened only for writing.
                    226: This corresponds to the fact that the standard input is the first
                    227: descriptor of a process's descriptor table and the standard output
                    228: is the second.  After creating the pipe, the parent creates the child 
                    229: with which it will share the pipe by calling \fIfork().\fP
                    230: Figure 2 illustrates the effect of a fork.
                    231: The parent process's descriptor table points to both ends of the pipe.
                    232: After the fork, both parent's and child's descriptor tables point to
                    233: the pipe.
                    234: The child can then use the pipe to send a message to the parent.
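.pp
The sequence just described (create the pipe, fork, let the child write
and the parent read) can be sketched as a single C function.
This is a sketch in the spirit of Figure 1, not the text of pipe.c;
the name child_to_parent() is hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Create a pipe, fork, and let the child send `msg` to the parent.
 * The parent stores what it reads in buf (NUL-terminated) and returns
 * the byte count, or -1 on error.
 */
ssize_t child_to_parent(const char *msg, char *buf, size_t size)
{
    int fds[2];             /* fds[0]: read end, fds[1]: write end */
    pid_t child;
    ssize_t n;

    assert(msg != NULL && buf != NULL && size > 1);

    if (pipe(fds) == -1)
        return -1;
    child = fork();
    if (child == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (child == 0) {
        /* child: writes only, so it closes the read end */
        close(fds[0]);
        if (write(fds[1], msg, strlen(msg)) == -1)
            _exit(1);
        _exit(0);
    }
    /* parent: reads only, so it closes the write end */
    close(fds[1]);
    n = read(fds[0], buf, size - 1);
    if (n >= 0)
        buf[n] = '\0';
    close(fds[0]);
    waitpid(child, NULL, 0);
    return n;
}
```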
                    235: .(z
                    236: -
                    237: .bl 5.8i
                    238: -
                    239: .\" pipe.grn goes here
                    240: .sp
                    241: .ce 1
                    242: Figure 2\ \ Sharing a pipe between parent and child
                    243: .)z
                    244: .pp
                    245: Just what is a pipe?
                    246: It is a one-way communication mechanism, with one end opened
                    247: for reading and the other end for writing.
                    248: Therefore, parent and child need to agree on which way to turn
                    249: the pipe, from parent to child or the other way around.
                    250: Using the same pipe for communication both from parent to child and 
                    251: from child to parent would be possible (since both processes have 
                    252: references to both ends), but very complicated.
                    253: If the parent and child are to have a two-way conversation,
                    254: the parent creates two pipes, one for use in each direction.
                    255: (In accordance with their plans, both parent and child in the example above
                    256: close the socket that they will not use.  It is not required that unused
                    257: descriptors be closed, but it is good practice.)
                    258: A pipe is also a \fIstream\fP communication mechanism; that
                    259: is, all messages sent through the pipe are placed in order
                    260: and reliably delivered.  When the reader asks for a certain
                    261: number of bytes from this
                    262: stream, it is given as many bytes as are available, up
                    263: to the amount of the request. Note that these bytes may have come from 
                    264: the same call to \fIwrite()\fR or from several calls to \fIwrite()\fR
                    265: which were concatenated.
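.pp
A small sketch makes the concatenation visible.
The hypothetical function read_merged_writes() writes two two-byte
messages into a pipe and then issues one read; both writes happen
before the read, so all four bytes are available to it.

```c
#include <assert.h>
#include <unistd.h>

/*
 * Write two short messages into a pipe, then issue a single read.
 * Returns the byte count of that one read, or -1 on error.
 */
ssize_t read_merged_writes(char *buf, size_t size)
{
    int fds[2];
    ssize_t n;

    assert(buf != NULL && size >= 4);

    if (pipe(fds) == -1)
        return -1;
    if (write(fds[1], "ab", 2) != 2 || write(fds[1], "cd", 2) != 2) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    n = read(fds[0], buf, size);    /* ask for up to `size` bytes */
    close(fds[0]);
    close(fds[1]);
    return n;
}
```
.pp
The single read is expected to return four bytes, with nothing in the
data to show where the first write ended and the second began.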
                    266: .b
                    267: .sh 1 "Socketpairs"
                    268: .r
                    269: .pp
                    270: Berkeley UNIX 4.3BSD provides a slight generalization of pipes.  A pipe is a
                    271: pair of connected sockets for one-way stream communication.  One may
                    272: obtain a pair of connected sockets for two-way stream communication
                    273: by calling the routine \fIsocketpair().\fP
                    274: The program in Figure 3 calls \fIsocketpair()\fP
                    275: to create such a connection.  The program uses the link for
                    276: communication in both directions.  Since socketpairs are
                    277: an extension of pipes, their use resembles that of pipes. 
                    278: Figure 4 illustrates the result of a fork following a call to 
                    279: \fIsocketpair().\fP
                    280: .pp
                    281: \fISocketpair()\fP
                    282: takes as
                    283: arguments a specification of a domain, a style of communication, and a
                    284: protocol.  
                    285: These are the parameters shown in the example.
                    286: Domains and protocols will be discussed in the next section.
                    287: Briefly,
                    288: a domain is a space of names that may be bound
                    289: to sockets and implies certain other conventions.
                    290: Currently, socketpairs have only been implemented for one
                    291: domain, called the UNIX domain.
                    292: The UNIX domain uses UNIX path names for naming sockets.  
                    293: It only allows communication
                    294: between sockets on the same machine.
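.pp
Two-way use of a socketpair might look like the following sketch.
It is not the program of Figure 3; the function name ask_child() and
its add-one reply rule are hypothetical and chosen only to show traffic
in both directions over one pair.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Send one byte to a forked child over a socketpair; the child adds one
 * and sends the result back over the same pair.  Returns the reply, or
 * -1 on error.
 */
int ask_child(unsigned char request)
{
    int sv[2];
    pid_t child;
    unsigned char c;
    int reply = -1;

    assert(request < 255);          /* keep request + 1 within one byte */

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;
    child = fork();
    if (child == -1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (child == 0) {
        /* child: reads and writes on its own end, sv[1] */
        close(sv[0]);
        if (read(sv[1], &c, 1) != 1)
            _exit(1);
        c = (unsigned char)(c + 1);
        if (write(sv[1], &c, 1) != 1)
            _exit(1);
        _exit(0);
    }
    /* parent: writes and then reads on the other end, sv[0] */
    close(sv[1]);
    if (write(sv[0], &request, 1) == 1 && read(sv[0], &c, 1) == 1)
        reply = c;
    close(sv[0]);
    waitpid(child, NULL, 0);
    return reply;
}
```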
                    295: .pp
                    296: Note that the header files 
                    297: .i "<sys/socket.h>"
                    298: and
                    299: .i "<sys/types.h>"
                    300: are required in this program.
                    301: The constants AF_UNIX and SOCK_STREAM are defined in 
                    302: .i "<sys/socket.h>,"
                    303: which in turn requires the file 
                    304: .i "<sys/types.h>"
                    305: for some of its definitions.
                    306: .(z
                    307: .ft CW
                    308: .so socketpair.c
                    309: .ft
                    310: .ce 1
                    311: Figure 3\ \ Use of a socketpair
                    312: .)z
                    313: .(z
                    314: -
                    315: .bl 5.8i
                    316: -
                    317: .\" socketpair.grn goes here
                    318: .sp
                    319: .ce 1
                    320: Figure 4\ \ Sharing a socketpair between parent and child
                    321: .)z
                    322: .b
                    323: .sh 1 "Domains and Protocols"
                    324: .r
                    325: .pp
                    326: Pipes and socketpairs are a simple solution for communicating between
                    327: a parent and child or between child processes.
                    328: What if we wanted to have processes that have no common ancestor
                    329: with whom to set up communication?
                    330: Neither standard UNIX pipes nor socketpairs are
                    331: the answer here, since both mechanisms require a common ancestor to
                    332: set up the communication.
                    333: We would like to have two processes separately create sockets
                    334: and then have messages sent between them.  This is often the
                    335: case when providing or using a service in the system.  This is
                    336: also the case when the communicating processes are on separate machines.
                    337: In Berkeley UNIX 4.3BSD one can create individual sockets, give them names and
                    338: send messages between them.
                    339: .pp
                    340: Sockets created by different programs use names to refer to one another;
                    341: names generally must be translated into addresses for use.
                    342: The space from which an address is drawn is referred to as a
                    343: .i domain.
                    344: There are several domains for sockets.
                    345: Two that will be used in the examples here are the UNIX domain (or AF_UNIX,
                    346: for Address Family UNIX) and the Internet domain (or AF_INET).
                    347: UNIX domain IPC is an experimental facility in 4.2BSD and 4.3BSD.
                    348: In the UNIX domain, a socket is given a path name within the file system
                    349: name space.
                    350: A file system node is created for the socket and other processes may 
                    351: then refer to the socket by giving the proper pathname.
                    352: UNIX domain names, therefore, allow communication between any two processes
                    353: that work in the same file system.
                    354: The Internet domain is the UNIX implementation of the DARPA Internet
                    355: standard protocols IP/TCP/UDP.
                    356: Addresses in the Internet domain consist of a machine network address
                    357: and an identifying number, called a port.
                    358: Internet domain names allow communication between machines.
                    359: .pp
                    360: Communication follows some particular ``style.''
                    361: Currently, communication is either through a \fIstream\fP
                    362: or by \fIdatagram.\fP
                    363: Stream communication implies several things.  Communication takes
                    364: place across a connection between two sockets.  The communication
                    365: is reliable, error-free, and, as in pipes, no message boundaries are
                    366: kept. Reading from a stream may result in reading the data sent from
                    367: one or several calls to \fIwrite()\fP
                    368: or only part of the data from a single call, if there is not enough room
                    369: for the entire message, or if not all the data from a large message
                    370: has been transferred.
                    371: The protocol implementing such a style will retransmit messages
                    372: received with errors. It will also return error messages if one tries to
                    373: send a message after the connection has been broken.
                    374: Datagram communication does not use connections.  Each message is
                    375: addressed individually.  If the address is correct, it will generally
                    376: be received, although this is not guaranteed.  Often datagrams are
                    377: used for requests that require a response from the 
                    378: recipient.  If no response
                    379: arrives in a reasonable amount of time, the request is repeated.
                    380: The individual datagrams will be kept separate when they are read, that
                    381: is, message boundaries are preserved.
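.pp
The preservation of message boundaries can be observed with a short
sketch.
For brevity it assumes that socketpair() accepts SOCK_DGRAM in the UNIX
domain, so that no names are needed; the function name dgram_sizes()
is hypothetical.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Send two two-byte datagrams, then receive twice into a buffer large
 * enough for both.  Stores the size of each receive in *first and
 * *second.  Returns 0, or -1 on error.
 */
int dgram_sizes(ssize_t *first, ssize_t *second)
{
    int sv[2];
    char buf[16];
    int ok = -1;

    assert(first != NULL && second != NULL);

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) == -1)
        return -1;
    if (send(sv[0], "ab", 2, 0) == 2 && send(sv[0], "cd", 2, 0) == 2) {
        /* each receive yields one datagram, never a merged pair */
        *first = recv(sv[1], buf, sizeof(buf), 0);
        *second = recv(sv[1], buf, sizeof(buf), 0);
        ok = 0;
    }
    close(sv[0]);
    close(sv[1]);
    return ok;
}
```
.pp
Each receive is expected to return two bytes, even though the buffer
has room for all four; a stream would have been free to deliver them
together.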
                    382: .pp
                    383: The difference in performance between the two styles of communication is 
                    384: generally less important than the difference in semantics.  The
                    385: performance gain that one might find in using datagrams must be weighed
                    386: against the increased complexity of the program, which must now concern
                    387: itself with lost or out of order messages.  If lost messages may simply be 
                    388: ignored, the quantity of traffic may be a consideration. The expense
                    389: of setting up a connection is best justified by frequent use of the connection.
                    390: Since the performance of a protocol changes as it is tuned for different
                    391: situations, it is best to seek the most up-to-date information when
                    392: making choices for a program in which performance is crucial.
                    393: .pp
                    394: A protocol is a set of rules, data formats and conventions that regulate the
                    395: transfer of data between participants in the communication.
                    396: In general, there is one protocol for each socket type (stream,
                    397: datagram, etc.) within each domain.
                    398: The code that implements a protocol
                    399: sets up connections and
                    400: transfers data between sockets,
                    401: perhaps sending the data across a network.
                    402: This code also keeps track of the names that are bound to sockets.
                    403: It is possible for several protocols, differing only in low level
                    404: details, to implement the same style of communication within
                    405: a particular domain.  Although it is possible to select
                    406: which protocol should be used, for nearly all uses it is sufficient to
                    407: request the default protocol.  This has been done in all of the example
                    408: programs.
                    409: .pp
                    410: One specifies the domain, style and protocol of a socket when
                    411: it is created.  For example, in Figure 5a the call to \fIsocket()\fP
                    412: causes the creation of a datagram socket with the default protocol 
                    413: in the UNIX domain.
                    414: .b
                    415: .sh 1 "Datagrams in the UNIX Domain"
                    416: .r
                    417: .(z
                    418: .ft CW
                    419: .so udgramread.c
                    420: .ft
                    421: .ce 1
                    422: Figure 5a\ \ Reading UNIX domain datagrams
                    423: .)z
                    424: .pp
                    425: Let us now look at two programs that create sockets separately.
                    426: The programs in Figures 5a and 5b use datagram communication
                    427: rather than a stream.  
                    428: The structure used to name UNIX domain sockets is defined
                    429: in the file \fI<sys/un.h>.\fP
                    430: The definition has also been included in the example for clarity.
                    431: .pp
                    432: Each program creates a socket with a call to \fIsocket().\fP
                    433: These sockets are in the UNIX domain.
                    434: Once a name has been decided upon it is attached to a socket by the
                    435: system call \fIbind().\fP
                    436: The program in Figure 5a uses the name ``socket'',
                    437: which it binds to its socket.
                    438: This name will appear in the working directory of the program.
                    439: The routines in Figure 5b use its
                    440: socket only for sending messages.  It does not create a name for
                    441: the socket because no other process has to refer to it.  
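.pp
The division of labor between the two programs can be compressed into
one function for illustration.
The sketch below is not the text of Figures 5a and 5b; the name
unix_dgram_roundtrip() is hypothetical, and the path passed in is assumed
to be unused and in a writable directory.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Bind a datagram socket to `path`, send it `msg` from a second, unnamed
 * socket, and receive the message into buf.  Returns the number of
 * bytes received, or -1 on error.
 */
ssize_t unix_dgram_roundtrip(const char *path, const char *msg,
                             char *buf, size_t size)
{
    struct sockaddr_un name;
    int rsock, ssock;
    ssize_t n = -1;

    assert(path != NULL && strlen(path) < sizeof(name.sun_path));
    assert(msg != NULL && buf != NULL);

    memset(&name, 0, sizeof(name));
    name.sun_family = AF_UNIX;
    strcpy(name.sun_path, path);

    rsock = socket(AF_UNIX, SOCK_DGRAM, 0);     /* the reader's socket */
    ssock = socket(AF_UNIX, SOCK_DGRAM, 0);     /* the sender's socket */
    if (rsock == -1 || ssock == -1)
        goto done;

    /* only the reader needs a name; the sender stays unbound */
    if (bind(rsock, (struct sockaddr *)&name, sizeof(name)) == -1)
        goto done;
    if (sendto(ssock, msg, strlen(msg), 0,
               (struct sockaddr *)&name, sizeof(name)) != -1)
        n = recv(rsock, buf, size, 0);
    unlink(path);                               /* remove the name */
done:
    if (rsock != -1)
        close(rsock);
    if (ssock != -1)
        close(ssock);
    return n;
}
```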
                    442: .(z
                    443: .ft CW
                    444: .so udgramsend.c
                    445: .ft
                    446: .ce 1
                    447: Figure 5b\ \ Sending a UNIX domain datagram
                    448: .)z
                    449: .pp
                    450: Names in the UNIX domain are path names.  Like file path names they may
                    451: be either absolute (e.g. ``/dev/imaginary'') or relative (e.g. ``socket'').
                    452: Because these names are used to allow processes to rendezvous, relative
                    453: path names can pose difficulties and should be used with care.
                    454: When a name is bound into the name space, a file (inode) is allocated in the
                    455: file system.  If
                    456: the inode is not deallocated, the name will continue to exist even after
                    457: the bound socket is closed.  This can cause subsequent runs of a program
                    458: to find that a name is unavailable, and can cause 
                    459: directories to fill up with these
                    460: objects.  The names are removed by calling \fIunlink()\fP or using
                    461: the \fIrm\fP\|(1) command.
                    462: Names in the UNIX domain are only used for rendezvous.  They are not used
                    463: for message delivery once a connection is established.  Therefore, in
                    464: contrast with the Internet domain, unbound sockets need not be (and are
                    465: not) automatically given addresses when they are connected.  
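.pp
The persistence of a name after its socket is closed can be demonstrated
with a sketch such as the following.
The function names bound_socket() and rebind_errno() are hypothetical,
and the path is assumed to be unused when the function is called.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Create a UNIX domain datagram socket bound to path; -1 on error. */
static int bound_socket(const char *path)
{
    struct sockaddr_un name;
    int s, saved;

    assert(path != NULL && strlen(path) < sizeof(name.sun_path));

    s = socket(AF_UNIX, SOCK_DGRAM, 0);
    if (s == -1)
        return -1;
    memset(&name, 0, sizeof(name));
    name.sun_family = AF_UNIX;
    strcpy(name.sun_path, path);
    if (bind(s, (struct sockaddr *)&name, sizeof(name)) == -1) {
        saved = errno;          /* keep bind's errno across close() */
        close(s);
        errno = saved;
        return -1;
    }
    return s;
}

/*
 * Bind path, close the socket, then try to bind the same path again.
 * Returns the errno of the second bind (0 if it succeeded), or -1 if
 * the first bind failed.  The name is unlinked before returning.
 */
int rebind_errno(const char *path)
{
    int s, err = 0;

    s = bound_socket(path);
    if (s == -1)
        return -1;
    close(s);                   /* the socket is gone; the name is not */

    s = bound_socket(path);
    if (s == -1)
        err = errno;
    else
        close(s);
    unlink(path);               /* now the name is gone too */
    return err;
}
```
.pp
On current systems the second bind is expected to fail with EADDRINUSE.
Because the function unlinks the name before returning, it can be
called again with the same path.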
                    466: .pp
                    467: There is no established means of communicating names to interested
                    468: parties.  In the example, the program in Figure 5b gets the
                    469: name of the socket to which it will send its message through its
                    470: command line arguments.  Once a line of communication has been created,
                    471: one can send the names of additional, perhaps new, sockets over the link.
                    472: Facilities will have to be built that will make the distribution of
                    473: names less of a problem than it now is.
                    474: .b
                    475: .sh 1 "Datagrams in the Internet Domain"
                    476: .r
                    477: .(z
                    478: .ft CW
                    479: .so dgramread.c
                    480: .ft
                    481: .ce 1
                    482: Figure 6a\ \ Reading Internet domain datagrams
                    483: .)z
                    484: .pp
                    485: The examples in Figures 6a and 6b are very close to the previous example
                    486: except that the socket is in the Internet domain.
                    487: The structure of Internet domain addresses is defined in the file
                    488: \fI<netinet/in.h>\fP.
                    489: Internet addresses specify a host address (a 32-bit number)
                    490: and a delivery slot, or port, on that
                    491: machine.  These ports are managed by the system routines that implement 
                    492: a particular protocol.
                    493: Unlike UNIX domain names, Internet socket names are not entered into 
                    494: the file system and, therefore,
                    495: they do not have to be unlinked after the socket has been closed.
                    496: When a message must be sent between machines it is sent to
                    497: the protocol routine on the destination machine, which interprets the
                    498: address to determine to which socket the message should be delivered.
                    499: Several different protocols may be active on 
                    500: the same machine, but, in general, they will not communicate with one another.
                    501: As a result, different protocols are allowed to use the same port numbers.
                    502: Thus, implicitly, an Internet address is a triple including a protocol as
                    503: well as the port and machine address.
                    504: An \fIassociation\fP is a temporary or permanent specification
                    505: of a pair of communicating sockets.
                    506: An association is thus identified by the tuple
                    507: <\fIprotocol, local machine address, local port,
                    508: remote machine address, remote port\fP>.
                    509: An association may be transient when using datagram sockets;
                    510: the association actually exists during a \fIsend\fP operation.
                    511: .(z
                    512: .ft CW
                    513: .so dgramsend.c
                    514: .ft
                    515: .ce 1
                    516: Figure 6b\ \ Sending an Internet domain datagram
                    517: .)z
                    518: .pp
                    519: The protocol for a socket is chosen when the socket is created.  The 
                    520: local machine address for a socket can be any valid network address of the
                    521: machine, if it has more than one, or it can be the wildcard value
                    522: INADDR_ANY.
                    523: The wildcard value is used in the program in Figure 6a.
                    524: If a machine has several network addresses, it is likely
                    525: that messages sent to any of the addresses should be deliverable to
                    526: a socket.  This will be the case if the wildcard value has been chosen.
                    527: Note that even if the wildcard value is chosen, a program sending messages
                    528: to the named socket must specify a valid network address.  One can be willing
                    529: to receive from ``anywhere,'' but one cannot send a message ``anywhere.''
                    530: The program in Figure 6b is given the destination host name as a command
                    531: line argument.
                    532: To determine a network address to which it can send the message, it looks
                    533: up
                    534: the host address by the call to \fIgethostbyname()\fP.
                    535: The returned structure includes the host's network address,
                    536: which is copied into the structure specifying the
                    537: destination of the message.
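.pp
As a rough sketch of the lookup just described, the fragment below fills
in a destination address from a host name.  It is written in present-day
C rather than the style of the figures, and the helper name
\fIfill_destination()\fP is hypothetical.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Fill in *dest for the given host and port (host byte order).
 * Returns 0 on success, -1 if the host cannot be resolved to an
 * Internet address.  (fill_destination is a hypothetical helper.)
 */
int
fill_destination(struct sockaddr_in *dest, const char *host,
    unsigned short port)
{
    struct hostent *hp;

    hp = gethostbyname(host);
    if (hp == NULL || hp->h_addrtype != AF_INET ||
        hp->h_length != (int)sizeof(dest->sin_addr))
        return -1;
    memset(dest, 0, sizeof(*dest));
    dest->sin_family = AF_INET;
    /* Copy the host's network address into the destination. */
    memcpy(&dest->sin_addr, hp->h_addr_list[0], hp->h_length);
    dest->sin_port = htons(port);
    return 0;
}
```

.pp
The filled-in structure can then be handed to \fIsendto()\fP or
\fIconnect()\fP.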
                    538: .pp
                    539: The port number can be thought of as the number of a mailbox, into
                    540: which the protocol places one's messages.  Certain daemons, offering
                    541: certain advertised services, have reserved
                    542: or ``well-known'' port numbers.  These fall in the range
                    543: from 1 to 1023.  Higher numbers are available to general users.
                    544: Only servers need to ask for a particular number.
                    545: The system will assign an unused port number when an address
                    546: is bound to a socket.
                    547: This may happen when an explicit \fIbind\fP
                    548: call is made with a port number of 0, or
                    549: when a \fIconnect\fP or \fIsend\fP
                    550: is performed on an unbound socket.
                    551: Note that port numbers are not automatically reported back to the user.
                    552: After calling \fIbind(),\fP asking for port 0, one may call 
                    553: \fIgetsockname()\fP to discover what port was actually assigned. 
                    554: The routine \fIgetsockname()\fP
                    555: will not work for names in the UNIX domain.
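.pp
The sequence of binding to port 0 and then asking which port was assigned
might look like the sketch below.  The helper name \fIbind_any_port()\fP is
hypothetical, and the code uses the present-day \fIsocklen_t\fP type where
the original programs use an \fIint\fP.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create an Internet datagram socket, let the system choose the port,
 * and report the choice in host byte order through *port.
 * Returns the descriptor, or -1 on error.  (Hypothetical helper.)
 */
int
bind_any_port(unsigned short *port)
{
    struct sockaddr_in name;
    socklen_t length = sizeof(name);
    int sock;

    sock = socket(AF_INET, SOCK_DGRAM, 0);
    if (sock < 0)
        return -1;
    memset(&name, 0, sizeof(name));
    name.sin_family = AF_INET;
    name.sin_addr.s_addr = htonl(INADDR_ANY);
    name.sin_port = htons(0);   /* 0 asks the system to pick a port */
    if (bind(sock, (struct sockaddr *)&name, sizeof(name)) < 0 ||
        getsockname(sock, (struct sockaddr *)&name, &length) < 0) {
        close(sock);
        return -1;
    }
    *port = ntohs(name.sin_port);
    return sock;
}
```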
                    556: .pp
                    557: The format of the socket address is specified in part by standards within the
                    558: Internet domain.  The specification includes the order of the bytes in
                    559: the address.  Because machines differ in the internal representation
                    560: they ordinarily use
                    561: to represent integers, printing out the port number as returned by 
                    562: \fIgetsockname()\fP may result in a misinterpretation.  To
                    563: print out the number, it is necessary to use the routine \fIntohs()\fP
                    564: (for \fInetwork to host: short\fP) to convert the number from the
                    565: network representation to the host's representation.  On some machines,
                    566: such as 68000-based machines, this is a null operation.  On others,
                    567: such as VAXes, this results in a swapping of bytes.  Another routine
                    568: exists to convert a short integer from the host format to the network format,
                    569: called \fIhtons()\fP; similar routines exist for long integers.
                    570: For further information, refer to the
                    571: entry for \fIbyteorder\fP in section 3 of the manual.
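.pp
A small sketch may make the byte-order point concrete.  The hypothetical
helper \fIport_to_wire()\fP below copies out the two bytes of a port number
as they would appear in a socket address; it assumes a 16-bit
\fIunsigned short\fP.

```c
#include <arpa/inet.h>
#include <string.h>

/*
 * Copy the network representation of a port number into out[0..1].
 * After htons() the most significant byte comes first in memory,
 * whatever the host's own byte order.  (Hypothetical helper.)
 */
void
port_to_wire(unsigned short port, unsigned char out[2])
{
    unsigned short net = htons(port);

    memcpy(out, &net, 2);
}
```

.pp
Applying \fIntohs()\fP to a value produced by \fIhtons()\fP yields the
original number on any host, which is why the two are always used in pairs.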
                    572: .b
                    573: .sh 1 "Connections"
                    574: .r
                    575: .pp
                    576: To send data between stream sockets (having communication style SOCK_STREAM),
                    577: the sockets must be connected.
                    578: Figures 7a and 7b show two programs that create such a connection.
                    579: The program in Figure 7a is relatively simple.
                    580: To initiate a connection, this program simply creates
                    581: a stream socket, then calls \fIconnect()\fP,
                    582: specifying the address of the socket to which
                    583: it wishes its socket connected.  Provided that the target socket exists and
                    584: is prepared to handle a connection, the connection will be completed,
                    585: and the program can begin to send
                    586: messages.  Messages will be delivered in order without message
                    587: boundaries, as with pipes.  The connection is destroyed when either
                    588: socket is closed (or soon thereafter).  If a process persists 
                    589: in sending messages after the connection is closed, a SIGPIPE signal 
                    590: is sent to the process by the operating system.  Unless explicit action
                    591: is taken to handle the signal (see the manual page for \fIsignal\fP
                    592: or \fIsigvec\fP),
                    593: the process will terminate and the shell
                    594: will print the message ``broken pipe.'' 
                    595: .(z
                    596: .ft CW
                    597: .so streamwrite.c
                    598: .ft
                    599: .ce 1
                    600: Figure 7a\ \ Initiating an Internet domain stream connection
                    601: .)z
                    602: .(z
                    603: .ft CW
                    604: .so streamread.c
                    605: .ft
                    606: .ce 1
                    607: Figure 7b\ \ Accepting an Internet domain stream connection
                    608: .sp 2
                    609: .ft CW
                    610: .so strchkread.c
                    611: .ft
                    612: .ce 1
                    613: Figure 7c\ \ Using select() to check for pending connections
                    614: .)z
                    615: .(z
                    616: -
                    617: .bl 5.8i
                    618: -
                    619: .\" accept.grn goes here
                    620: .sp
                    621: .ce 1
                    622: Figure 8\ \ Establishing a stream connection
                    623: .)z
                    624: .pp
                    625: Forming a connection is asymmetrical; one process, such as the
                    626: program in Figure 7a, requests a connection with a particular socket,
                    627: the other process accepts connection requests.
                    628: Before a connection can be accepted a socket must be created and an address
                    629: bound to it.  This
                    630: situation is illustrated in the top half of Figure 8.  Process 2
                    631: has created a socket and bound a port number to it.  Process 1 has created an
                    632: unnamed socket.
                    633: The address bound to Process 2's socket is then made known to Process 1 and,
                    634: perhaps, to several other potential communicants as well.
                    635: If there are several possible communicants,
                    636: this one socket might receive several requests for connections.
                    637: As a result, a new socket is created for each connection.  This new socket
                    638: is the endpoint for communication within this process for this connection.
                    639: A connection may be destroyed by closing the corresponding socket.
                    640: .pp
                    641: The program in Figure 7b is a rather trivial example of a server.  It 
                    642: creates a socket to which it binds a name, which it then advertises.
                    643: (In this case it prints out the port number.)  The program then calls
                    644: \fIlisten()\fP for this socket.  
                    645: Since several clients may attempt to connect more or less
                    646: simultaneously, a queue of pending connections is maintained in the system
                    647: address space.  \fIListen()\fP
                    648: marks the socket as willing to accept connections and initializes the queue.
                    649: When a connection is requested, it is listed in the queue.  If the
                    650: queue is full, an error status may be returned to the requester.
                    651: The maximum length of this queue is specified by the second argument of
                    652: \fIlisten()\fP; the maximum length is limited by the system.  
                    653: Once the listen call has been completed, the program enters
                    654: an infinite loop.  On each pass through the loop, a new connection is
                    655: accepted and removed from the queue, and, hence, a new socket for the 
                    656: connection is created.  The bottom half of Figure 8 shows the result of
                    657: Process 1 connecting with the named socket of Process 2, and Process 2
                    658: accepting the connection.  After the connection is created, the
                    659: service, in this case printing out the messages, is performed and the
                    660: connection socket closed.  The \fIaccept()\fP
                    661: call will take a pending connection
                    662: request from the queue if one is available, or block waiting for a request.
                    663: Messages are read from the connection socket.
                    664: Reads from an active connection will normally block until data is available.
                    665: The number of bytes read is returned.  When a connection is destroyed,
                    666: the read call returns immediately.  The number of bytes returned will
                    667: be zero.
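.pp
The steps described in this paragraph can be condensed into the sketch
below.  It is not the program of Figure 7b: the helper names
\fImake_listener()\fP and \fIserve_one()\fP are hypothetical, the listener
is bound to the loopback address only, and the queue length of 5 is an
arbitrary choice.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Create a listening stream socket on the loopback address with a
 * system-chosen port, reported in host order through *port.
 * Returns the descriptor, or -1 on error.  (Hypothetical helper.)
 */
int
make_listener(unsigned short *port)
{
    struct sockaddr_in server;
    socklen_t length = sizeof(server);
    int sock;

    sock = socket(AF_INET, SOCK_STREAM, 0);
    if (sock < 0)
        return -1;
    memset(&server, 0, sizeof(server));
    server.sin_family = AF_INET;
    server.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    server.sin_port = htons(0);
    if (bind(sock, (struct sockaddr *)&server, sizeof(server)) < 0 ||
        getsockname(sock, (struct sockaddr *)&server, &length) < 0 ||
        listen(sock, 5) < 0) {  /* queue up to 5 pending requests */
        close(sock);
        return -1;
    }
    *port = ntohs(server.sin_port);
    return sock;
}

/*
 * Accept one connection and read from it until the buffer is full,
 * a read fails, or the peer closes the connection, which read()
 * reports by returning 0.  Returns the number of bytes stored in
 * buf, or -1 if accept() fails.  (Hypothetical helper.)
 */
long
serve_one(int listener, char *buf, size_t size)
{
    size_t total = 0;
    ssize_t cc;
    int msgsock;

    msgsock = accept(listener, (struct sockaddr *)0, (socklen_t *)0);
    if (msgsock < 0)
        return -1;
    while (total < size &&
        (cc = read(msgsock, buf + total, size - total)) > 0)
        total += (size_t)cc;
    close(msgsock);             /* destroys this connection only */
    return (long)total;
}
```

.pp
A server would call \fIserve_one()\fP repeatedly on the same listening
socket; each call creates and closes its own connection socket.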
                    668: .pp
                    669: The program in Figure 7c is a slight variation on the server in Figure 7b.
                    670: It avoids blocking when there are no pending connection requests by 
                    671: calling \fIselect()\fP
                    672: to check for pending requests before calling \fIaccept().\fP
                    673: This strategy is useful when connections may be received
                    674: on more than one socket, or when data may arrive on other connected
                    675: sockets before another connection request.
                    676: .pp
                    677: The programs in Figures 9a and 9b use stream communication
                    678: in the UNIX domain.  Streams in the UNIX domain can be used for this sort
                    679: of program in exactly the same way as Internet domain streams, except for
                    680: the form of the names and the restriction of the connections to a single
                    681: file system.  There are some differences, however, in the functionality of 
                    682: streams in the two domains, notably in the handling of 
                    683: \fIout-of-band\fP data (discussed briefly below).  These differences
                    684: are beyond the scope of this paper.
                    685: .(z
                    686: .ft CW
                    687: .so ustreamwrite.c
                    688: .ft
                    689: .ce 1
                    690: Figure 9a\ \ Initiating a UNIX domain stream connection
                    691: .sp 2
                    692: .ft CW
                    693: .so ustreamread.c
                    694: .ft
                    695: .ce 1
                    696: Figure 9b\ \ Accepting a UNIX domain stream connection
                    697: .)z
                    698: .b
                    699: .sh 1 "Reads, Writes, Recvs, etc."
                    700: .r
                    701: .pp
                    702: UNIX 4.3BSD has several system calls for reading and writing information.
                    703: The simplest calls are \fIread()\fP and \fIwrite().\fP \fIWrite()\fP
                    704: takes as arguments the index of a descriptor, a pointer to a buffer 
                    705: containing the data and the size of the data.
                    706: The descriptor may indicate either a file or a connected socket.  
                    707: ``Connected'' can mean either a connected stream socket (as described
                    708: in Section 8) or a datagram socket for which a \fIconnect()\fP
                    709: call has provided a default destination (see the \fIconnect()\fP manual page).
                    710: \fIRead()\fP also takes a descriptor that indicates either a file or a socket.
                    711: \fIWrite()\fP requires a connected socket since no destination is 
                    712: specified in the parameters of the system call.
                    713: \fIRead()\fP can be used for either a connected or an unconnected socket.
                    714: These calls are, therefore, quite flexible and may be used to
                    715: write applications that require no assumptions about the source of
                    716: their input or the destination of their output.
                    717: There are variations on \fIread()\fP and \fIwrite()\fP
                    718: that allow the source and destination of the input and output to use
                    719: several separate buffers, while retaining the flexibility to handle
                    720: both files and sockets.  These are \fIreadv()\fP and \fIwritev(),\fP
                    721: for read and write \fIvector.\fP
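.pp
A minimal sketch of the vector calls follows.  The helper names
\fIwrite_two()\fP and \fIread_two()\fP are hypothetical, and the split
into a ``header'' and a ``body'' is only an example of why a program
might keep its data in separate buffers.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>

/*
 * Write a header and a body with a single call.  Returns the total
 * number of bytes written, or -1 on error.  (Hypothetical helper.)
 */
ssize_t
write_two(int descriptor, const char *header, const char *body)
{
    struct iovec iov[2];

    iov[0].iov_base = (void *)header;
    iov[0].iov_len = strlen(header);
    iov[1].iov_base = (void *)body;
    iov[1].iov_len = strlen(body);
    return writev(descriptor, iov, 2);
}

/*
 * Read into two buffers with a single call.  The first buffer is
 * filled before any data is placed in the second.  Returns the total
 * number of bytes read, or -1 on error.  (Hypothetical helper.)
 */
ssize_t
read_two(int descriptor, char *first, size_t firstlen,
    char *second, size_t secondlen)
{
    struct iovec iov[2];

    iov[0].iov_base = first;
    iov[0].iov_len = firstlen;
    iov[1].iov_base = second;
    iov[1].iov_len = secondlen;
    return readv(descriptor, iov, 2);
}
```

.pp
Because these calls take an ordinary descriptor, the same two helpers work
on a pipe, a file, or a connected socket.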
                    722: .pp 
                    723: It is sometimes necessary to send high priority data over a
                    724: connection that may have unread low priority data at the
                    725: other end.  For example, a user interface process may be interpreting
                    726: commands and sending them on to another process through a stream connection.
                    727: The user interface may have filled the stream with as yet unprocessed 
                    728: requests when the user types
                    729: a command to cancel all outstanding requests.
                    730: Rather than have the high priority data wait
                    731: to be processed after the low priority data, it is possible to
                    732: send it as \fIout-of-band\fP
                    733: (OOB) data.  The notification of pending OOB data results in the generation of
                    734: a SIGURG signal, if this signal has been enabled (see the manual
                    735: page for \fIsignal\fP or \fIsigvec\fP).
                    736: See [Leffler et al. 1986] for a more complete description of the OOB mechanism.
                    737: Two calls similar to \fIread\fP and \fIwrite\fP
                    738: allow options, including sending
                    739: and receiving OOB information; these are \fIsend()\fP
                    740: and \fIrecv().\fP
                    741: These calls are used only with sockets; specifying a descriptor for a file will
                    742: result in the return of an error status.  These calls also allow
                    743: \fIpeeking\fP at data in a stream.
                    744: That is, they allow a process to read data without removing the data from
                    745: the stream.  One use of this facility is to read ahead in a stream
                    746: to determine the size of the next item to be read.
                    747: When not using these options, these calls have the same functions as 
                    748: \fIread()\fP and \fIwrite().\fP
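.pp
Peeking can be sketched in a few lines.  The helper name
\fIpeek_byte()\fP is hypothetical; a program that uses the first byte of
each item as a length would be applying its own convention, not anything
the system provides.

```c
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Look at the first byte waiting on a connected socket without
 * removing it from the stream.  Returns the byte (0..255), or -1 if
 * no byte could be read.  (peek_byte is a hypothetical helper.)
 */
int
peek_byte(int sock)
{
    unsigned char c;

    if (recv(sock, &c, 1, MSG_PEEK) != 1)
        return -1;
    return c;
}
```

.pp
A later \fIrecv()\fP or \fIread()\fP on the same socket returns the peeked
byte again, followed by the rest of the data.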
                    749: .pp
                    750: To send datagrams, one must be allowed to specify the destination.
                    751: The call \fIsendto()\fP
                    752: takes a destination address as an argument and is therefore used for
                    753: sending datagrams.  The call \fIrecvfrom()\fP
                    754: is often used to read datagrams, since this call returns the address
                    755: of the sender, if it is available, along with the data.
                    756: If the identity of the sender does not matter, one may use \fIread()\fP
                    757: or \fIrecv().\fP
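.pp
As an illustration of why the sender's address is useful, the sketch
below returns each datagram to whoever sent it.  The helper name
\fIecho_datagram()\fP and the 1024-byte buffer are assumptions made for
the example, and the code assumes an Internet domain socket.

```c
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Receive one datagram and send the same bytes back to its sender,
 * using the address filled in by recvfrom().  Returns the datagram
 * length, or -1 on error.  (echo_datagram is a hypothetical helper.)
 */
ssize_t
echo_datagram(int sock)
{
    char buf[1024];
    struct sockaddr_in from;
    socklen_t fromlen = sizeof(from);
    ssize_t cc;

    cc = recvfrom(sock, buf, sizeof(buf), 0,
        (struct sockaddr *)&from, &fromlen);
    if (cc < 0)
        return -1;
    if (sendto(sock, buf, (size_t)cc, 0,
        (struct sockaddr *)&from, fromlen) != cc)
        return -1;
    return cc;
}
```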
                    758: .pp
                    759: Finally, two calls allow the sending and
                    760: receiving of messages from multiple buffers, when the address of the
                    761: recipient must be specified.  These are \fIsendmsg()\fP and 
                    762: \fIrecvmsg().\fP
                    763: These calls are actually quite general and have other uses,
                    764: including, in the UNIX domain, the transmission of a file descriptor from one
                    765: process to another.
                    766: .pp
                    767: The various options for reading and writing are shown in Figure 10,
                    768: together with their parameters.  The parameters for each system call
                    769: reflect the differences in function of the different calls.
                    770: In the examples given in this paper, the calls \fIread()\fP and 
                    771: \fIwrite()\fP have been used whenever possible.
                    772: .(z
                    773: .ft CW
                    774:        /*
                    775:         * The variable descriptor may be the descriptor of either a file
                    776:         * or of a socket.
                    777:         */
                    778:        cc = read(descriptor, buf, nbytes)
                    779:        int cc, descriptor; char *buf; int nbytes;
                    780: 
                    781:        /*
                    782:         * An iovec array can describe several buffers to be filled in order.
                    783:         */
                    784:        cc = readv(descriptor, iov, iovcnt)
                    785:        int cc, descriptor; struct iovec *iov; int iovcnt;
                    786: 
                    787:        cc = write(descriptor, buf, nbytes)
                    788:        int cc, descriptor; char *buf; int nbytes;
                    789: 
                    790:        cc = writev(descriptor, iov, iovcnt)
                    791:        int cc, descriptor; struct iovec *iov; int iovcnt;
                    792: 
                    793:        /*
                    794:         * The variable ``sock'' must be the descriptor of a socket.
                    795:         * Flags may include MSG_OOB and MSG_PEEK.
                    796:         */
                    797:        cc = send(sock, msg, len, flags)
                    798:        int cc, sock; char *msg; int len, flags; 
                    799: 
                    800:        cc = sendto(sock, msg, len, flags, to, tolen)
                    801:        int cc, sock; char *msg; int len, flags;
                    802:        struct sockaddr *to; int tolen;
                    803: 
                    804:        cc = sendmsg(sock, msg, flags)
                    805:        int cc, sock; struct msghdr msg[]; int flags;
                    806: 
                    807:        cc = recv(sock, buf, len, flags)
                    808:        int cc, sock; char *buf; int len, flags;
                    809: 
                    810:        cc = recvfrom(sock, buf, len, flags, from, fromlen)
                    811:        int cc, sock; char *buf; int len, flags;
                    812:        struct sockaddr *from; int *fromlen;
                    813: 
                    814:        cc = recvmsg(sock, msg, flags)
                    815:        int cc, sock; struct msghdr msg[]; int flags;
                    816: .ft
                    817: .sp 1
                    818: .ce 1
                    819: Figure 10\ \ Varieties of read and write commands
                    820: .)z
                    821: .b
                    822: .sh 1 "Choices"
                    823: .r
                    824: .pp
                    825: This paper has presented examples of some of the forms
                    826: of communication supported by
                    827: Berkeley UNIX 4.3BSD.  These have been presented in an order chosen for
                    828: ease of presentation.  It is useful to review these options emphasizing the
                    829: factors that make each attractive.
                    830: .pp
                    831: Pipes have the advantage of portability, in that they are supported in all
                    832: UNIX systems.  They also are relatively
                    833: simple to use.  Socketpairs share this simplicity and have the additional
                    834: advantage of allowing bidirectional communication.  The major shortcoming
                    835: of these mechanisms is that they require communicating processes to be
                    836: descendants of a common process.  They do not allow intermachine communication.
                    837: .pp
                    838: The two communication domains, UNIX and Internet, allow processes with no common
                    839: ancestor to communicate.
                    840: Of the two, only the Internet domain allows
                    841: communication between machines.
                    842: This makes the Internet domain a necessary
                    843: choice for processes running on separate machines.
                    844: .pp
                    845: The choice between datagrams and stream communication is best made by
                    846: carefully considering the semantic and performance
                    847: requirements of the application.
                    848: Streams can be both advantageous and disadvantageous.  One disadvantage
                    849: is that a process is only allowed a limited number of open streams,
                    850: as there are usually only 64 entries available in the open descriptor
                    851: table.  This can cause problems if a single server must talk with a large
                    852: number of clients. 
                    853: Another is that for delivering a short message the stream setup and 
                    854: teardown time can be unnecessarily long.  Weighed against this are
                    855: the reliability built into the streams.  This will often be the
                    856: deciding factor in favor of streams.
                    857: .b
                    858: .sh 1 "What to do Next"
                    859: .r
                    860: .pp
                    861: Many of the examples presented here can serve as models for multiprocess
                    862: programs and for programs distributed across several machines.  
                    863: In developing a new multiprocess program, it is often easiest to 
                    864: first write the code to create the processes and communication paths.
                    865: After this code is debugged, the code specific to the application can
                    866: be added.
                    867: .pp
                    868: An introduction to the UNIX system and programming using UNIX system calls
                    869: can be found in [Kernighan and Pike 1984].
                    870: Further documentation of the Berkeley UNIX 4.3BSD IPC mechanisms can be 
                    871: found in [Leffler et al. 1986].
                    872: More detailed information about particular calls and protocols
                    873: is provided in sections
                    874: 2, 3 and 4 of the
                    875: UNIX Programmer's Manual [CSRG 1986].
                    876: In particular the following manual pages are relevant:
                    877: .(b
                    878: .TS
                    879: l l.
                    880: creating and naming sockets    socket(2), bind(2)
                    881: establishing connections       listen(2), accept(2), connect(2)
                    882: transferring data      read(2), write(2), send(2), recv(2)
                    883: addresses      inet(4F)
                    884: protocols      tcp(4P), udp(4P).
                    885: .TE
                    886: .)b
                    887: .(b
                    888: .sp
                    889: .b
                    890: Acknowledgements
                    891: .pp
                    892: I would like to thank Sam Leffler and Mike Karels for their help in
                    893: understanding the IPC mechanisms and all the people whose comments 
                    894: have helped in writing and improving this report.
                    895: .pp
                    896: This work was sponsored by the Defense Advanced Research Projects Agency
                    897: (DoD), ARPA Order No. 4031, monitored by the Naval Electronics Systems
                    898: Command under contract No. N00039-C-0235. 
                    899: The views and conclusions contained in this document are those of the
                    900: author and should not be interpreted as representing official policies,
                    901: either expressed or implied, of the Defense Advanced Research Projects Agency
                    902: or of the US Government.
                    903: .)b
                    904: .(b
                    905: .sp
                    906: .b
                    907: References
                    908: .r
                    909: .sp
                    910: .ls 1
                    911: B.W. Kernighan & R. Pike, 1984,
                    912: .i "The UNIX Programming Environment."
                    913: Englewood Cliffs, N.J.: Prentice-Hall.
                    914: .sp
                    915: .ls 1
                    916: B.W. Kernighan & D.M. Ritchie, 1978,
                    917: .i "The C Programming Language."
                    918: Englewood Cliffs, N.J.: Prentice-Hall.
                    919: .sp
                    920: .ls 1
                    921: S.J. Leffler, R.S. Fabry, W.N. Joy, P. Lapsley, S. Miller & C. Torek, 1986,
                    922: .i "An Advanced 4.3BSD Interprocess Communication Tutorial."
                    923: Computer Systems Research Group,
                    924: Department of Electrical Engineering and Computer Science,
                    925: University of California, Berkeley.
                    926: .sp
                    927: .ls 1
                    928: Computer Systems Research Group, 1986,
                    929: .i "UNIX Programmer's Manual, 4.3 Berkeley Software Distribution."
                    930: Computer Systems Research Group,
                    931: Department of Electrical Engineering and Computer Science,
                    932: University of California, Berkeley.
                    933: .)b
