1: .\" Copyright (c) 1986 The Regents of the University of California.
2: .\" All rights reserved.
3: .\"
4: .\" Redistribution and use in source and binary forms are permitted
5: .\" provided that the above copyright notice and this paragraph are
6: .\" duplicated in all such forms and that any documentation,
7: .\" advertising materials, and other materials related to such
8: .\" distribution and use acknowledge that the software was developed
9: .\" by the University of California, Berkeley. The name of the
10: .\" University may not be used to endorse or promote products derived
11: .\" from this software without specific prior written permission.
12: .\" THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
13: .\" IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
14: .\" WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
15: .\"
16: .\" @(#)tutor.me 6.6 (Berkeley) 3/7/89
17: .\"
18: .oh 'Introductory 4.3BSD IPC''PS1:7-%'
19: .eh 'PS1:7-%''Introductory 4.3BSD IPC'
20: .rs
21: .sp 2
22: .sz 14
23: .ft B
24: .ce 2
25: An Introductory 4.3BSD
26: Interprocess Communication Tutorial
27: .sz 10
28: .sp 2
29: .ce
30: .i "Stuart Sechrest"
31: .ft
32: .sp
33: .ce 4
34: Computer Science Research Group
35: Computer Science Division
36: Department of Electrical Engineering and Computer Science
37: University of California, Berkeley
38: .sp 2
39: .ce
40: .i ABSTRACT
41: .sp
42: .(c
43: .pp
44: Berkeley UNIX\(dg 4.3BSD offers several choices for interprocess communication.
45: To aid the programmer in developing programs composed of
46: cooperating
47: processes, the different choices are discussed and a series of example
48: programs is presented. These programs
49: demonstrate in a simple way the use of pipes, socketpairs, sockets
50: and the use of datagram and stream communication. The intent of this
51: document is to present a few simple example programs, not to describe the
52: networking system in full.
53: .)c
54: .sp 2
55: .(f
56: \(dg\|UNIX is a trademark of AT&T Bell Laboratories.
57: .)f
58: .b
59: .sh 1 "Goals"
60: .r
61: .pp
62: Facilities for interprocess communication (IPC) and networking
63: were a major addition to UNIX in the Berkeley UNIX 4.2BSD release.
64: These facilities required major additions and some changes
65: to the system interface.
66: The basic idea of this interface is to make IPC similar to file I/O.
67: In UNIX a process has a set of I/O descriptors, from which one reads
68: and to which one writes.
69: Descriptors may refer to normal files, to devices (including terminals),
70: or to communication channels.
71: The use of a descriptor has three phases: its creation,
72: its use for reading and writing, and its destruction. By using descriptors
73: to write files, rather than simply naming the target file in the write
74: call, one gains a surprising amount of flexibility. Often, the program that
75: creates a descriptor will be different from the program that uses the
76: descriptor. For example, the shell can create a descriptor for the output
77: of the `ls'
78: command that will cause the listing to appear in a file rather than
79: on a terminal.
80: Pipes are another form of descriptor that has been used in UNIX
81: for some time.
82: Pipes allow one-way data transmission from one process
83: to another; the two processes and the pipe must be set up by a common
84: ancestor.
85: .pp
86: The use of descriptors is not the only communication interface
87: provided by UNIX.
88: The signal mechanism sends a tiny amount of information from one
89: process to another.
90: The signaled process receives only the signal type,
91: not the identity of the sender,
92: and the number of possible signals is small.
93: The signal semantics limit the flexibility of the signaling mechanism
94: as a means of interprocess communication.
95: .pp
96: The identification of IPC with I/O is quite longstanding in UNIX and
97: has proved quite successful. At first, however, IPC was limited to
98: processes communicating within a single machine. With Berkeley UNIX
99: 4.2BSD this expanded to include IPC between machines. This expansion
100: has necessitated some change in the way that descriptors are created.
101: Additionally, new possibilities for the meaning of read and write have
102: been admitted. Originally the meanings, or semantics, of these terms
103: were fairly simple. When you wrote something it was delivered. When
104: you read something, you were blocked until the data arrived.
105: Other possibilities exist,
106: however. One can write without full assurance of delivery if one can
107: check later to catch occasional failures. Messages can be kept as
108: discrete units or merged into a stream.
109: One can ask to read, but insist on not waiting if nothing is immediately
110: available. These new possibilities are allowed in the Berkeley UNIX IPC
111: interface.
112: .pp
113: Thus Berkeley UNIX 4.3BSD offers several choices for IPC.
114: This paper presents simple examples that illustrate some of
115: the choices.
116: The reader is presumed to be familiar with the C programming language
117: [Kernighan & Ritchie 1978],
118: but not necessarily with the system calls of the UNIX system or with
119: processes and interprocess communication.
120: The paper reviews the notion of a process and the types of
121: communication that are supported by Berkeley UNIX 4.3BSD.
122: A series of examples is presented that create processes that communicate
123: with one another. The programs show different ways of establishing
124: channels of communication.
125: Finally, the calls that actually transfer data are reviewed.
126: To clearly present how communication can take place,
127: the example programs have been cleared of anything that
128: might be construed as useful work.
129: They can, therefore, serve as models
130: for the programmer trying to construct programs composed of
131: cooperating processes.
132: .b
133: .sh 1 "Processes"
134: .pp
135: A \fIprogram\fP is both a sequence of statements and a rough way of referring
136: to the computation that occurs when the compiled statements are run.
137: A \fIprocess\fP can be thought of as a single line of control in a program.
138: Most programs execute some statements, go through a few loops, branch in
139: various directions and then end. These are single process programs.
140: Programs can also have a point where control splits into two independent lines,
141: an action called \fIforking.\fP
142: In UNIX these lines can never join again. A call to the system routine
143: \fIfork()\fP causes a process to split in this way.
144: The result of this call is that two independent processes will be
145: running, executing exactly the same code.
146: Memory values will be the same for all values set before the fork, but,
147: subsequently, each version will be able to change only the
148: value of its own copy of each variable.
149: Initially, the only difference between the two will be the value returned by
150: \fIfork().\fP The parent will receive a process id for the child;
151: the child will receive a zero.
152: Calls to \fIfork(),\fP
153: therefore, typically precede, or are included in, an if-statement.
154: .pp
155: A process views the rest of the system through a private table of descriptors.
156: The descriptors can represent open files or sockets (sockets are communication
157: objects that will be discussed below). Descriptors are referred to
158: by their index numbers in the table. The first three descriptors are often
159: known by special names, \fIstdin, stdout\fP and \fIstderr\fP.
160: These are the standard input, output and error.
161: When a process forks, its descriptor table is copied to the child.
162: Thus, if the parent's standard input is being taken from a terminal
163: (devices are also treated as files in UNIX), the child's input will
164: be taken from the
165: same terminal. Whoever reads first will get the input. If, before forking,
166: the parent changes its standard input so that it is reading from a
167: new file, the child will take its input from the new file. It is
168: also possible to take input from a socket, rather than from a file.
169: .b
170: .sh 1 "Pipes"
171: .r
172: .pp
173: Most users of UNIX know that they can pipe the output of a
174: program ``prog1'' to the input of another, ``prog2,'' by typing the command
175: \fI``prog1 | prog2.''\fP
176: This is called ``piping'' the output of one program
177: to another because the mechanism used to transfer the output is called a
178: pipe.
179: When the user types a command, the command is read by the shell, which
180: decides how to execute it. If the command is simple, for example,
181: .i "``prog1,''"
182: the shell forks a process, which executes the program, prog1, and then dies.
183: The shell waits for this termination and then prompts for the next
184: command.
185: If the command is a compound command,
186: .i "``prog1 | prog2,''"
187: the shell creates two processes connected by a pipe. One process
188: runs the program, prog1, the other runs prog2. The pipe is an I/O
189: mechanism with two ends, or sockets. Data that is written into one socket
190: can be read from the other.
191: .(z
192: .ft CW
193: .so pipe.c
194: .ft
195: .ce 1
196: Figure 1\ \ Use of a pipe
197: .)z
198: .pp
199: Since a program specifies its input and output only by the descriptor table
200: indices, which appear as variables or constants,
201: the input source and output destination can be changed without
202: changing the text of the program.
203: It is in this way that the shell is able to set up pipes. Before executing
204: prog1, the process can close whatever is at \fIstdout\fP
205: and replace it with one
206: end of a pipe. Similarly, the process that will execute prog2 can substitute
207: the opposite end of the pipe for
208: \fIstdin.\fP
209: .pp
210: Let us now examine a program that creates a pipe for communication between
211: its child and itself (Figure 1).
212: A pipe is created by a parent process, which then forks.
213: When a process forks, the parent's descriptor table is copied into
214: the child's.
215: .pp
216: In Figure 1, the parent process makes a call to the system routine
217: \fIpipe().\fP
218: This routine creates a pipe and places descriptors for the sockets
219: for the two ends of the pipe in the process's descriptor table.
220: \fIPipe()\fP
221: is passed an array into which it places the index numbers of the
222: sockets it created.
223: The two ends are not equivalent. The socket whose index is
224: returned in the first element of the array is opened for reading only,
225: while the socket in the second element is opened only for writing.
226: This corresponds to the fact that the standard input is the first
227: descriptor of a process's descriptor table and the standard output
228: is the second. After creating the pipe, the parent creates the child
229: with which it will share the pipe by calling \fIfork().\fP
230: Figure 2 illustrates the effect of a fork.
231: The parent process's descriptor table points to both ends of the pipe.
232: After the fork, both parent's and child's descriptor tables point to
233: the pipe.
234: The child can then use the pipe to send a message to the parent.
235: .(z
236: -
237: .bl 5.8i
238: -
239: .\" pipe.grn goes here
240: .sp
241: .ce 1
242: Figure 2\ \ Sharing a pipe between parent and child
243: .)z
244: .pp
245: Just what is a pipe?
246: It is a one-way communication mechanism, with one end opened
247: for reading and the other end for writing.
248: Therefore, parent and child need to agree on which way to turn
249: the pipe, from parent to child or the other way around.
250: Using the same pipe for communication both from parent to child and
251: from child to parent would be possible (since both processes have
252: references to both ends), but very complicated.
253: If the parent and child are to have a two-way conversation,
254: the parent creates two pipes, one for use in each direction.
255: (In accordance with their plans, both parent and child in the example above
256: close the socket that they will not use. It is not required that unused
257: descriptors be closed, but it is good practice.)
258: A pipe is also a \fIstream\fP communication mechanism; that
259: is, all messages sent through the pipe are placed in order
260: and reliably delivered. When the reader asks for a certain
261: number of bytes from this
262: stream, he is given as many bytes as are available, up
263: to the amount of the request. Note that these bytes may have come from
264: the same call to \fIwrite()\fR or from several calls to \fIwrite()\fR
265: which were concatenated.
266: .b
267: .sh 1 "Socketpairs"
268: .r
269: .pp
270: Berkeley UNIX 4.3BSD provides a slight generalization of pipes. A pipe is a
271: pair of connected sockets for one-way stream communication. One may
272: obtain a pair of connected sockets for two-way stream communication
273: by calling the routine \fIsocketpair().\fP
274: The program in Figure 3 calls \fIsocketpair()\fP
275: to create such a connection. The program uses the link for
276: communication in both directions. Since socketpairs are
277: an extension of pipes, their use resembles that of pipes.
278: Figure 4 illustrates the result of a fork following a call to
279: \fIsocketpair().\fP
280: .pp
281: \fISocketpair()\fP
282: takes as
283: arguments a specification of a domain, a style of communication, and a
284: protocol.
285: These are the parameters shown in the example.
286: Domains and protocols will be discussed in the next section.
287: Briefly,
288: a domain is a space of names that may be bound
289: to sockets and implies certain other conventions.
290: Currently, socketpairs have only been implemented for one
291: domain, called the UNIX domain.
292: The UNIX domain uses UNIX path names for naming sockets.
293: It only allows communication
294: between sockets on the same machine.
295: .pp
296: Note that the header files
297: .i "<sys/socket.h>"
298: and
299: .i "<sys/types.h>"
300: are required in this program.
301: The constants AF_UNIX and SOCK_STREAM are defined in
302: .i "<sys/socket.h>,"
303: which in turn requires the file
304: .i "<sys/types.h>"
305: for some of its definitions.
306: .(z
307: .ft CW
308: .so socketpair.c
309: .ft
310: .ce 1
311: Figure 3\ \ Use of a socketpair
312: .)z
313: .(z
314: -
315: .bl 5.8i
316: -
317: .\" socketpair.grn goes here
318: .sp
319: .ce 1
320: Figure 4\ \ Sharing a socketpair between parent and child
321: .)z
322: .b
323: .sh 1 "Domains and Protocols"
324: .r
325: .pp
326: Pipes and socketpairs are a simple solution for communicating between
327: a parent and child or between child processes.
328: What if we wanted to have processes that have no common ancestor
329: with whom to set up communication?
330: Neither standard UNIX pipes nor socketpairs are
331: the answer here, since both mechanisms require a common ancestor to
332: set up the communication.
333: We would like to have two processes separately create sockets
334: and then have messages sent between them. This is often the
335: case when providing or using a service in the system. This is
336: also the case when the communicating processes are on separate machines.
337: In Berkeley UNIX 4.3BSD one can create individual sockets, give them names and
338: send messages between them.
339: .pp
340: Sockets created by different programs use names to refer to one another;
341: names generally must be translated into addresses for use.
342: The space from which an address is drawn is referred to as a
343: .i domain.
344: There are several domains for sockets.
345: Two that will be used in the examples here are the UNIX domain (or AF_UNIX,
346: for Address Family UNIX) and the Internet domain (or AF_INET).
347: UNIX domain IPC is an experimental facility in 4.2BSD and 4.3BSD.
348: In the UNIX domain, a socket is given a path name within the file system
349: name space.
350: A file system node is created for the socket and other processes may
351: then refer to the socket by giving the proper pathname.
352: UNIX domain names, therefore, allow communication between any two processes
353: that work in the same file system.
354: The Internet domain is the UNIX implementation of the DARPA Internet
355: standard protocols IP/TCP/UDP.
356: Addresses in the Internet domain consist of a machine network address
357: and an identifying number, called a port.
358: Internet domain names allow communication between machines.
359: .pp
360: Communication follows some particular ``style.''
361: Currently, communication is either through a \fIstream\fP
362: or by \fIdatagram.\fP
363: Stream communication implies several things. Communication takes
364: place across a connection between two sockets. The communication
365: is reliable, error-free, and, as in pipes, no message boundaries are
366: kept. Reading from a stream may result in reading the data sent from
367: one or several calls to \fIwrite()\fP
368: or only part of the data from a single call, if there is not enough room
369: for the entire message, or if not all the data from a large message
370: has been transferred.
371: The protocol implementing such a style will retransmit messages
372: received with errors. It will also return error messages if one tries to
373: send a message after the connection has been broken.
374: Datagram communication does not use connections. Each message is
375: addressed individually. If the address is correct, it will generally
376: be received, although this is not guaranteed. Often datagrams are
377: used for requests that require a response from the
378: recipient. If no response
379: arrives in a reasonable amount of time, the request is repeated.
380: The individual datagrams will be kept separate when they are read, that
381: is, message boundaries are preserved.
382: .pp
383: The difference in performance between the two styles of communication is
384: generally less important than the difference in semantics. The
385: performance gain that one might find in using datagrams must be weighed
386: against the increased complexity of the program, which must now concern
387: itself with lost or out of order messages. If lost messages may simply be
388: ignored, the quantity of traffic may be a consideration. The expense
389: of setting up a connection is best justified by frequent use of the connection.
390: Since the performance of a protocol changes as it is tuned for different
391: situations, it is best to seek the most up-to-date information when
392: making choices for a program in which performance is crucial.
393: .pp
394: A protocol is a set of rules, data formats and conventions that regulate the
395: transfer of data between participants in the communication.
396: In general, there is one protocol for each socket type (stream,
397: datagram, etc.) within each domain.
398: The code that implements a protocol
399: keeps track of the names that are bound to sockets,
400: sets up connections,
401: and transfers data between sockets,
402: perhaps sending the data across a network.
403: It is possible for several protocols, differing only in low level
404: details, to implement the same style of communication within
405: a particular domain. Although it is possible to select
406: which protocol should be used, for nearly all uses it is sufficient to
407: request the default protocol. This has been done in all of the example
408: programs.
409: .pp
410: One specifies the domain, style and protocol of a socket when
411: it is created. For example, in Figure 5a the call to \fIsocket()\fP
412: causes the creation of a datagram socket with the default protocol
413: in the UNIX domain.
414: .b
415: .sh 1 "Datagrams in the UNIX Domain"
416: .r
417: .(z
418: .ft CW
419: .so udgramread.c
420: .ft
421: .ce 1
422: Figure 5a\ \ Reading UNIX domain datagrams
423: .)z
424: .pp
425: Let us now look at two programs that create sockets separately.
426: The programs in Figures 5a and 5b use datagram communication
427: rather than a stream.
428: The structure used to name UNIX domain sockets is defined
429: in the file \fI<sys/un.h>.\fP
430: The definition has also been included in the example for clarity.
431: .pp
432: Each program creates a socket with a call to \fIsocket().\fP
433: These sockets are in the UNIX domain.
434: Once a name has been decided upon it is attached to a socket by the
435: system call \fIbind().\fP
436: The program in Figure 5a uses the name ``socket'',
437: which it binds to its socket.
438: This name will appear in the working directory of the program.
439: The program in Figure 5b uses its
440: socket only for sending messages. It does not create a name for
441: the socket because no other process has to refer to it.
442: .(z
443: .ft CW
444: .so udgramsend.c
445: .ft
446: .ce 1
447: Figure 5b\ \ Sending a UNIX domain datagram
448: .)z
449: .pp
450: Names in the UNIX domain are path names. Like file path names they may
451: be either absolute (e.g. ``/dev/imaginary'') or relative (e.g. ``socket'').
452: Because these names are used to allow processes to rendezvous, relative
453: path names can pose difficulties and should be used with care.
454: When a name is bound into the name space, a file (inode) is allocated in the
455: file system. If
456: the inode is not deallocated, the name will continue to exist even after
457: the bound socket is closed. This can cause subsequent runs of a program
458: to find that a name is unavailable, and can cause
459: directories to fill up with these
460: objects. The names are removed by calling \fIunlink()\fP or using
461: the \fIrm\fP\|(1) command.
462: Names in the UNIX domain are only used for rendezvous. They are not used
463: for message delivery once a connection is established. Therefore, in
464: contrast with the Internet domain, unbound sockets need not be (and are
465: not) automatically given addresses when they are connected.
466: .pp
467: There is no established means of communicating names to interested
468: parties. In the example, the program in Figure 5b gets the
469: name of the socket to which it will send its message through its
470: command line arguments. Once a line of communication has been created,
471: one can send the names of additional, perhaps new, sockets over the link.
472: Facilities will have to be built that will make the distribution of
473: names less of a problem than it now is.
474: .b
475: .sh 1 "Datagrams in the Internet Domain"
476: .r
477: .(z
478: .ft CW
479: .so dgramread.c
480: .ft
481: .ce 1
482: Figure 6a\ \ Reading Internet domain datagrams
483: .)z
484: .pp
485: The examples in Figures 6a and 6b are very close to the previous example
486: except that the socket is in the Internet domain.
487: The structure of Internet domain addresses is defined in the file
488: \fI<netinet/in.h>\fP.
489: Internet addresses specify a host address (a 32-bit number)
490: and a delivery slot, or port, on that
491: machine. These ports are managed by the system routines that implement
492: a particular protocol.
493: Unlike UNIX domain names, Internet socket names are not entered into
494: the file system and, therefore,
495: they do not have to be unlinked after the socket has been closed.
496: When a message must be sent between machines it is sent to
497: the protocol routine on the destination machine, which interprets the
498: address to determine to which socket the message should be delivered.
499: Several different protocols may be active on
500: the same machine, but, in general, they will not communicate with one another.
501: As a result, different protocols are allowed to use the same port numbers.
502: Thus, implicitly, an Internet address is a triple including a protocol as
503: well as the port and machine address.
504: An \fIassociation\fP is a temporary or permanent specification
505: of a pair of communicating sockets.
506: An association is thus identified by the tuple
507: <\fIprotocol, local machine address, local port,
508: remote machine address, remote port\fP>.
509: An association may be transient when using datagram sockets;
510: the association actually exists during a \fIsend\fP operation.
511: .(z
512: .ft CW
513: .so dgramsend.c
514: .ft
515: .ce 1
516: Figure 6b\ \ Sending an Internet domain datagram
517: .)z
518: .pp
519: The protocol for a socket is chosen when the socket is created. The
520: local machine address for a socket can be any valid network address of the
521: machine, if it has more than one, or it can be the wildcard value
522: INADDR_ANY.
523: The wildcard value is used in the program in Figure 6a.
524: If a machine has several network addresses, it is likely
525: that messages sent to any of the addresses should be deliverable to
526: a socket. This will be the case if the wildcard value has been chosen.
527: Note that even if the wildcard value is chosen, a program sending messages
528: to the named socket must specify a valid network address. One can be willing
529: to receive from ``anywhere,'' but one cannot send a message ``anywhere.''
530: The program in Figure 6b is given the destination host name as a command
531: line argument.
532: To determine a network address to which it can send the message, it looks
533: up
534: the host address by the call to \fIgethostbyname()\fP.
535: The returned structure includes the host's network address,
536: which is copied into the structure specifying the
537: destination of the message.
538: .pp
539: The port number can be thought of as the number of a mailbox, into
540: which the protocol places one's messages. Certain daemons, offering
541: certain advertised services, have reserved
542: or ``well-known'' port numbers. These fall in the range
543: from 1 to 1023. Higher numbers are available to general users.
544: Only servers need to ask for a particular number.
545: The system will assign an unused port number when an address
546: is bound to a socket.
547: This may happen when an explicit \fIbind\fP
548: call is made with a port number of 0, or
549: when a \fIconnect\fP or \fIsend\fP
550: is performed on an unbound socket.
551: Note that port numbers are not automatically reported back to the user.
552: After calling \fIbind(),\fP asking for port 0, one may call
553: \fIgetsockname()\fP to discover what port was actually assigned.
554: The routine \fIgetsockname()\fP
555: will not work for names in the UNIX domain.
556: .pp
557: The format of the socket address is specified in part by standards within the
558: Internet domain. The specification includes the order of the bytes in
559: the address. Because machines differ in the internal representation
560: they ordinarily use
561: to represent integers, printing out the port number as returned by
562: \fIgetsockname()\fP may result in a misinterpretation. To
563: print out the number, it is necessary to use the routine \fIntohs()\fP
564: (for \fInetwork to host: short\fP) to convert the number from the
565: network representation to the host's representation. On some machines,
566: such as 68000-based machines, this is a null operation. On others,
567: such as VAXes, this results in a swapping of bytes. Another routine
568: exists to convert a short integer from the host format to the network format,
569: called \fIhtons()\fP; similar routines exist for long integers.
570: For further information, refer to the
571: entry for \fIbyteorder\fP in section 3 of the manual.
572: .b
573: .sh 1 "Connections"
574: .r
575: .pp
576: To send data between stream sockets (having communication style SOCK_STREAM),
577: the sockets must be connected.
578: Figures 7a and 7b show two programs that create such a connection.
579: The program in Figure 7a is relatively simple.
580: To initiate a connection, this program simply creates
581: a stream socket, then calls \fIconnect()\fP,
582: specifying the address of the socket to which
583: it wishes its socket connected. Provided that the target socket exists and
584: is prepared to handle a connection, connection will be complete,
585: and the program can begin to send
586: messages. Messages will be delivered in order without message
587: boundaries, as with pipes. The connection is destroyed when either
588: socket is closed (or soon thereafter). If a process persists
589: in sending messages after the connection is closed, a SIGPIPE signal
590: is sent to the process by the operating system. Unless explicit action
591: is taken to handle the signal (see the manual page for \fIsignal\fP
592: or \fIsigvec\fP),
593: the process will terminate and the shell
594: will print the message ``broken pipe.''
595: .(z
596: .ft CW
597: .so streamwrite.c
598: .ft
599: .ce 1
600: Figure 7a\ \ Initiating an Internet domain stream connection
601: .)z
602: .(z
603: .ft CW
604: .so streamread.c
605: .ft
606: .ce 1
607: Figure 7b\ \ Accepting an Internet domain stream connection
608: .sp 2
609: .ft CW
610: .so strchkread.c
611: .ft
612: .ce 1
613: Figure 7c\ \ Using select() to check for pending connections
614: .)z
615: .(z
616: -
617: .bl 5.8i
618: -
619: .\" accept.grn goes here
620: .sp
621: .ce 1
622: Figure 8\ \ Establishing a stream connection
623: .)z
624: .pp
625: Forming a connection is asymmetrical; one process, such as the
626: program in Figure 7a, requests a connection with a particular socket;
627: the other process accepts connection requests.
628: Before a connection can be accepted a socket must be created and an address
629: bound to it. This
630: situation is illustrated in the top half of Figure 8. Process 2
631: has created a socket and bound a port number to it. Process 1 has created an
632: unnamed socket.
633: The address bound to process 2's socket is then made known to process 1 and,
634: perhaps, to several other potential communicants as well.
635: If there are several possible communicants,
636: this one socket might receive several requests for connections.
637: As a result, a new socket is created for each connection. This new socket
638: is the endpoint for communication within this process for this connection.
639: A connection may be destroyed by closing the corresponding socket.
640: .pp
641: The program in Figure 7b is a rather trivial example of a server. It
642: creates a socket to which it binds a name, which it then advertises.
643: (In this case it prints out the port number.) The program then calls
644: \fIlisten()\fP for this socket.
645: Since several clients may attempt to connect more or less
646: simultaneously, a queue of pending connections is maintained in the system
647: address space. \fIListen()\fP
648: marks the socket as willing to accept connections and initializes the queue.
649: When a connection is requested, it is listed in the queue. If the
650: queue is full, an error status may be returned to the requester.
651: The maximum length of this queue is specified by the second argument of
652: \fIlisten()\fP; the maximum length is limited by the system.
653: Once the listen call has been completed, the program enters
654: an infinite loop. On each pass through the loop, a new connection is
655: accepted and removed from the queue, and, hence, a new socket for the
656: connection is created. The bottom half of Figure 8 shows the result of
657: Process 1 connecting with the named socket of Process 2, and Process 2
658: accepting the connection. After the connection is created, the
659: service, in this case printing out the messages, is performed and the
660: connection socket closed. The \fIaccept()\fP
661: call will take a pending connection
662: request from the queue if one is available, or block waiting for a request.
663: Messages are read from the connection socket.
664: Reads from an active connection will normally block until data is available.
665: The number of bytes read is returned. When a connection is destroyed,
666: the read call returns immediately. The number of bytes returned will
667: be zero.
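.pp
The end-of-connection behavior described above can be sketched as a small
helper. This is an illustration only, not the code of Figure 7b; the name
\fIdrain_connection()\fP is hypothetical, and the headers assume a
modern POSIX system rather than 4.3BSD itself.

```c
#include <assert.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read from a connected descriptor until the peer closes the connection.
 * Returns the total number of bytes read, or -1 on error.
 * drain_connection is a hypothetical name used for illustration.
 */
long drain_connection(int fd)
{
    char buf[1024];
    long total = 0;
    ssize_t cc;

    assert(fd >= 0);
    /* read() blocks until data arrives; a return of 0 means the
     * connection has been destroyed by the other side. */
    while ((cc = read(fd, buf, sizeof buf)) > 0)
        total += cc;
    return cc < 0 ? -1 : total;
}
```

A server could call such a helper on the descriptor returned by
\fIaccept()\fP and close that descriptor once the helper returns.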
668: .pp
669: The program in Figure 7c is a slight variation on the server in Figure 7b.
670: It avoids blocking when there are no pending connection requests by
671: calling \fIselect()\fP
672: to check for pending requests before calling \fIaccept().\fP
673: This strategy is useful when connections may be received
674: on more than one socket, or when data may arrive on other connected
675: sockets before another connection request.
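.pp
A minimal sketch of the check made before \fIaccept()\fP might look like
the following. It is not the code of Figure 7c; the helper name
\fIrequest_pending()\fP is hypothetical, and the header
<sys/select.h> is an assumption about a modern POSIX system.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/time.h>
#include <sys/select.h>
#include <unistd.h>

/*
 * Return 1 if sock is readable (for a listening socket this means a
 * connection request is pending), 0 if the timeout expired first,
 * and -1 on error.  request_pending is a hypothetical name.
 */
int request_pending(int sock, long seconds)
{
    fd_set ready;
    struct timeval to;

    assert(sock >= 0 && sock < FD_SETSIZE);
    FD_ZERO(&ready);
    FD_SET(sock, &ready);
    to.tv_sec = seconds;
    to.tv_usec = 0;
    if (select(sock + 1, &ready, (fd_set *)0, (fd_set *)0, &to) < 0)
        return -1;
    return FD_ISSET(sock, &ready) ? 1 : 0;
}
```

Adding further descriptors to the same set before the \fIselect()\fP call
is what lets one process watch several sockets at once.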
676: .pp
677: Figures 9a and 9b show a pair of programs using stream communication
678: in the UNIX domain. Streams in the UNIX domain can be used for this sort
679: of program in exactly the same way as Internet domain streams, except for
680: the form of the names and the restriction of the connections to a single
681: file system. There are some differences, however, in the functionality of
682: streams in the two domains, notably in the handling of
683: \fIout-of-band\fP data (discussed briefly below). These differences
684: are beyond the scope of this paper.
685: .(z
686: .ft CW
687: .so ustreamwrite.c
688: .ft
689: .ce 1
690: Figure 9a\ \ Initiating a UNIX domain stream connection
691: .sp 2
692: .ft CW
693: .so ustreamread.c
694: .ft
695: .ce 1
696: Figure 9b\ \ Accepting a UNIX domain stream connection
697: .)z
698: .b
699: .sh 1 "Reads, Writes, Recvs, etc."
700: .r
701: .pp
702: UNIX 4.3BSD has several system calls for reading and writing information.
703: The simplest calls are \fIread()\fP and \fIwrite().\fP \fIWrite()\fP
704: takes as arguments the index of a descriptor, a pointer to a buffer
705: containing the data and the size of the data.
706: The descriptor may indicate either a file or a connected socket.
707: ``Connected'' can mean either a connected stream socket (as described
708: in Section 8) or a datagram socket for which a \fIconnect()\fP
709: call has provided a default destination (see the \fIconnect()\fP manual page).
710: \fIRead()\fP also takes a descriptor that indicates either a file or a socket.
711: \fIWrite()\fP requires a connected socket since no destination is
712: specified in the parameters of the system call.
713: \fIRead()\fP can be used for either a connected or an unconnected socket.
714: These calls are, therefore, quite flexible and may be used to
715: write applications that require no assumptions about the source of
716: their input or the destination of their output.
717: There are variations on \fIread()\fP and \fIwrite()\fP
718: that allow the source and destination of the input and output to use
719: several separate buffers, while retaining the flexibility to handle
720: both files and sockets. These are \fIreadv()\fP and \fIwritev(),\fP
721: for read and write \fIvector.\fP
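.pp
As a hedged illustration of a gathering write, the sketch below sends two
separate buffers with a single \fIwritev()\fP call. The helper name
\fIwrite_two()\fP is hypothetical, and the headers assume a modern POSIX
system.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Write a header string and a body string with one system call.
 * Returns the number of bytes written, or -1 on error.
 * write_two is a hypothetical name used for illustration.
 */
long write_two(int fd, const char *head, const char *body)
{
    struct iovec iov[2];

    assert(head != NULL && body != NULL);
    iov[0].iov_base = (void *)head;     /* first buffer to send */
    iov[0].iov_len = strlen(head);
    iov[1].iov_base = (void *)body;     /* second buffer to send */
    iov[1].iov_len = strlen(body);
    return (long)writev(fd, iov, 2);
}
```

The descriptor may be a file, a pipe, or a connected socket, just as for
\fIwrite().\fP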
722: .pp
723: It is sometimes necessary to send high priority data over a
724: connection that may have unread low priority data at the
725: other end. For example, a user interface process may be interpreting
726: commands and sending them on to another process through a stream connection.
727: The user interface may have filled the stream with as yet unprocessed
728: requests when the user types
729: a command to cancel all outstanding requests.
730: Rather than have the high priority data wait
731: to be processed after the low priority data, it is possible to
732: send it as \fIout-of-band\fP
733: (OOB) data. The notification of pending OOB data results in the generation of
734: a SIGURG signal, if this signal has been enabled (see the manual
735: page for \fIsignal()\fP or \fIsigvec()\fP).
736: See [Leffler et al. 1986] for a more complete description of the OOB mechanism.
737: There is a pair of calls similar to \fIread()\fP and \fIwrite()\fP
738: that allow options, including sending
739: and receiving OOB information; these are \fIsend()\fP
740: and \fIrecv().\fP
741: These calls are used only with sockets; specifying a descriptor for a file will
742: result in the return of an error status. These calls also allow
743: \fIpeeking\fP at data in a stream.
744: That is, they allow a process to read data without removing the data from
745: the stream. One use of this facility is to read ahead in a stream
746: to determine the size of the next item to be read.
747: When not using these options, these calls have the same functions as
748: \fIread()\fP and \fIwrite().\fP
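.pp
Peeking can be sketched as follows, assuming a modern POSIX
\fIrecv()\fP; the helper name \fIpeek_bytes()\fP is hypothetical.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Copy up to len bytes waiting on a socket into buf without removing
 * them from the stream.  Returns the number of bytes copied, or -1.
 * peek_bytes is a hypothetical name used for illustration.
 */
long peek_bytes(int sock, char *buf, int len)
{
    assert(buf != 0 && len > 0);
    return (long)recv(sock, buf, (size_t)len, MSG_PEEK);
}
```

A later \fIrecv()\fP or \fIread()\fP on the same socket returns the same
bytes again, since the peek did not consume them.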
749: .pp
750: To send datagrams, one must be allowed to specify the destination.
751: The call \fIsendto()\fP
752: takes a destination address as an argument and is therefore used for
753: sending datagrams. The call \fIrecvfrom()\fP
754: is often used to read datagrams, since this call returns the address
755: of the sender, if it is available, along with the data.
756: If the identity of the sender does not matter, one may use \fIread()\fP
757: or \fIrecv().\fP
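.pp
The following sketch passes one datagram between two Internet domain
sockets on the same machine and recovers the sender's name. The helper
names \fIbound_dgram()\fP and \fIpass_datagram()\fP are hypothetical, and
the use of the loopback address and of \fIsocklen_t\fP are assumptions
about a modern POSIX system.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/*
 * Create a datagram socket bound to a system-chosen port on the
 * loopback address and store its name in *name.
 * Returns the descriptor, or -1 on error.  (Hypothetical helper.)
 */
int bound_dgram(struct sockaddr_in *name)
{
    socklen_t len = sizeof *name;
    int sock = socket(AF_INET, SOCK_DGRAM, 0);

    if (sock < 0)
        return -1;
    memset(name, 0, sizeof *name);
    name->sin_family = AF_INET;
    name->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    name->sin_port = 0;                 /* let the system pick a port */
    if (bind(sock, (struct sockaddr *)name, sizeof *name) < 0 ||
        getsockname(sock, (struct sockaddr *)name, &len) < 0) {
        close(sock);
        return -1;
    }
    return sock;
}

/*
 * Send msg from socket s to the address *to, receive it on socket r,
 * and store the sender's address in *from.
 * Returns the number of bytes received, or -1.  (Hypothetical helper.)
 */
long pass_datagram(int s, int r, const char *msg,
                   const struct sockaddr_in *to,
                   char *buf, int buflen, struct sockaddr_in *from)
{
    socklen_t fromlen = sizeof *from;

    assert(msg != NULL && buf != NULL && buflen > 0);
    if (sendto(s, msg, strlen(msg), 0,
               (const struct sockaddr *)to, sizeof *to) < 0)
        return -1;
    return (long)recvfrom(r, buf, (size_t)buflen, 0,
                          (struct sockaddr *)from, &fromlen);
}
```

A receiver that wants to reply can hand the address filled in by
\fIrecvfrom()\fP straight back to \fIsendto().\fP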
758: .pp
759: Finally, there is a pair of calls that allow the sending and
760: receiving of messages from multiple buffers, when the address of the
761: recipient must be specified. These are \fIsendmsg()\fP and
762: \fIrecvmsg().\fP
763: These calls are actually quite general and have other uses,
764: including, in the UNIX domain, the transmission of a file descriptor from one
765: process to another.
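.pp
Descriptor passing can be sketched as below. This is an assumption-laden
illustration: it uses the control-message interface (SCM_RIGHTS and the
CMSG macros) found on modern POSIX systems, whereas the 4.3BSD
\fImsghdr\fP is understood to have carried access rights in a separate
field. The helper names \fIsend_fd()\fP and \fIrecv_fd()\fP are
hypothetical.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Buffer sized and aligned for one descriptor of control data. */
union fd_control {
    struct cmsghdr hdr;
    char space[CMSG_SPACE(sizeof(int))];
};

/* Send descriptor fd over the UNIX domain socket sock; 0 or -1. */
int send_fd(int sock, int fd)
{
    struct msghdr msg;
    struct iovec iov;
    struct cmsghdr *cm;
    union fd_control ctl;
    char byte = 'F';            /* one byte of ordinary data to carry it */

    assert(fd >= 0);
    memset(&msg, 0, sizeof msg);
    memset(&ctl, 0, sizeof ctl);
    iov.iov_base = &byte;
    iov.iov_len = 1;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.space;
    msg.msg_controllen = sizeof ctl.space;
    cm = CMSG_FIRSTHDR(&msg);
    cm->cmsg_level = SOL_SOCKET;
    cm->cmsg_type = SCM_RIGHTS;
    cm->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cm), &fd, sizeof(int));
    return sendmsg(sock, &msg, 0) < 0 ? -1 : 0;
}

/* Receive a descriptor sent by send_fd(); returns it, or -1. */
int recv_fd(int sock)
{
    struct msghdr msg;
    struct iovec iov;
    struct cmsghdr *cm;
    union fd_control ctl;
    char byte;
    int fd;

    memset(&msg, 0, sizeof msg);
    iov.iov_base = &byte;
    iov.iov_len = 1;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.space;
    msg.msg_controllen = sizeof ctl.space;
    if (recvmsg(sock, &msg, 0) <= 0)
        return -1;
    cm = CMSG_FIRSTHDR(&msg);
    if (cm == NULL || cm->cmsg_level != SOL_SOCKET ||
        cm->cmsg_type != SCM_RIGHTS)
        return -1;
    memcpy(&fd, CMSG_DATA(cm), sizeof(int));
    return fd;
}
```

The descriptor returned by the receiver refers to the same open object as
the one that was sent, although its index in the descriptor table will
generally differ.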
766: .pp
767: The various options for reading and writing are shown in Figure 10,
768: together with their parameters. The parameters for each system call
769: reflect the differences in function of the different calls.
770: In the examples given in this paper, the calls \fIread()\fP and
771: \fIwrite()\fP have been used whenever possible.
772: .(z
773: .ft CW
774: /*
775: * The variable descriptor may be the descriptor of either a file
776: * or of a socket.
777: */
778: cc = read(descriptor, buf, nbytes)
779: int cc, descriptor; char *buf; int nbytes;
780:
781: /*
782: * An iovec array can describe several separate buffers.
783: */
784: cc = readv(descriptor, iov, iovcnt)
785: int cc, descriptor; struct iovec *iov; int iovcnt;
786:
787: cc = write(descriptor, buf, nbytes)
788: int cc, descriptor; char *buf; int nbytes;
789:
790: cc = writev(descriptor, iovec, ioveclen)
791: int cc, descriptor; struct iovec *iovec; int ioveclen;
792:
793: /*
794: * The variable ``sock'' must be the descriptor of a socket.
795: * Flags may include MSG_OOB and MSG_PEEK.
796: */
797: cc = send(sock, msg, len, flags)
798: int cc, sock; char *msg; int len, flags;
799:
800: cc = sendto(sock, msg, len, flags, to, tolen)
801: int cc, sock; char *msg; int len, flags;
802: struct sockaddr *to; int tolen;
803:
804: cc = sendmsg(sock, msg, flags)
805: int cc, sock; struct msghdr msg[]; int flags;
806:
807: cc = recv(sock, buf, len, flags)
808: int cc, sock; char *buf; int len, flags;
809:
810: cc = recvfrom(sock, buf, len, flags, from, fromlen)
811: int cc, sock; char *buf; int len, flags;
812: struct sockaddr *from; int *fromlen;
813:
814: cc = recvmsg(sock, msg, flags)
815: int cc, sock; struct msghdr msg[]; int flags;
816: .ft
817: .sp 1
818: .ce 1
819: Figure 10\ \ Varieties of read and write commands
820: .)z
821: .b
822: .sh 1 "Choices"
823: .r
824: .pp
825: This paper has presented examples of some of the forms
826: of communication supported by
827: Berkeley UNIX 4.3BSD. These have been presented in an order chosen for
828: ease of presentation. It is useful to review these options emphasizing the
829: factors that make each attractive.
830: .pp
831: Pipes have the advantage of portability, in that they are supported in all
832: UNIX systems. They also are relatively
833: simple to use. Socketpairs share this simplicity and have the additional
834: advantage of allowing bidirectional communication. The major shortcoming
835: of these mechanisms is that they require communicating processes to be
836: descendants of a common process. They do not allow intermachine communication.
837: .pp
838: The two communication domains, UNIX and Internet, allow processes with no common
839: ancestor to communicate.
840: Of the two, only the Internet domain allows
841: communication between machines.
842: This makes the Internet domain a necessary
843: choice for processes running on separate machines.
844: .pp
845: The choice between datagrams and stream communication is best made by
846: carefully considering the semantic and performance
847: requirements of the application.
848: Streams can be both advantageous and disadvantageous. One disadvantage
849: is that a process is only allowed a limited number of open streams,
850: as there are usually only 64 entries available in the open descriptor
851: table. This can cause problems if a single server must talk with a large
852: number of clients.
853: Another is that for delivering a short message the stream setup and
854: teardown time can be unnecessarily long. Weighed against this is
855: the reliability built into streams. This will often be the
856: deciding factor in favor of streams.
857: .b
858: .sh 1 "What to do Next"
859: .r
860: .pp
861: Many of the examples presented here can serve as models for multiprocess
862: programs and for programs distributed across several machines.
863: In developing a new multiprocess program, it is often easiest to
864: first write the code to create the processes and communication paths.
865: After this code is debugged, the code specific to the application can
866: be added.
867: .pp
868: An introduction to the UNIX system and programming using UNIX system calls
869: can be found in [Kernighan and Pike 1984].
870: Further documentation of the Berkeley UNIX 4.3BSD IPC mechanisms can be
871: found in [Leffler et al. 1986].
872: More detailed information about particular calls and protocols
873: is provided in sections
874: 2, 3 and 4 of the
875: UNIX Programmer's Manual [CSRG 1986].
876: In particular the following manual pages are relevant:
877: .(b
878: .TS
879: l l.
880: creating and naming sockets	socket(2), bind(2)
881: establishing connections	listen(2), accept(2), connect(2)
882: transferring data	read(2), write(2), send(2), recv(2)
883: addresses	inet(4F)
884: protocols	tcp(4P), udp(4P)
885: .TE
886: .)b
887: .(b
888: .sp
889: .b
890: Acknowledgements
891: .pp
892: I would like to thank Sam Leffler and Mike Karels for their help in
893: understanding the IPC mechanisms and all the people whose comments
894: have helped in writing and improving this report.
895: .pp
896: This work was sponsored by the Defense Advanced Research Projects Agency
897: (DoD), ARPA Order No. 4031, monitored by the Naval Electronics Systems
898: Command under contract No. N00039-C-0235.
899: The views and conclusions contained in this document are those of the
900: author and should not be interpreted as representing official policies,
901: either expressed or implied, of the Defense Advanced Research Projects Agency
902: or of the US Government.
903: .)b
904: .(b
905: .sp
906: .b
907: References
908: .r
909: .sp
910: .ls 1
911: B.W. Kernighan & R. Pike, 1984,
912: .i "The UNIX Programming Environment."
913: Englewood Cliffs, N.J.: Prentice-Hall.
914: .sp
915: .ls 1
916: B.W. Kernighan & D.M. Ritchie, 1978,
917: .i "The C Programming Language."
918: Englewood Cliffs, N.J.: Prentice-Hall.
919: .sp
920: .ls 1
921: S.J. Leffler, R.S. Fabry, W.N. Joy, P. Lapsley, S. Miller & C. Torek, 1986,
922: .i "An Advanced 4.3BSD Interprocess Communication Tutorial."
923: Computer Systems Research Group,
924: Department of Electrical Engineering and Computer Science,
925: University of California, Berkeley.
926: .sp
927: .ls 1
928: Computer Systems Research Group, 1986,
929: .i "UNIX Programmer's Manual, 4.3 Berkeley Software Distribution."
930: Computer Systems Research Group,
931: Department of Electrical Engineering and Computer Science,
932: University of California, Berkeley.
933: .)b