1: .\" @(#)network.ms 5.3 (Berkeley) 5/25/86
2: .\"
3: .EH 'SMM:21-%''A Dial-Up Network of \s-2UNIX\s+2 Systems'
4: .OH 'Dial-Up Network of \s-2UNIX\s+2 Systems''SMM:21-%'
5: .if n .ls 2
6: .ds RH Nowitz
7: .ND "August 18, 1978"
8: .TL
9: A Dial-Up Network of
10: UNIX\s6\uTM\d\s0
11: Systems
12: .AU
13: D. A. Nowitz
14: .AU
15: M. E. Lesk
16: .AI
17: .MH
18: .AB
19: .if n .ls 2
20: A network of over eighty
21: .UX
22: computer systems has been established using the
23: telephone system as its primary communication medium.
24: The network was designed to meet the growing demands for
25: software distribution and exchange.
26: Some advantages of our design are:
27: .IP -
28: The startup cost is low.
29: A system needs only a dial-up port,
30: but systems with automatic calling units have much more
31: flexibility.
32: .IP -
33: No operating system changes are required to install or use the system.
34: .IP -
35: Communication is primarily over dial-up lines;
36: however, hardwired communication lines can be used
37: to increase speed.
38: .IP -
39: The command for sending/receiving files is simple to use.
40: .sp
41: Keywords: networks, communications, software distribution, software maintenance
42: .AE
43: .NH
44: Purpose
45: .PP
46: The widespread use of the
47: .UX
48: system
49: .[
50: ritchie thompson bstj 1978
51: .]
52: within Bell Laboratories
53: has produced problems of software distribution and maintenance.
54: A conventional mechanism was set up to distribute the operating
55: system and associated programs from a central site to the
56: various users.
57: However, this mechanism alone does not meet all software
58: distribution needs.
59: Remote sites generate much software and must transmit it to
60: other sites.
61: Some
62: .UX
63: systems
64: are themselves central sites for redistribution
65: of a particular specialized utility,
66: such as the Switching Control Center System.
67: Other sites have particular, often long-distance needs for
68: software exchange; switching research,
69: for example, is carried on in
70: New Jersey, Illinois, Ohio, and Colorado.
71: In addition, general purpose utility programs are written at
72: all
73: .UX
74: system sites.
75: The
76: .UX
77: system is modified
78: and enhanced by many people in many places and
79: it would be very constricting to deliver new software in a one-way
80: stream without any alternative
81: for the user sites to respond with changes of their own.
82: .PP
83: Straightforward software distribution is only part of the problem.
84: A large project may exceed the capacity of a single computer and
85: several machines may be used by one group of people.
86: It then becomes necessary
87: for them to pass messages, data, and other information back and forth
88: between computers.
89: .PP
90: Several groups with similar problems, both inside and outside of
91: Bell Laboratories, have constructed networks built of
92: hardwired connections only.
93: .[
94: dolotta mashey 1978 bstj
95: .]
96: .[
97: network unix system chesson
98: .]
99: Our network, however, uses both dial-up and hardwired
100: connections so that service can be provided to as many sites as possible.
101: .NH
102: Design Goals
103: .PP
104: Although some of our machines are connected directly, others
105: can only communicate over low-speed dial-up lines.
106: Since the dial-up lines are often unavailable
107: and file transfers may take considerable time,
108: we spool all work and transmit in the background.
109: We also had to adapt to a community of systems which are independently
110: operated and resistant to suggestions that they should all
111: buy particular hardware or install particular operating system
112: modifications.
113: Therefore, we make minimal demands on the local sites
114: in the network.
115: Our implementation requires no operating system changes;
116: in fact, the transfer programs look like any other user
117: entering the system through the normal dial-up login ports,
118: and obeying all local protection rules.
119: .PP
120: We distinguish ``active'' and ``passive'' systems
121: on the network.
122: Active systems have an automatic calling unit
123: or a hardwired line to another system,
124: and can initiate a connection.
125: Passive systems do not have the hardware
126: to initiate a connection.
127: However, an
128: active system can be assigned the job of calling passive
129: systems and executing work found there;
130: this makes a passive system the functional equivalent of
131: an active system, except for an additional delay while it waits to be polled.
132: Also, people frequently log into active systems and
133: request copying from one passive system to another.
134: This requires two telephone calls, but even so, it is faster
135: than mailing tapes.
136: .PP
137: Where convenient, we use hardwired communication lines.
138: These permit much faster transmission and multiplexing
139: of
140: the communications link.
141: Dial-up connections are made at either 300 or 1200 baud;
142: hardwired connections are asynchronous up to 9600 baud
143: and might run even faster on special-purpose communications
144: hardware.
145: .[
146: fraser spider 1974 ieee
147: .]
148: .[
149: fraser channel network datamation 1975
150: .]
151: Thus, systems typically join our network first as
152: passive systems and when
153: they find the service more important, they acquire
154: automatic calling units and become active
155: systems; eventually, they may install high-speed
156: links to particular machines with which they
157: handle a great deal of traffic.
158: At no point, however, must users change their
159: programs or procedures.
160: .PP
161: The basic operation of the network is very simple.
162: Each participating system has a spool directory,
163: in which work to be done (files to be moved, or commands to be executed
164: remotely) is stored.
165: A standard program,
166: .I uucico ,
167: performs all transfers.
168: This program starts by identifying a particular communication channel
169: to a remote system with which it will hold a conversation.
170: .I Uucico
171: then selects a device and establishes the connection,
172: logs onto the remote machine
173: and starts the
174: .I uucico
175: program on the remote machine.
176: Once two of these programs are connected, they first agree on a line protocol,
177: and then start exchanging work.
178: Each program in turn, beginning with the calling (active system) program,
179: transmits everything it needs, and then asks the other what it wants done.
180: Eventually neither has any more work, and both exit.
181: .PP
182: In this way, all services are available from all sites; passive sites,
183: however, must wait until called.
184: A variety of protocols may be used; this conforms to the real,
185: non-standard world.
186: As long as the caller and called programs have a protocol in common,
187: they can communicate.
188: Furthermore, each caller knows the hours when each destination system
189: should be called.
190: If a destination is unavailable, the data intended for it
191: remain in the spool directory until the destination machine can be reached.
192: .PP
193: The implementation of this
194: Bell Laboratories network
195: between independent sites, all of which
196: store proprietary programs and data,
197: illustrates the pervasive need for security
198: and administrative controls over file access.
199: Each site, in configuring its programs and system files,
200: limits and monitors transmission.
201: In order to access a file a user needs access permission
202: for the machine that contains the file and access permission
203: for the file itself.
204: This is achieved by first requiring the user to log into his local
205: machine with his password; the local machine then
206: logs into the remote machine whose files are to be accessed.
207: In addition, records are kept identifying all files
208: that are moved into and out of the local system,
209: and how the requestor of such accesses identified
210: himself.
211: Some sites may arrange
212: to permit users only
213: to call up
214: and request work to be done;
215: the calling users are then called back
216: before the work is actually done.
217: It is then possible to verify
218: that the request is legitimate from the standpoint of the
219: target system, as well as the originating system.
220: Furthermore, because of the call-back,
221: no site can masquerade as another
222: even if it knows all the necessary passwords.
223: .PP
224: Each machine can optionally maintain a sequence count for
225: conversations with other machines and require a verification of the
226: count at the start of each conversation.
227: Thus, even if call-back is not in use, a successful masquerade requires
228: the calling party to present the correct sequence number.
229: A would-be impersonator must steal not just the correct phone number,
230: user name, and password, but also the sequence count, and must call in
231: sufficiently promptly to precede the next legitimate request from either side.
232: Even a successful masquerade will be detected on the next correct
233: conversation.
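.PP
The sequence-count check can be sketched as follows.
This is a minimal illustration, not the uucp implementation:
the class name SequenceLedger, the starting count of zero, and the
rule of advancing the count by one per accepted conversation are all
assumptions made for the example.

```python
class SequenceLedger:
    """Per-peer conversation counts kept by one site (hypothetical structure)."""

    def __init__(self):
        # peer name -> count expected at the start of the next conversation
        self.counts = {}

    def verify(self, peer, presented):
        """Accept the call only if the presented count matches, then advance."""
        expected = self.counts.get(peer, 0)  # assumed to start at zero
        if presented != expected:
            return False
        self.counts[peer] = expected + 1
        return True


called = SequenceLedger()
print(called.verify("usg", 0))  # legitimate first conversation
print(called.verify("usg", 1))  # legitimate second conversation
print(called.verify("usg", 2))  # impersonator using a stolen count
print(called.verify("usg", 2))  # real caller now presents a stale count
```

.PP
In this sketch the impersonator's call is accepted, but it advances the
count on the called side, so the next call from the real system fails
verification and exposes the masquerade.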
234: .NH
235: Processing
236: .PP
237: The user has two commands which set up communications,
238: .I uucp
239: to set up file copying,
240: and
241: .I uux
242: to set up command execution where some of the required
243: resources (system and/or files)
244: are not on the local machine.
245: Each of these commands will put work and data files
246: into the spool directory for execution by
247: .I uucp
248: daemons.
249: Figure 1 shows the major blocks of the file transfer process.
250: .SH
251: File Copy
252: .PP
253: The
254: .I uucico
255: program is used to perform all communications between
256: the two systems.
257: It performs the following functions:
258: .RS
259: .IP - 3
260: Scan the spool directory for work.
261: .IP -
262: Place a call to a remote system.
263: .IP -\ \
264: Negotiate a line protocol to be used.
265: .IP -\ \
266: Start program
267: .I uucico
268: on the remote system.
269: .IP -\ \
270: Execute all requests from both systems.
271: .IP -\ \
272: Log work requests and work completions.
273: .RE
274: .LP
275: .I Uucico
276: may be started in several ways:
277: .RS
278: .IP a) 5
279: by a system daemon,
280: .IP b)
281: by one of the
282: .I uucp
283: or
284: .I uux
285: programs,
286: .IP c)
287: by a remote system.
288: .RE
289: .SH
290: Scan For Work
291: .PP
292: The file names in the spool directory are constructed to allow the
293: daemon programs
294: .I "(uucico, uuxqt)"
295: to determine the files they should look at,
296: the remote machines they should call
297: and the order in which the files for a particular
298: remote machine should be processed.
299: .SH
300: Call Remote System
301: .PP
302: The call is made using information from several
303: files which reside in the uucp program directory.
304: At the start of the call process, a lock is
305: set on the system being called so that another
306: call will not be attempted at the same time.
307: .PP
308: The system name is found in a
309: ``systems''
310: file.
311: The information contained for each system is:
312: .IP
313: .RS
314: .IP [1]
315: system name,
316: .IP [2]
317: times to call the system
318: (days-of-week and times-of-day),
319: .IP [3]
320: device or device type to be used for call,
321: .IP [4]
322: line speed,
323: .IP [5]
324: phone number,
325: .IP [6]
326: login information (multiple fields).
327: .RE
328: .PP
329: The time field is checked against the present time to see
330: if the call should be made.
331: The
332: .I
333: phone number
334: .R
335: may contain abbreviations (e.g. ``nyc'', ``boston'') which get translated into dial
336: sequences using a
337: ``dial-codes'' file.
338: This permits the same ``phone number'' to be stored at every site, despite
339: local variations in telephone services and dialing conventions.
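.PP
A rough sketch of how one entry might be read and its phone number expanded.
The whitespace-separated layout, the sample line, the dial-code value,
and the rule that an abbreviation is the leading alphabetic part of the
phone field are assumptions for illustration; only the six fields
themselves come from the list above.

```python
from dataclasses import dataclass


@dataclass
class SystemEntry:
    name: str      # [1] system name
    times: str     # [2] times to call
    device: str    # [3] device or device type
    speed: int     # [4] line speed
    phone: str     # [5] phone number, possibly with an abbreviation
    login: list    # [6] login information (multiple fields)


def parse_systems_line(line):
    """Split one hypothetical systems-file line into its six fields."""
    fields = line.split()
    if len(fields) < 5:
        raise ValueError("expected at least five fields: " + line)
    name, times, device, speed, phone = fields[:5]
    return SystemEntry(name, times, device, int(speed), phone, fields[5:])


def expand_phone(phone, dial_codes):
    """Replace a leading alphabetic abbreviation with its dial sequence."""
    i = 0
    while i < len(phone) and phone[i].isalpha():
        i += 1
    abbrev, rest = phone[:i], phone[i:]
    if not abbrev:
        return phone  # nothing to translate
    return dial_codes[abbrev] + rest  # KeyError if the abbreviation is unknown


entry = parse_systems_line("usg Any ACU 300 nyc5551234 login uucp")
print(expand_phone(entry.phone, {"nyc": "212"}))
```

.PP
Because the translation table is local, two sites could keep the same
entry and still dial differently by supplying different tables.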
340: .PP
341: A ``devices''
342: file is scanned using fields [3] and [4] from the
343: ``systems''
344: file to find an available device for the connection.
345: The program will try all devices which satisfy
346: [3] and [4] until a connection is made, or no more
347: devices can be tried.
348: If a non-multiplexable device is successfully opened, a lock file
349: is created so that another copy of
350: .I uucico
351: will not try to use it.
352: If the connection is complete, the
353: .I
354: login information
355: .R
356: is used to log into the remote system.
357: Then
358: a command is sent to the remote system
359: to start the
360: .I uucico
361: program.
362: The conversation between the two
363: .I uucico
364: programs begins with a handshake started by the called,
365: .I SLAVE ,
366: system.
367: The
368: .I SLAVE
369: sends a message to let the
370: .I MASTER
371: know it is ready to receive the system
372: identification and conversation sequence number.
373: The response from the
374: .I MASTER
375: is
376: verified by the
377: .I SLAVE
378: and if acceptable, protocol selection begins.
379: .SH
380: Line Protocol Selection
381: .PP
382: The remote system sends a message
383: .IP "" 12
384: P\fIproto-list\fR
385: .LP
386: where
387: .I proto-list
388: is a string of characters, each
389: representing a line protocol.
390: The calling program checks the proto-list
391: for a letter corresponding to an available line
392: protocol and returns a
393: .I use-protocol
394: message.
395: The
396: .I use-protocol
397: message is
398: .IP "" 12
399: U\fIcode\fR
400: .LP
401: where code is either a one character
402: protocol letter or an
403: .I N
404: which means there is no common protocol.
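.PP
The exchange above can be sketched as a small function.
The function name, the protocol letters in the example, and the choice
of the first offered letter that the caller supports are assumptions;
the text specifies only the P and U message forms and the N code.

```python
def answer_protocol_offer(message, available):
    """Build the caller's use-protocol reply to a 'P<proto-list>' message."""
    if not message.startswith("P"):
        raise ValueError("not a protocol offer: " + message)
    proto_list = message[1:]
    for letter in proto_list:
        if letter in available:
            return "U" + letter   # first common protocol (assumed policy)
    return "UN"                   # no common protocol


print(answer_protocol_offer("Pgx", {"g"}))
print(answer_protocol_offer("Pxt", {"g"}))
```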
405: .PP
406: Greg Chesson designed and implemented the standard
407: line protocol used by the uucp transmission program.
408: Other protocols may be added by individual installations.
409: .SH
410: Work Processing
411: .PP
412: During processing, one program is the
413: .I MASTER
414: and the other is
415: .I SLAVE .
416: Initially, the calling program is the
417: .I MASTER .
418: These roles may switch one or more times during
419: the conversation.
420: .PP
421: There are four messages used during the
422: work processing, each specified by the first
423: character of the message.
424: They are
425: .KS
426: .TS
427: center;
428: c l.
429: S	send a file,
430: R	receive a file,
431: C	copy complete,
432: H	hangup.
433: .TE
434: .KE
435: .LP
436: The
437: .I MASTER
438: will send
439: .I R
440: or
441: .I S
442: messages until all work from the spool directory is
443: complete, at which point an
444: .I H
445: message will be sent.
446: The
447: .I SLAVE
448: will reply with
449: \fISY\fR, \fISN\fR, \fIRY\fR, \fIRN\fR, \fIHY\fR, \fIHN\fR,
450: corresponding to
451: .I yes
452: or
453: .I no
454: for each request.
455: .PP
456: The send and receive replies are
457: based on permission to access the
458: requested file/directory.
459: After each file is copied into the spool directory
460: of the receiving system,
461: a copy-complete message is sent by the receiver of the file.
462: The message
463: .I CY
464: will be sent if the
465: .UX
466: .I cp
467: command, used to copy from the spool directory, is successful.
468: Otherwise, a
469: .I CN
470: message is sent.
471: The requests and results are logged on both systems,
472: and, if requested, mail is sent to the user reporting completion
473: (or the user can request status information from the log program at any time).
474: .PP
475: The hangup response is determined by the
476: .I SLAVE
477: program by a work scan of the spool directory.
478: If work for the remote system exists in the
479: .I SLAVE 's
480: spool directory, an
481: .I HN
482: message is sent and the programs switch roles.
483: If no work exists, an
484: .I HY
485: response is sent.
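.PP
The message flow described in this section can be traced with a short
simulation.
It is a sketch under simplifying assumptions: every request is granted,
every copy succeeds, a request is written as a letter followed by a
file name, and the MASTER's echo of the final HY is included while the
closing messages are left out.
The function and variable names are hypothetical.

```python
from collections import deque


def converse(caller_work, called_work):
    """Return the conversation as a list of (sender, message) pairs.

    Each work item is ("S", name) to send a file or ("R", name) to
    receive one, as seen from the system that queued it.
    """
    queues = {"caller": deque(caller_work), "called": deque(called_work)}
    master, slave = "caller", "called"   # the calling program starts as MASTER
    trace = []
    while True:
        while queues[master]:
            kind, name = queues[master].popleft()
            trace.append((master, kind + " " + name))
            trace.append((slave, kind + "Y"))     # assume permission is granted
            receiver = slave if kind == "S" else master
            trace.append((receiver, "CY"))        # assume the copy succeeds
        trace.append((master, "H"))
        if queues[slave]:
            trace.append((slave, "HN"))           # SLAVE has work: switch roles
            master, slave = slave, master
        else:
            trace.append((slave, "HY"))
            trace.append((master, "HY"))          # MASTER echoes the HY
            return trace


for sender, message in converse([("S", "a.c")], [("R", "b.c")]):
    print(sender, message)
```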
486: .PP
487: A sample conversation is shown in Figure 2.
488: .SH
489: Conversation Termination
490: .PP
491: When an
492: .I HY
493: message is received by the
494: .I MASTER
495: it is echoed back to the
496: .I SLAVE
497: and the protocols are turned off.
498: Each program sends a final ``OO'' message to the
499: other.
500: .NH
501: Present Uses
502: .PP
503: One application of this software is remote mail.
504: Normally, a
505: .UX
506: system user
507: writes ``mail dan'' to send mail to
508: user ``dan''.
509: By writing ``mail usg!dan''
510: the mail is sent to user
511: ``dan''
512: on system ``usg''.
513: .PP
514: The primary uses of our network to date have been in software maintenance.
515: Relatively few of the bytes passed between systems are intended for
516: people to read.
517: Instead, new programs (or new versions of programs)
518: are sent to users, and potential bugs are returned to authors.
519: Aaron Cohen has implemented a
520: ``stockroom'' which allows remote users to call in and request software.
521: He keeps a ``stock list'' of available programs, and new bug
522: fixes and utilities are added regularly.
523: In this way, users can always obtain the latest version of anything
524: without bothering the authors of the programs.
525: Although the stock list is maintained on a particular system,
526: the items in the stockroom may be warehoused in many places;
527: typically each program is distributed from the home site of
528: its author.
529: Where necessary, uucp does remote-to-remote copies.
530: .PP
531: We also routinely retrieve test cases from other systems
532: to determine whether errors on remote systems are caused
533: by local misconfigurations or old versions of software,
534: or whether they are bugs that must be fixed at the home site.
535: This helps identify errors rapidly.
536: For one set of test programs maintained by us,
537: over 70% of the bugs reported from remote sites
538: were due to old software, and were fixed
539: merely by distributing the current version.
540: .PP
541: Another application of the network for software maintenance
542: is to compare files on two different machines.
543: A very useful utility on one machine has been
544: Doug McIlroy's ``diff'' program
545: which compares two text files and indicates the differences,
546: line by line, between them.
547: .[
548: hunt mcilroy file
549: .]
550: Only lines which are
551: not identical are printed.
552: Similarly,
553: the program ``uudiff''
554: compares files (or directories) on two machines.
555: One of these directories may be on a passive system.
556: The
557: ``uudiff'' program
558: is set up to work similarly to the inter-system mail, but it is slightly
559: more complicated.
560: .PP
561: To avoid moving large numbers of usually identical
562: files,
563: .I uudiff
564: computes file checksums
565: on each side, and only moves files that are different
566: for detailed comparison.
567: For large files, this process can be iterated; checksums can be computed
568: for each line, and only those lines that are different
569: actually moved.
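.PP
The checksum idea can be sketched as below.
The use of CRC-32, the function names, and the position-by-position
comparison of line checksums are assumptions for illustration;
the text does not say which checksum uudiff computes or how lines
are matched.

```python
import zlib


def file_checksums(files):
    """Map each file name to a checksum of its contents (bytes)."""
    return {name: zlib.crc32(data) for name, data in files.items()}


def files_to_move(local_sums, remote_sums):
    """Names whose checksums differ or that exist on only one side."""
    names = set(local_sums) | set(remote_sums)
    return sorted(n for n in names if local_sums.get(n) != remote_sums.get(n))


def line_checksums(data):
    """One checksum per line, for the second pass over a large file."""
    return [zlib.crc32(line) for line in data.splitlines()]


def lines_to_move(local_lines, remote_lines):
    """Indices of lines whose checksums differ, compared by position."""
    count = max(len(local_lines), len(remote_lines))
    pad = [None] * count
    a = (local_lines + pad)[:count]
    b = (remote_lines + pad)[:count]
    return [i for i in range(count) if a[i] != b[i]]


local = file_checksums({"a": b"x", "b": b"y"})
remote = file_checksums({"a": b"x", "b": b"z", "c": b"w"})
print(files_to_move(local, remote))
```

.PP
Only the checksums cross the link in the first pass, so identical files
cost a few bytes each.
Comparing lines by position is the simplest choice and would report
every following line as different after an insertion.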
570: .PP
571: The ``uux'' command has
572: been useful for providing remote output.
573: There are some machines which do not have hard-copy
574: devices, but which are connected over 9600 baud
575: communication lines to machines with printers.
576: The
577: .I uux
578: command allows the formatting of the
579: printout on the local machine and printing on the
580: remote machine using standard
581: .UX
582: command programs.
583: .br
584: .NH
585: Performance
586: .PP
587: Throughput, of course, is primarily dependent on transmission speed.
588: The table below shows the real throughput of characters
589: on communication links of different speeds.
590: These numbers represent actual data transferred;
591: they do not include bytes used by the line protocol for
592: data validation such as checksums and messages.
593: At the higher speeds, contention for the processors on both
594: ends prevents the network from driving the line full speed.
595: The range of speeds represents the difference between light and
596: heavy loads on the two systems.
597: If desired, operating system modifications can
598: be installed
599: that permit full use of even very fast links.
600: .KS
601: .TS
602: center;
603: c c
604: n n.
605: Nominal speed	Characters/sec.
606: 300 baud	27
607: 1200 baud	100-110
608: 9600 baud	200-850
609: .TE
610: .KE
611: In addition to the transfer time, there is some overhead
612: for making the connection and logging in ranging from
613: 15 seconds to 1 minute.
614: Even at 300 baud, however, a typical 5,000 byte source program
615: can be transferred in
616: four minutes instead of the 2 days that might be required
617: to mail a tape.
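.PP
The four-minute estimate follows from the table and the connection
overhead, as this small calculation shows.
The function name is hypothetical; the figures are the ones quoted above.

```python
def transfer_minutes(size_bytes, chars_per_sec, overhead_sec):
    """Total time in minutes: connection overhead plus transfer time."""
    return (overhead_sec + size_bytes / chars_per_sec) / 60


# 5,000 bytes at the measured 300 baud rate of 27 characters per second
best = transfer_minutes(5000, 27, 15)    # 15 seconds of overhead
worst = transfer_minutes(5000, 27, 60)   # 1 minute of overhead
print(round(best, 1), round(worst, 1))
```

.PP
The transfer alone takes about 185 seconds, so the total falls between
roughly 3.3 and 4.1 minutes depending on the connection overhead.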
618: .PP
619: Traffic between systems is variable. Between two
620: closely related systems,
621: we observed
622: 20 files moved and 5 remote commands executed in a typical day.
623: More typical traffic out of a single system would be around
624: a dozen files per day.
625: .PP
626: The total number of sites at present
627: in the main network is
628: 82, which includes most of the Bell Laboratories
629: full-size machines
630: which run the
631: .UX
632: operating system.
633: Geographically, the machines range from Andover, Massachusetts to
634: Denver, Colorado.
635: .PP
636: Uucp has also
637: been used to set up another network
638: which connects a group of
639: systems in operational sites with the home site.
640: The two networks touch at one
641: Bell Labs computer.
642: .NH
643: Further Goals
644: .PP
645: Eventually, we would like to develop a full system of remote software
646: maintenance.
647: Conventional maintenance (a support group which mails tapes)
648: has many well-known disadvantages.
649: .[
650: brooks mythical man month 1975
651: .]
652: There are distribution errors and delays, resulting in old software
653: running at remote sites and old bugs continually reappearing.
654: These difficulties are aggravated when
655: there are 100 different small systems, instead of a few large ones.
656: .PP
657: The availability of file transfer on a network of compatible operating
658: systems
659: makes it possible just to send programs directly to the end user who wants them.
660: This avoids the bottleneck of negotiation and packaging in the central support
661: group.
662: The ``stockroom'' serves this function for new utilities
663: and fixes to old utilities.
664: However, it is still likely that distributions will not be sent
665: and installed as often as needed.
666: Users are justifiably suspicious of the ``latest version'' that has just
667: arrived; all too often it features the ``latest bug.''
668: What is needed is to address both problems simultaneously:
669: .IP 1.
670: Send distributions whenever programs change.
671: .IP 2.
672: Have sufficient quality control so that users will install them.
673: .LP
674: To do this, we recommend systematic regression testing both on the
675: distributing and receiving systems.
676: Acceptance testing on the receiving systems can be automated and
677: permits the local system to ensure that its essential work can continue
678: despite the constant installation of changes sent from elsewhere.
679: The work of writing the test sequences should be recovered in lower
680: counseling and distribution costs.
681: .PP
682: Some slow-speed network services are also being implemented.
683: We now have inter-system ``mail'' and ``diff,''
684: plus the many implied commands represented by ``uux.''
685: However, we still need inter-system ``write'' (real-time inter-user
686: communication) and ``who'' (list of people logged in
687: on different systems).
688: A slow-speed network of this sort may be very useful
689: for speeding up counseling and education, even
690: if not fast enough for the distributed data base
691: applications that attract many users to networks.
692: Effective use of remote execution over slow-speed lines, however,
693: must await the general installation of multiplexable channels so
694: that long file transfers do not lock out short inquiries.
695: .NH
696: Lessons
697: .PP
698: The following is a summary of the lessons we learned in
699: building these programs.
700: .IP 1.
701: By starting your network in a way that requires no hardware or major operating system
702: changes, you can get going quickly.
703: .IP 2.
704: Support will follow use.
705: Since the network existed and was being used, system maintainers
706: were easily persuaded to help keep it operating, including purchasing
707: additional hardware to speed traffic.
708: .IP 3.
709: Make the network commands look like local commands.
710: Our users have a resistance to learning anything new:
711: all the inter-system commands look very similar to
712: standard
713: .UX
714: system
715: commands so that little training cost
716: is involved.
717: .IP 4.
718: An initial error was not coordinating enough
719: with existing communications projects: thus, the first
720: version of this network was restricted to dial-up, since
721: it did not support the various hardware links between systems.
722: This has been fixed in the current system.
723: .SH
724: Acknowledgements
725: .PP
726: We thank G. L. Chesson for his design and implementation
727: of the packet driver and protocol, and A. S. Cohen, J. Lions,
728: and P. F. Long for their suggestions and assistance.
729: .[
730: $LIST$
731: .]