1: .SH
2: Stub Generators and the X Protocol
3: .PP
4: The X protocol is not a remote procedure call protocol as
5: defined in the literature [4,5],
6: as client calls are not given the same guarantee of completion and
7: error handling that an RPC protocol provides.
8: The X protocol transports fairly large amounts of data and
9: executes many more requests than typically seen in true RPC systems.
10: Given this generation of display hardware and processors,
11: X may handle more than 1000 requests per second from client applications to
12: a fast display.
13: .PP
14: X clients only block when they need information from the server.
15: Performance would be unacceptable if X were a synchronous RPC protocol,
16: both because of round trip times and because of system call overhead.
17: This is the most significant difference between X and its predecessor
18: W, written by Paul Asente of Stanford University.
19: On the other hand,
20: a procedural interface to the window system is essential for easy use.
21: We spent much time crafting the procedure stubs for the several
22: library interfaces built during X development.
23: .PP
24: The original implementation of the client library would always
25: write each request at the time the request was made.
26: This implies a write system call per X request.
27: From the start, the stream connection to the server
28: provided implicit buffering.
29: Over a year ago, we received new firmware for the VS100, and
30: were no longer able to keep up with the display.
31: We changed the client library to buffer the requests in a manner
32: similar to the standard I/O library; this improved performance dramatically,
33: as the client library performs many fewer write system calls.
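.PP
The buffering change described above can be sketched roughly as follows.
This is a minimal illustration, not the actual client library code:
the names req_buf, req_put and req_flush and the 2048 byte buffer size are
assumptions made for the example.
Requests are copied into a buffer, and a write system call is issued only
when the buffer would overflow or the caller flushes explicitly.

```c
#include <stddef.h>
#include <string.h>
#include <unistd.h>

#define REQ_BUF_SIZE 2048           /* assumed size, not taken from the paper */

struct req_buf {
    int fd;                         /* connection to the server */
    size_t used;                    /* bytes currently queued */
    unsigned long writes;           /* write system calls issued so far */
    unsigned char data[REQ_BUF_SIZE];
};

/* Push all queued bytes to the server; returns 0 on success, -1 on error. */
static int req_flush(struct req_buf *b)
{
    size_t off = 0;

    while (off < b->used) {
        ssize_t n = write(b->fd, b->data + off, b->used - off);
        if (n <= 0)
            return -1;
        b->writes++;
        off += (size_t)n;
    }
    b->used = 0;
    return 0;
}

/* Queue one request; a write happens only when the buffer would overflow. */
static int req_put(struct req_buf *b, const void *req, size_t len)
{
    if (len > REQ_BUF_SIZE)
        return -1;
    if (b->used + len > REQ_BUF_SIZE && req_flush(b) < 0)
        return -1;
    memcpy(b->data + b->used, req, len);
    b->used += len;
    return 0;
}
```

.PP
In a design like this, the library would also call req_flush before blocking
for a reply from the server, so that queued requests are never held back
indefinitely.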
34: .PP
35: Many current RPC [6] argument marshaling
36: mechanisms perform at least one procedure
37: call per procedure argument to marshal that argument.
38: This is almost certainly too expensive to use for this application.
39: Even if marshaling the argument took no time in the procedure,
40: the call overhead would account for ~10% of the CPU.
41: Stub generators need to be able to emit direct assignment code for
42: simple argument types.
43: Complex argument types can probably afford a procedure call,
44: but these are not common in the current X design.
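.PP
To make the contrast concrete, the sketch below builds the same request two
ways: once with one procedure call per argument, and once by direct
assignment into a fixed-layout structure.
The request layout, the field names and the opcode value are hypothetical
and are not taken from the X protocol; the point is only that both styles
produce the same bytes, while the second needs no per-argument calls.

```c
#include <stdint.h>
#include <string.h>

#define OP_DRAW_LINE 7              /* assumed opcode value */

/* Hypothetical fixed-layout request; every field is naturally aligned. */
struct draw_line_req {
    uint8_t  opcode;
    uint8_t  func;
    uint16_t mask;
    uint32_t window;
    int16_t  x1, y1, x2, y2;
};

/* Style 1: one procedure call per argument. */
static unsigned char *put_u8(unsigned char *p, uint8_t v)
{ *p = v; return p + 1; }
static unsigned char *put_u16(unsigned char *p, uint16_t v)
{ memcpy(p, &v, sizeof v); return p + sizeof v; }
static unsigned char *put_u32(unsigned char *p, uint32_t v)
{ memcpy(p, &v, sizeof v); return p + sizeof v; }
static unsigned char *put_i16(unsigned char *p, int16_t v)
{ memcpy(p, &v, sizeof v); return p + sizeof v; }

static void stub_calls(unsigned char *buf, uint32_t window, uint8_t func,
                       uint16_t mask, int16_t x1, int16_t y1,
                       int16_t x2, int16_t y2)
{
    unsigned char *p = buf;

    p = put_u8(p, OP_DRAW_LINE);
    p = put_u8(p, func);
    p = put_u16(p, mask);
    p = put_u32(p, window);
    p = put_i16(p, x1);
    p = put_i16(p, y1);
    p = put_i16(p, x2);
    (void)put_i16(p, y2);
}

/* Style 2: direct assignment, the code a stub generator could emit. */
static void stub_direct(struct draw_line_req *r, uint32_t window,
                        uint8_t func, uint16_t mask, int16_t x1, int16_t y1,
                        int16_t x2, int16_t y2)
{
    r->opcode = OP_DRAW_LINE;
    r->func = func;
    r->mask = mask;
    r->window = window;
    r->x1 = x1;
    r->y1 = y1;
    r->x2 = x2;
    r->y2 = y2;
}
```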
45: .PP
46: Proper stub generation tools would have saved several months over the
47: course of the project,
48: had they been available at the proper time.
49: Arguments could be made that the hand-crafted stubs in the X client library
50: are more efficient than machine generated stubs would have been.
51: On the other hand, to keep the protocol simple, X often
52: sends requests with unused data, for which it pays with higher communications
53: cost.
54: It would be instructive to reimplement X using such a stub generator and
55: compare its performance with that of the current mechanism.
56: .PP
57: Machine dependencies in such transport mechanisms need further work.
58: The protocol design deserves careful study.
59: Issues such as byte swapping cannot be ignored.
60: With strictly blocking RPC, the overhead per request is already so
61: high that network byte order is probably not too expensive,
62: given the current implementation of RPC systems on
63: .UX .
64: With the higher performance of the X protocol,
65: this issue becomes significant.
66: It is desirable that two machines of the same architecture
67: pay no penalty in performance in the transport protocols.
68: Our solution was to define two ports that the X server listens on,
69: one for VAX byte order connections, and one for 68000 byte order connections.
70: At a late stage of X development,
71: after X client code had already been ported to a Sun workstation
72: and would interoperate with a VAX display,
73: yet another machine architecture showed that the protocol was
74: not as conservatively designed as we would have liked.
75: Care should be taken in protocol design that all data
76: be aligned naturally (words on word boundaries, longwords on
77: longword boundaries, and so on) to ensure portability of the code
78: implementing the protocol.
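.PP
The two-port scheme can be illustrated with the sketch below.
The port numbers and all function names are assumptions chosen for the
example, not the values used by X.
A client picks the port that matches its own byte order, and the server
swaps fields only for connections whose order differs from its own, so two
machines of the same architecture never pay for a swap.

```c
#include <stdint.h>

/* VAX data is little-endian; 68000 data is big-endian. */
enum byte_order { ORDER_LITTLE, ORDER_BIG };

#define PORT_LITTLE 6100            /* hypothetical port numbers */
#define PORT_BIG    6200

/* Detect the byte order of the machine this code runs on. */
static enum byte_order host_order(void)
{
    const uint16_t probe = 1;

    return *(const unsigned char *)&probe == 1 ? ORDER_LITTLE : ORDER_BIG;
}

/* Client side: connect to the port matching the client's byte order. */
static int port_for(enum byte_order o)
{
    return o == ORDER_LITTLE ? PORT_LITTLE : PORT_BIG;
}

static uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

static uint32_t swap32(uint32_t v)
{
    return (v << 24) | ((v & 0x0000ff00u) << 8) |
           ((v >> 8) & 0x0000ff00u) | (v >> 24);
}

/* Server side: convert a field only when the client's order differs. */
static uint32_t from_client32(uint32_t v, enum byte_order client)
{
    return client == host_order() ? v : swap32(v);
}
```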
79: .PP
80: X would not be feasible if round-trip process-to-process times over TCP
81: were too long.
82: On a MicroVAX\(dg II running Ultrix\(dd,
83: or on a VAX 11/780 running 4.2BSD, these times
84: have been measured between 20 and 25 milliseconds using TCP.
85: .FS
86: \(dg VAX is a trademark of Digital Equipment Corporation.
87: .sp
88: \(dd Ultrix is a trademark of Digital Equipment Corporation.
89: .FE
90: As this time degrades, interactive "feel" becomes worse,
91: as we have chosen to put as much as possible in client code.
92: Birrell and Nelson report
93: much lower times using carefully crafted and
94: tuned RPC protocols on faster hardware; even extrapolating
95: for differences in hardware,
96: .UX
97: may be several times slower than it could be.
98: Given a much faster kernel message interface, one should be able to
99: improve on the current times substantially.
100: The X protocol requires reliable, in-order delivery of messages.
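.PP
The measured round-trip times put a simple ceiling on any strictly
synchronous protocol.
The arithmetic is sketched below; the function names are illustrative.
At 20 to 25 milliseconds per round trip, a client that waits for each
reply can complete at most 40 to 50 requests per second, so sustaining
1000 requests per second means roughly 20 to 25 requests must be in
flight per round trip.

```c
/* Upper bound on requests per second when every request waits
   one full round trip before the next is sent. */
static double max_sync_rate(double round_trip_ms)
{
    return 1000.0 / round_trip_ms;
}

/* Number of requests that must share one round trip
   to sustain a target request rate. */
static double requests_per_round_trip(double target_rate, double round_trip_ms)
{
    return target_rate * round_trip_ms / 1000.0;
}
```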
101: .PP
102: The arguments against using such specific message mechanisms are:
103: 1) The buffering provided by the stream layer is used to good advantage
104: at the server and client ends of the transmissions.
105: This reduces the number of system calls required to get the data
106: from the kernel at either end, particularly when loaded.
107: 2) Less interoperability.
108: X has been run over both
109: TCP and DECNET, and it would be simple to build a forwarder between
110: the domains if needed.
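.PP
The benefit of stream buffering at the receiving end can be sketched as
follows.
This is an illustration under stated assumptions, not the X server's code:
requests are taken to be a fixed 16 bytes, and the names in_buf, drain_once
and count_request are invented for the example.
A single read system call pulls in whatever the kernel has queued, every
complete request is dispatched, and a trailing partial request is kept for
the next call.

```c
#include <stddef.h>
#include <string.h>
#include <unistd.h>

#define REQ_SIZE    16              /* assumed fixed request size */
#define IN_BUF_SIZE 4096            /* assumed input buffer size */

struct in_buf {
    int fd;                         /* connection to a client */
    size_t have;                    /* bytes of a partial request carried over */
    unsigned char data[IN_BUF_SIZE];
};

typedef void (*req_handler)(const unsigned char *req, void *ctx);

/* Example handler: count requests in the int that ctx points to. */
static void count_request(const unsigned char *req, void *ctx)
{
    (void)req;
    ++*(int *)ctx;
}

/* Issue one read system call, then dispatch every complete request it
   returned.  Returns the number of requests handled, or -1 on end of
   file or error. */
static int drain_once(struct in_buf *b, req_handler fn, void *ctx)
{
    size_t off = 0;
    int handled = 0;
    ssize_t n = read(b->fd, b->data + b->have, IN_BUF_SIZE - b->have);

    if (n <= 0)
        return -1;
    b->have += (size_t)n;
    while (b->have - off >= REQ_SIZE) {
        fn(b->data + off, ctx);
        off += REQ_SIZE;
        handled++;
    }
    /* Keep any trailing partial request at the front of the buffer. */
    memmove(b->data, b->data + off, b->have - off);
    b->have -= off;
    return handled;
}
```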
111: .PP
112: The round-trip times have been improved somewhat by optimizing
113: the local TCP connection, and could be further improved
114: by using
115: .UX
116: domain connections in the local case.
117: .PP
118: In general,
119: .UX
120: needs a much cheaper message passing transport mechanism
121: than can currently
122: be built on top of existing 4.2BSD facilities.
123: Stub generators need serious work both for RPC systems
124: and other message systems,
125: particularly in light of some of the issues discussed above.
126: We would make a plea that there be further serious study of
127: non-blocking protocols [7].
128: There should be some way to read multiple packets from the kernel
129: in a single system call for efficient implementation of
130: RPC and other protocols.