1: @device(postscript)
2: @make(article)
3: @style(references=cacm)
4: @set(page=+1)
5:
6: @majorheading(The X Window System)
7: @center(Robert W. Scheifler@footnote( 545 Technology Square, Cambridge, MA 02139.)
8: MIT Laboratory for Computer Science
9:
10: Jim Gettys@footnote( Project Athena, MIT, Cambridge, MA 02139.)
11: Digital Equipment Corporation
12: MIT Project Athena
13:
14: July 1986
15: Revised October 1986@footnote( To appear in Transactions on Graphics #63,
16: Special Issue on User Interface Software, Copyright 1986,
17: Association for Computing Machinery. Permission to copy without fee all or
18: part of this material is granted provided that the copies are not made or
19: distributed for direct commercial advantage, the ACM copyright notice and the
20: title of the publication and its date appear,
21: and notice is given that copying is by permission of the Association for
22: Computing Machinery.
23: To copy otherwise, or to republish, requires a fee and/or specific permission.))
24:
25: @blankspace(2 lines)
26:
27: @begin(abstract)
28:
29: An overview of the X Window System is presented, focusing on the system
30: substrate and the low-level facilities provided to build applications and to
31: manage the desktop. The system provides high-performance, high-level,
32: device-independent graphics. A hierarchy of resizable, overlapping windows
33: allows a wide variety of application and user interfaces to be built easily.
34: Network-transparent access to the display provides an important degree of
35: functional separation, without significantly affecting performance, that is
36: crucial to building applications for a distributed environment. To a
37: reasonable extent, desktop management can be custom tailored to individual
38: environments, without modifying the base system and typically without affecting
39: applications.
40:
41: Categories and Subject Descriptors: C.2.2 [@b(Computer-Communication Networks)]:
42: Network Protocols - @i(protocol architecture); C.2.4 [@b(Computer-Communication
43: Networks)]: Distributed Systems - @i(distributed applications); D.4.4 [@b(Operating
44: Systems)]: Communication Management - @i(network communication, terminal management);
45: H.1.2 [@b(Information Systems)]: User/Machine Systems - @i(human factors); I.3.2
46: [@b(Computer Graphics)]: Graphic Systems - @i(distributed/network graphics);
47: I.3.4 [@b(Computer Graphics)]: Graphics Utilities - @i(graphics packages, software
48: support); I.3.6 [@b(Computer Graphics)]: Methodology and Techniques - @i(device
49: independence, interaction techniques)
50:
51: General terms: Design, Experimentation, Human Factors, Standardization
52:
53: Additional Key Words and Phrases: window systems, window managers, virtual terminals
54:
55: @end(abstract)
56:
57: @section(Introduction)
58:
59: The X Window System (or simply X) developed at MIT has achieved fairly
60: widespread popularity recently, particularly in the Unix@footnote( Unix is a
61: trademark of AT&T Bell Laboratories.) community. In this paper, we present an
62: overview of X, focusing on the system substrate and the low-level facilities
63: provided to build applications and to manage the desktop. In X, this base
64: window system provides high-performance graphics to a hierarchy of resizable
65: windows. Rather than mandating a particular user interface, X provides
66: primitives to support several policies and styles. Unlike most window systems,
67: the base system in X is defined by a @i(network protocol): asynchronous
68: stream-based inter-process communication replaces the traditional procedure
69: call or kernel call interface. An application can utilize windows on any
70: display in a network in a device-independent, network-transparent fashion.
71: Interposing a network connection greatly enhances the utility of the window
72: system, without significantly affecting performance. The performance of
73: existing X implementations is comparable to contemporary window systems, and in
74: general is limited by display hardware rather than network communication. For
75: example, 19500 characters per second and 3500 short vectors per second are
76: possible on Digital Equipment Corporation's VAXStation-II/GPX, both locally and
77: over a local area network, and these figures are very close to the limits of
78: the display hardware.
79:
80: X is the result of the simultaneous need for a window system from two separate
81: groups at MIT. In the summer of 1984, the Argus system@cite(argus) at the
82: Laboratory for Computer Science needed a debugging environment for
83: multi-process distributed applications, and a window system seemed the only
84: viable solution. Project Athena@cite(athena) was faced with dozens, and
85: eventually thousands, of workstations with bitmap displays, and needed a window
86: system to make the displays useful. Both groups were starting with the Digital
87: VS100 display@cite(vs100) and VAX hardware, but it was clear at the outset that
88: other architectures and displays had to be supported. In particular, equal
89: numbers of IBM workstations with bitmap displays of unknown type were expected
90: eventually within Project Athena. Portability was therefore a goal from the
91: start. Although all of the initial implementation work was for Berkeley Unix,
92: it was clear that the network protocol should not depend on aspects of the
93: operating system.
94:
95: The name X derives from the lineage of the system. At Stanford University,
96: Paul Asente and Brian Reid had begun work on the W window system@cite(w), as an
97: alternative to VGTS@cite(vgts1,vgts2) for the V system@cite(v). Both VGTS and
98: W allow network-transparent access to the display, using the synchronous V
99: communication mechanism. Both systems provide "text" windows for ASCII
100: terminal emulation. VGTS provides graphics windows driven by fairly high-level
101: object definitions from a structured display file; W provides graphics windows
102: based on a simple display-list mechanism, with limited functionality. We
103: acquired a Unix-based version of W for the VS100 (with synchronous
104: communication over TCP@cite(tcp)) done by Asente and Chris Kent at Digital's
105: Western Research Laboratory. From just a few days of experimentation, it was
106: clear that a network-transparent hierarchical window system was desirable, but
107: that restricting the system to any fixed set of application-specific modes was
108: completely inadequate. It was also clear that, although synchronous
109: communication was perhaps acceptable in the V system (due to very fast
110: networking primitives), it was completely inadequate in most other operating
111: environments. X is our "reaction" to W. The X window hierarchy comes directly
112: from W, although numerous systems have been built with hierarchy in at least
113: some form@cite(lucasfilm,star1,lispm,sunwin,mg1,genera,cedar,metheus,tajo).
114: The asynchronous communication protocol used in X is a significant improvement
115: over the synchronous protocol used in W, but is very similar to that used in
116: Andrew@cite(wm,andrew). X differs from all of these systems in the degree to
117: which both graphics functions and "system" functions are pushed back (across
118: the network) as application functions, and in the ability to transparently
119: tailor desktop management.
120:
121: The next section presents several high-level requirements that we believe a
122: window system must satisfy to be a viable standard in a network environment,
123: and indicates where the design of X fails to meet some of these requirements.
124: In Section 3 we describe the overall X system model, and the effect of
125: network-based communication on that model. Section 4 describes the structure
126: of windows, and the primitives for manipulating that structure. Section 5
127: explains the color model used in X, and Section 6 presents the text and
128: graphics facilities. Section 7 discusses the issues of window exposure and
129: refresh, and their resolution in X. Section 8 deals with input event handling.
130: In Section 9, we describe the mechanisms for desktop management.
131:
132: This paper describes the version@footnote( Version 10.) of X that is currently
133: in widespread use. The design of this version is inadequate in several
134: respects. With our experience to date, and encouraged by the number of
135: universities and manufacturers taking a serious interest in X, we have designed
136: a new version that should satisfy a significantly wider community. Section 10
137: discusses a number of problems with the current X design, and gives a general
138: idea of what changes are contemplated.
139:
140: @section(Requirements)
141:
142: A window system contains many interfaces. A @i(programming) interface is a
143: library of routines and types provided in a programming language for
144: interacting with the window system. Both low-level (e.g., line drawing) and
145: high-level (e.g., menus) interfaces are typically provided. An @i(application)
146: interface is the mechanical interaction with the user and the visual appearance
147: that is specific to the application. A @i(management) interface is the
148: mechanical interaction with the user dealing with overall control of the
149: desktop and the input devices. The management interface defines how
150: applications are arranged and rearranged on the screen, and how the user
151: switches between applications; an individual application interface defines how
152: information is presented and manipulated within that application. The @i(user)
153: interface is the sum total of all application and management interfaces.
154:
155: Besides applications, we distinguish three major components of a window system.
156: The @i(window manager)@footnote( Some people use this term for what we call the
157: base window system; that is not the meaning here.) implements the desktop
158: portion of the management interface; it controls the size and placement of
159: application windows, and also may control application window attributes such as
160: titles and borders. The @i(input manager) implements the remainder of the
161: management interface; it controls which applications see input from which
162: devices (e.g., keyboard and mouse). The @i(base window system) is the
163: substrate on which applications, window managers, and input managers are built.
164:
165: In this paper we are concerned with the base window system of X, with the
166: facilities it provides to build applications and managers. The following
167: requirements on the base window system crystallized during the design of X (a
168: few were not formulated until late in the design process):
169:
170: @begin(enumerate)
171:
172: @begin(multiple)
173:
174: The system should be implementable on a variety of displays.
175:
176: The system should work with nearly any bitmap display, and a variety of input
177: devices. Our design focused on workstation-class display technology likely to
178: be available in a university environment over the next few years. At one end
179: of the spectrum is a simple frame buffer and monochrome monitor, driven
180: directly by the host CPU with no additional hardware support. At the other end
181: of the spectrum is a multi-plane display with color monitor, driven by a
182: high-performance graphics co-processor. Input devices such as keyboards, mice,
183: tablets, joysticks, light pens, and touch screens should be supported.
184:
185: @end(multiple)
186: @begin(multiple)
187:
188: Applications must be device independent.
189:
190: There are several aspects to device independence. Most importantly, it must
191: not be necessary to rewrite, recompile, or even relink an application for each
192: new hardware display. Nearly as important, every graphics function defined by
193: the system should work on virtually every supported display; the alternative,
194: which is to use GKS-style inquire operations@cite(gks) to determine the set of
195: implemented functions at run-time, leads to tedious case analysis in every
196: application, and to inconsistent user interfaces. A third aspect of device
197: independence is that, as far as possible, applications should not need dual
198: control paths to work on both monochrome and color displays.
199:
200: @end(multiple)
201: @begin(multiple)
202:
203: The system must be network transparent: an application running on one
204: machine must be able to utilize a display on some other machine. The two
205: machines should not have to have the same architecture or operating system.
206:
207: There are numerous examples of why this is important: a compute-intensive VLSI
208: design program executing on a mainframe, but displaying results on a
209: workstation; an application distributed over several stand-alone processors,
210: but interacting with a user at a workstation; a professor running a program on
211: one workstation, presenting results simultaneously on all student workstations.
212:
213: In a network environment, there are certain to be applications that must run on
214: particular machines or architectures. Examples include proprietary software,
215: applications depending on specific architectural properties, and programs
216: manipulating large databases. Such applications still should be accessible to
217: all users. In a truly heterogeneous environment, not all programming languages
218: and programming systems are supported on all machines, and it is very
219: undesirable to have to write an interactive front end in multiple languages in
220: order to make the application generally available. With network-transparent
221: access, this is not necessary; a single front end written in the same language
222: as the application suffices.
223:
224: One might think that remote display will be extremely infrequent, and that
225: performance therefore is much less important than for local display.
226: Experience at MIT, however, indicates that many users routinely make use of the
227: remote display capabilities in X, and that the performance of remote display is
228: quite important. The desktop display, although physically connected to a
229: single computer, is used as a true @i(network virtual terminal); indeed, the
230: idea of an X server (see the next section) built into a Blit-like
231: terminal@cite(blit) is an intriguing one.
232:
233: @end(multiple)
234: @begin(multiple)
235:
236: The system must support multiple applications displaying concurrently.
237:
238: For example, it should be possible to display a clock with a sweep second hand
239: in one window, while simultaneously editing a file in another window.
240:
241: @end(multiple)
242: @begin(multiple)
243:
244: The system should be capable of supporting many different application and
245: management interfaces.
246:
247: No single user interface is "best"; different communities have radically
248: different ideas about user interfaces. Even within a single community,
249: "experts" and "novices" place different demands on an interface. Rather than
250: mandating a particular user interface, the base window system should support a
251: wide range of interfaces.
252:
253: To achieve this, the system must provide @i(hooks) (mechanism) rather than
254: @i(religion) (policy). For example, since menu styles and semantics vary
255: dramatically among different user interfaces, the base window system must
256: provide primitives from which menus can be built, rather than just providing a
257: fixed menu facility.
258:
259: The system should be designed in such a way that it is possible to implement
260: management policy both external to the base window system and external to
261: applications. Applications should be largely independent of management policy
262: and mechanism; applications should @i(react to) management decisions, rather
263: than @i(directing) those decisions. For example, an application needs to be
264: informed when one of its windows is resized, and should react by reformatting
265: the information displayed, but involvement of the application should not be
266: required in order for the user to change the size. Making applications
267: management-independent, as well as device-independent, facilitates the sharing
268: of applications between diverse cultures.
269:
270: @end(multiple)
271: @begin(multiple)
272:
273: The system must support overlapping windows, including output to partially
274: obscured windows.
275:
276: This is in some sense a by-product of the previous requirement, but is
277: important enough to merit explicit statement. Not all user interfaces allow
278: windows to overlap arbitrarily. However, even interfaces that do not allow
279: application windows to overlap typically provide some form of pop-up menu that
280: overlaps application windows. If such menus are built from windows, then
281: support for overlapping windows must exist.
282:
283: @end(multiple)
284: @begin(multiple)
285:
286: The system should support a hierarchy of resizable windows, and an application
287: should be able to use many windows at once.
288:
289: Subwindows provide a clean, powerful mechanism for exporting much of the basic
290: system machinery back to the application for direct use. Many applications
291: make use of their own window-like abstractions; some even implement what is
292: essentially another window system, nested within the "real" window system. It
293: is important to support arbitrary levels of nesting. What is viewed as a
294: single window at one abstraction level may well require multiple subwindows at
295: a lower level. By providing a true window hierarchy, application windows can
296: be implemented as true windows within the system, freeing the application from
297: duplicating machinery such as clipping and input control.
298:
299: @end(multiple)
300: @begin(multiple)
301:
302: The system should provide high-performance, high-quality support for text,
303: 2-D synthetic graphics, and imaging.
304:
305: The base window system must provide "immediate" or "transparent" graphics: the
306: application describes the image precisely, and the system does not attempt to
307: second-guess the application. The use of high-level models, whereby the
308: application describes @i(what) it wants in terms of fairly abstract objects and
309: the system determines @i(how) best to render the image, cannot be imposed as
310: the only form of graphics interface. Such models generally fail to provide
311: adequate support for some important class of applications, and different user
312: communities tend to have strong opinions about which model is "best".
313: High-level models are extremely important to provide, but they should be built
314: in layers on top of the base window system.
315:
316: Support for 3-D graphics is not listed as a requirement, but this is not to say
317: it is unimportant. We simply have not considered 3-D graphics, due to lack of
318: expertise and lack of time.
319:
320: @end(multiple)
321: @begin(multiple)
322: The system should be extensible.
323:
324: For example, the core system may not support 3-D graphics, but it should be
325: possible to extend the system with such support. The extension mechanism
326: should allow communities to extend the system non-cooperatively, yet allow such
327: independent extensions to be merged gracefully.
328:
329: @end(multiple)
330: @end(enumerate)
331:
332: We believe that a window system must satisfy these requirements to be a viable
333: standard in an environment of high-performance workstations and mainframes
334: connected via high-performance local area networks. X satisfies most of these
335: requirements, but currently fails to satisfy a few due to practical
336: considerations of staffing and time constraints: the design and much of the
337: implementation of the base window system was to be handled solely by the first
338: author; it was important to get a working system up fairly quickly; and the
339: immediate applications only required relatively simple text and graphics
340: support. As a result, X is not designed to handle high-end color displays or
341: to deal with input devices other than a keyboard and mouse; some support for
342: high-quality text and graphics is missing; X only provides support for one
343: class of management policy; and no provision has been made for extensions. As
344: discussed in Section 10, these and other problems are being addressed in a
345: redesign of X.
346:
347: @begin(fullpagefigure)
348: @blankspace(7 inches)
349: @caption(System Structure)
350: @end(fullpagefigure)
351:
352: @section(System Model)
353:
354: The X Window System is based on a client-server model; this model follows
355: naturally from requirements two and three in the previous section. For each
356: physical display, there is a controlling server. A client application and a
357: server communicate over a reliable duplex (8-bit) byte stream. A simple block
358: stream protocol is layered on top of the byte stream. If the client and server
359: are on the same machine, the stream is typically based on a local inter-process
360: communication (IPC) mechanism, and otherwise a network connection is
361: established between the pair. Requiring nothing more than a reliable duplex
362: byte stream (without urgent data) for communication makes X usable in many
363: environments. For example, the X protocol can be used over TCP@cite(tcp),
364: DECnet@cite(decnet), and Chaos@cite(chaos).
365:
366: Multiple clients can have connections open to a server simultaneously, and a
367: client can have connections open to multiple servers simultaneously. The
368: essential tasks of the server are to multiplex requests from clients to the
369: display, and demultiplex keyboard and mouse input back to the appropriate
370: clients. Typically, the server is implemented as a single sequential process,
371: using round-robin scheduling among the clients, and this centralized control
372: trivially solves many synchronization problems; however, a multi-process server
373: has also been implemented. Although one might place the server in the kernel
374: of the operating system in an attempt to increase performance, a user-level
375: server process is vastly easier to debug and maintain, and performance under
376: Unix in fact does not seem to suffer. Similar performance results have been
377: obtained in Andrew@cite(wm). Various tricks are used in both clients and
378: server to optimize performance, principally by minimizing the number of
379: operating system calls@cite(hacks).
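
As a rough illustration of the single sequential server process described above, the following Python sketch services at most one pending request per client on each pass, in round-robin order. The class and method names (RoundRobinServer, connect, submit, run_once, run) are hypothetical and are not taken from any X implementation; a real server would also read requests from its connections and batch work to minimize operating system calls.

```python
from collections import deque


class RoundRobinServer:
    """Toy model of a single-process server multiplexing client requests."""

    def __init__(self):
        # One FIFO queue of pending requests per connected client.
        self.queues = {}
        # Requests in the order they were applied to the display.
        self.display_log = []

    def connect(self, client):
        self.queues[client] = deque()

    def submit(self, client, request):
        self.queues[client].append(request)

    def run_once(self):
        """Service at most one request from each client, in connection order."""
        for client, queue in self.queues.items():
            if queue:
                self.display_log.append((client, queue.popleft()))

    def run(self):
        """Keep cycling over the clients until every queue is empty."""
        while any(self.queues.values()):
            self.run_once()


server = RoundRobinServer()
server.connect("clock")
server.connect("editor")
for req in ("draw hand", "erase hand"):
    server.submit("clock", req)
for req in ("draw text A", "draw text B", "draw text C"):
    server.submit("editor", req)
server.run()
```

Because only one process touches the display, the interleaving recorded in display_log is a total order, which is the sense in which centralized control sidesteps synchronization problems.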
380:
381: The server encapsulates the base window system. It provides the fundamental
382: resources and mechanisms, and the hooks required to implement various user
383: interfaces. All device dependencies are encapsulated by the server; the
384: communication protocol between clients and the server is device independent.
385: By placing all device dependencies on one end of a network connection,
386: applications are truly device independent. The addition of a new display type
387: simply requires the addition of a new server implementation; no application
388: changes are required. Of course, the server itself is designed as device
389: independent code layered on top of a device dependent core, so only the "back
390: end" of the server need be reimplemented for each new display.@footnote( A back
391: end has been implemented using a programming interface to X itself, such that a
392: complete "recursive" X server executes inside a window of another X server.)
393:
394: @subsection(Network Considerations)
395:
396: It is extremely important for the server to be robust with respect to client
397: failures. The server, and the network protocol, must be designed so that the
398: server never trusts clients to provide correct data. As a corollary, the
399: protocol must be designed in such a way that, if the server ever has to wait
400: for a response from a client, it must be possible to continue servicing other
401: clients. Without this property, a buggy client or a network failure could
402: easily cause the entire display to freeze up.
403:
404: Byte ordering is a standard problem in network communication: when a 16-bit or
405: 32-bit quantity is transmitted over an 8-bit byte stream, is the most
406: significant byte transmitted first (big-endian byte order) or is the least
407: significant byte transmitted first (little-endian byte order)? Some machines
408: with byte-addressable memory use big-endian order internally, and others use
409: little-endian order. If a single order is chosen for network communication,
410: some machines will suffer the overhead of swapping bytes, even when
411: communicating with a machine using the same internal byte order. Such an
412: approach also means that both parties in the communication must worry about
413: byte order.
414:
415: The X protocol uses a different approach. The server is designed to accept
416: both big-endian and little-endian connections. For example, using TCP this is
417: accomplished by having the server listen on two distinct ports; little-endian
418: clients connect to the server on one port, and big-endian clients connect on
419: the other. Clients always transmit and receive in their native byte order.
420: The server alone is responsible for byte swapping, and byte swapping only
421: occurs between dissimilar architectures. This eliminates the byte swapping
422: overhead in the most common situations, and greatly simplifies the building of
423: client-side interface libraries in various programming languages. X is not
424: unique in its use of this trick; the current VGTS implementation uses the same
425: trick, and similar protocol optimizations have been used in various
426: network-based applications.
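
The following Python sketch illustrates the idea under stated assumptions: the two-field layout (one 16-bit and one 32-bit quantity) and the port labels are invented for illustration and are not the actual X protocol encoding. The client always encodes in its own order, and the server selects the decoding order from the port on which the connection arrived.

```python
import struct

# Hypothetical wire layout: one 16-bit field followed by one 32-bit field.
FIELDS = "HI"

# Hypothetical mapping from listening port to the byte order of its clients.
ORDER_BY_PORT = {"little-endian port": "<", "big-endian port": ">"}


def client_encode(values, native_order):
    """A client always writes in its own native order ('<' or '>')."""
    return struct.pack(native_order + FIELDS, *values)


def server_decode(data, port):
    """The server decodes using the order implied by the connection's port.

    If that order matches the server's own, no swapping is needed; otherwise
    the server (here, the struct module) performs the swap.
    """
    return struct.unpack(ORDER_BY_PORT[port] + FIELDS, data)


values = (0x0102, 0x0A0B0C0D)
little = client_encode(values, "<")
big = client_encode(values, ">")
decoded_little = server_decode(little, "little-endian port")
decoded_big = server_decode(big, "big-endian port")
```

The two byte strings differ on the wire, yet both decode to the same values, and neither client contains any byte-swapping logic.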
427:
428: Another potential problem in protocol design is word alignment. In particular,
429: some architectures require 16-bit quantities to be aligned on 16-bit boundaries
430: and 32-bit quantities to be aligned on 32-bit boundaries in memory. To allow
431: efficient implementations of the protocol across a spectrum of 16-bit and
432: 32-bit architectures, the protocol is defined to consist of blocks that are
433: always multiples of 32 bits, and each 16-bit and 32-bit quantity within a block
434: is aligned on 16-bit and 32-bit boundaries, respectively.
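
A minimal sketch of such a block, in Python, is given below. The field layout (16-bit opcode, 16-bit text length, 32-bit identifier, then text) is a hypothetical one chosen to show the alignment rule; it is not the layout of any actual X request.

```python
import struct


def pad_to_32_bits(payload: bytes) -> bytes:
    """Append zero bytes so the length is a multiple of 4 bytes (32 bits)."""
    return payload + b"\x00" * (-len(payload) % 4)


def build_block(opcode: int, window_id: int, text: bytes) -> bytes:
    """Hypothetical block: 16-bit opcode, 16-bit length, 32-bit id, text.

    The 16-bit fields sit at byte offsets 0 and 2 and the 32-bit field at
    offset 4, so each is naturally aligned; the text is padded at the end.
    """
    header = struct.pack("<HHI", opcode, len(text), window_id)
    return header + pad_to_32_bits(text)


block = build_block(7, 0x1234, b"hello")
```

Ordering the fixed-length fields so that each falls on its natural boundary, and padding only the trailing variable-length data, keeps every block a whole number of 32-bit words.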
435:
436: X is designed to operate in an environment where the inter-process
437: communication round-trip time is between 5 and 50 milliseconds, both for local
438: and for network communication. We also assume that data transmission rates are
439: comparable to display rates; for example, to transmit and display 5000
440: characters per second, a data rate of approximately 50Kb (kilobits per second)
441: will be needed, and to transmit and display 20000 characters per second, a data
442: rate of approximately 200Kb will be needed. Networks and protocol
443: implementations with these characteristics are now quite commonplace. For
444: example, workstations running Berkeley Unix, connected via 10Mb (megabits per
445: second) local area networks, typically have round-trip times of 15 to 30
446: milliseconds, and data rates of 500Kb to 1Mb.
447:
448: The round-trip time is important in determining the form of the communication
449: protocol. The most common communication will be text and graphics requests
450: sent from a client to the server. Examples of individual requests might be to
451: draw a string of text or to draw a line. Such requests could be sent either
452: synchronously, in which case the client sends a request only after receiving a
453: reply from the server to the previous request, or they could be sent
454: asynchronously, without the server generating any replies. However, since the
455: requests are sent over a reliable stream, they are guaranteed to arrive, and
456: arrive in order, so replies from the server to graphics requests serve no
457: useful purpose. Moreover, with round-trip times over 5 milliseconds, output to
458: the display must be asynchronous, or it will be impossible to drive high-speed
459: displays adequately. For example, at 80 characters per request and a 25
460: millisecond round-trip time, only 3200 characters per second can be drawn
461: synchronously, whereas many hardware devices are capable of displaying between
462: 5000 and 30000 characters per second.
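
The arithmetic behind this example can be written out as a short Python sketch. The function name is hypothetical, and the model assumes that each request waits one full round trip and that drawing time is negligible.

```python
def synchronous_chars_per_second(chars_per_request: int, round_trip_ms: float) -> float:
    """Upper bound on throughput when each request waits for the previous reply."""
    requests_per_second = 1000.0 / round_trip_ms
    return chars_per_request * requests_per_second


rate = synchronous_chars_per_second(80, 25)       # the example in the text
best_case = synchronous_chars_per_second(80, 5)   # fastest round trip assumed above
worst_case = synchronous_chars_per_second(80, 50) # slowest round trip assumed above
```

At 25 milliseconds the bound is 40 requests per second, or 3200 characters per second. Under the same model, even a 5 millisecond round trip caps synchronous output at 16000 characters per second, below the upper range quoted for display hardware.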
463:
464: Similarly, polling the server for keyboard and mouse input would be
465: unacceptable in many applications, particularly those written in sequential
466: languages. For example, an application attempting to provide real-time
467: response to input has to poll periodically for input during screen updates.
468: For an application with a single thread of control, this effectively results in
469: synchronous output, and consequent performance loss. Hence, input must be
470: generated asynchronously by the server, so that applications need at most
471: perform local polling.
472:
473: The round-trip time is also important in determining what user interfaces can
474: be supported without embedding them directly in the server. The most important
475: concern is whether remote, application-level mouse tracking is feasible. By
476: @i(tracking), we do not mean maintaining the cursor image on the screen as the
477: user moves the mouse; that function is performed autonomously by the X server,
478: often directly in hardware. Rather, applications track the mouse by animating
479: some other image on the screen in real time as the mouse moves. For round-trip
480: times under 50 milliseconds, tracking is perfectly reasonable, driven either by
481: motion events generated by the server or by continuous polling from the
482: application. With a refresh occurring up to 30 times every second, remote
483: tracking is demonstrably "instantaneous" with mouse motion.
484:
485: For tracking to be effective, however, relatively little time can be spent
486: updating the display at each movement, so typically only relatively small
487: changes can be made to the screen while tracking. This is certainly the case
488: for common operations, such as rubber banding window outlines and highlighting
489: menu items. It might be argued that the ability to run application-specific
490: code in the server is required for acceptable hand-eye coordination during
491: complex tracking. For example, NeWS@cite(news) provides such a mechanism in a
492: novel way. However, we are not convinced there are sufficient benefits to
493: justify such complexity. Complex tracking typically is bound up intimately
494: with application-specific data structures and knowledge representations, and
495: such information is used by the "back end" of the application as well as the
496: "front end". In a distributed system it is folly to believe that applications
497: will download large front ends into a server; communication round-trip times
498: are a reality that cannot be escaped.
499:
500: @subsection(Resources)
501:
502: The basic resources provided by the server are windows, fonts, mouse cursors,
503: and off-screen images; later sections describe each of these. Clients request
504: creation of a resource by supplying appropriate parameters (such as the name of
505: the font); the server allocates the resource and returns a 32-bit unique
506: identifier used to represent it. The use and interpretation of a resource
507: identifier is independent of any network connection. Any client that knows (or
508: guesses) the identifier for a resource can use and manipulate the resource
509: freely, even if it was created by another client. This capability is required
510: to allow window managers to be written independently of applications, and to
511: allow multi-process applications to manipulate shared resources. However, to
512: avoid problems associated with clients that fail to clean up their resources at
513: termination (which is all too common in operating systems where users can
514: unilaterally abort processes), the maximum lifetime of a resource is always
515: tied to the connection over which it was created. Thus, when a client
516: terminates, all of the resources it created are destroyed automatically.
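
As an illustration only, the following Python sketch models the two properties described above: identifiers are usable by any client, while the lifetime of a resource is bounded by the connection that created it. The class name, the counter-based identifier allocator, and the connection labels are all hypothetical; they are not taken from the X server.

```python
import itertools


class ResourceTable:
    """Toy server-side table: IDs are global, lifetimes are per connection."""

    def __init__(self):
        self._next_id = itertools.count(1)  # hypothetical ID allocator
        self._owner = {}                    # resource id -> creating connection
        self._data = {}                     # resource id -> resource description

    def create(self, connection, description):
        """Allocate a resource for `connection` and return its identifier."""
        rid = next(self._next_id)
        self._owner[rid] = connection
        self._data[rid] = description
        return rid

    def lookup(self, rid):
        """Any client that knows the identifier may use the resource."""
        return self._data[rid]

    def close_connection(self, connection):
        """Destroy every resource that was created over `connection`."""
        dead = [rid for rid, owner in self._owner.items() if owner == connection]
        for rid in dead:
            del self._owner[rid]
            del self._data[rid]
        return dead


table = ResourceTable()
font = table.create("client-a", {"type": "font", "name": "fixed"})
window = table.create("client-b", {"type": "window"})
shared = table.lookup(font)            # client-b may use client-a's font
destroyed = table.close_connection("client-a")
```

After the last line, the font created by the first connection is gone, while the window created by the second connection is untouched.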
517:
518: Access control is performed only when a client attempts to establish a
519: connection to the server; once the connection is established the client can
520: freely manipulate any resource. Since accidental manipulation of some other
521: client's resource is extremely unlikely (both in theory and in practice), we
522: believe introducing access control on a per-resource basis would only serve to
523: decrease performance, not to significantly increase security or robustness.
524: The current access control mechanism is based simply on host network addresses,
525: as this information is provided by most network stream protocols, and there
526: seems to be no widely used or even widely available user-level authentication
527: mechanism. Host-based access control has proven to be marginally acceptable in
528: a workstation environment, but is rather unacceptable for time-shared
529: machines.@footnote( It is interesting that @i(professors) at MIT have argued
530: vociferously to disable all access control.)
531:
532: Each client-generated protocol request is a simple data block consisting of an
533: opcode, some number of fixed-length parameters, and possibly a variable-length
534: parameter. For example, to display text in a window, the fixed-length
535: parameters include the drawing color and the identifiers for the window and the
536: font, and the variable-length parameter is the string of characters. All
537: operations on a resource explicitly contain the identifier of the resource as a
538: parameter. In this way, an application can multiplex use of many windows over
539: a single network connection. This multiplexing makes it easy for the client to
540: control the time-order of updates to multiple windows. Similarly, each input
541: event generated by the server contains the identifier of the window in which
542: the event occurred. Multiplexing over a single stream allows the client to act
543: on events from multiple windows in correct time order; timestamps alone are
544: inadequate without strong guarantees from the stream mechanism.
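
The request structure described above can be sketched as follows. The byte layout, field widths, and opcode value are assumptions made purely for illustration and do not reproduce the actual X wire format; the point is only that every request names its window explicitly, so requests for many windows can share one ordered stream.

```python
import struct

# Hypothetical layout: opcode (1 byte), pad (1 byte), text length (2 bytes),
# then window id, font id, and pixel value (4 bytes each), then the text.
HEADER = struct.Struct("<BxHIII")
OP_DRAW_TEXT = 1  # made-up opcode


def encode_draw_text(window, font, pixel, text):
    """Build one request: fixed-length fields followed by the string."""
    data = text.encode("ascii")
    return HEADER.pack(OP_DRAW_TEXT, len(data), window, font, pixel) + data


def decode_requests(stream):
    """Split a byte stream into requests, preserving their order."""
    offset = 0
    while offset < len(stream):
        opcode, length, window, font, pixel = HEADER.unpack_from(stream, offset)
        offset += HEADER.size
        text = stream[offset:offset + length].decode("ascii")
        offset += length
        yield opcode, window, font, pixel, text


# Two requests for two different windows, multiplexed on one stream.
stream = encode_draw_text(7, 3, 1, "hello") + encode_draw_text(9, 3, 0, "world")
requests = list(decode_requests(stream))
```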
545:
546: Numerous Unix-based window
547: systems@cite(masscomp,andrew,sapphire,pnx,sunwin,mg1,metheus) use file or
548: channel descriptors to represent windows; window creation involves an
549: interaction with the operating system, which results in the creation of such a
550: descriptor. Typically, this means the window cannot be named (and hence cannot
551: be shared) by programs running on different machines, and perhaps not even by
552: programs running on the same machine. More serious, there is often a severe
553: restriction on the number of active descriptors a process may have: 20 on
554: older systems and usually 64 on newer systems. The use of 50 or more windows
555: (albeit nested inside a single top-level window) is quite common in X
556: applications. The use of a single connection, over which an arbitrary number
557: of windows can be multiplexed, is clearly a better approach.
558:
559: @section(Window Hierarchy)
560:
561: The server supports an arbitrarily branching hierarchy of rectangular windows.
562: At the top is the @i(root) window, which covers the entire screen. The
563: @i(top-level) windows of applications are created as subwindows of the root
564: window. The window hierarchy models the now-familiar "stacks of papers"
565: desktop. For a given window, its subwindows can be stacked in any order, with
566: arbitrary overlaps. When window W1 partially or completely covers window W2,
567: we say that W1 @i(obscures) W2. This relationship is not restricted to
568: siblings; if W1 obscures W2, then W1 may also obscure subwindows of W2. A
569: window also obscures its parent. Window hierarchies never interleave; if
570: window W1 obscures sibling window W2, then subwindows of W2 never obscure W1 or
571: subwindows of W1. A window is not restricted in size or placement by the
572: boundaries of its parent, but a window is always visibly clipped by its parent:
573: portions of the window that extend outside the boundaries of the parent are
574: never displayed, and do not obscure other windows. Finally, a window can be
575: either @i(mapped) or @i(unmapped). An unmapped window is never visible on the
576: screen; a mapped window can only be visible if all of its ancestors are also
577: mapped.
578:
579: Output to a leaf window (one with no subwindows) is always clipped to the
580: visible portions of the window; drawing on such a window never draws into
581: obscuring windows. Output to a window that contains subwindows can be
582: performed in two modes. In @i(clipped) mode the output is clipped normally by
583: all obscuring windows (including subwindows), but in @i(draw-through) mode the
584: output is not clipped by subwindows. For example, draw-through mode is used on
585: the root window during window management, tracking the mouse with the outline
586: of a window to indicate how the window is to be moved or resized. If clipped
587: mode were used instead, parts of the outline would be hidden by existing windows.
588:
589: The coordinate system is defined with the X axis horizontal and the Y axis
590: vertical. Each window has its own coordinate system, with the origin at the
591: upper left corner of the window. Having per-window coordinate systems is
592: crucial, particularly for top-level windows; applications are almost always
593: designed to be insensitive to their position on the screen, and having to worry
594: about race conditions when moving windows would be a disaster. The coordinate
595: system is discrete: each pixel in the window corresponds to a single unit in
596: the coordinate system, with coordinates centered on the pixels, and all
597: coordinates are expressed as integers in the protocol. We believe fractional
598: coordinates are not required at the protocol level for the raster graphics
599: provided in X (see Section 6), although they may be required for high-end color
600: graphics, such as anti-aliasing. The aspect ratio of the screen is not masked
601: by the protocol, since we believe that most displays have a one-to-one aspect
602: ratio; in this regard X is arguably device dependent.
603:
604: Although the coordinate system is discrete at the protocol level, continuous or
605: alternate-origin coordinate systems certainly can be used at the application
606: level, but client-side libraries must eventually translate to the discrete
607: coordinates defined by the protocol. In this way, we can ignore the many
608: variations in floating-point (or even fixed-point) formats among architectures.
609: Further, the coordinates can be expressed in the protocol as 16-bit quantities,
610: which can be manipulated efficiently in virtually every machine/display
611: architecture, and which minimizes the number of data bytes transmitted over the
612: network. The use of 16-bit quantities does have a drawback, in that some
613: applications (particularly CAD tools) like to perform zoom operations simply by
614: scaling coordinates and redrawing, relying on the window system to clip
615: appropriately. Since scaling quickly overflows 16 bits, additional clipping
616: must be performed explicitly by such applications.
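
A minimal sketch of the zoom problem, assuming signed 16-bit coordinates (the text above says only "16-bit") and rectangles given as (x, y, width, height). The function names are hypothetical. The application scales in its own arithmetic, then clips to the window before anything is placed in a request.

```python
INT16_MIN, INT16_MAX = -32768, 32767  # assumption: coordinates are signed


def zoom_rect(rect, zoom):
    """Scale a rectangle about the window origin by an integer factor."""
    x, y, w, h = rect
    return (x * zoom, y * zoom, w * zoom, h * zoom)


def fits_16bit(rect):
    """True if every field could be sent as a 16-bit quantity."""
    return all(INT16_MIN <= v <= INT16_MAX for v in rect)


def clip_rect(rect, width, height):
    """Intersect rect with a window of the given size; None if empty."""
    x, y, w, h = rect
    x0, y0 = max(x, 0), max(y, 0)
    x1, y1 = min(x + w, width), min(y + h, height)
    if x0 >= x1 or y0 >= y1:
        return None
    return (x0, y0, x1 - x0, y1 - y0)


zoomed = zoom_rect((-10, -10, 2000, 2000), 64)   # width becomes 128000
clipped = clip_rect(zoomed, 1024, 768)           # safe to transmit
```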
617:
618: A window can optionally have a @i(border), a shaded outer frame maintained
619: explicitly by the X server. The origin of the window's coordinate system is
620: inside the border, and output to the window is clipped automatically so as not
621: to extend into the border. The presence of borders slightly complicates the
622: semantics of the window system; for simplicity we will ignore them in the
623: remainder of this paper.
624:
625: The basic operations on window structure are straightforward. An unmapped
626: window is created by specifying the parent window, the position within the
627: parent of the upper left corner of the new window, and the width and height (in
628: coordinate units) of the new window. A window can be destroyed, in which case
629: all windows below it in the hierarchy are also destroyed. A window can be
630: mapped and unmapped, without changing its position. A window can be moved and
631: resized, including being moved and resized simultaneously. A window can also
632: be "depthwise" raised to the top or lowered to the bottom of the stack with
633: respect to its siblings, without changing its coordinate position. Currently,
634: mapping or configuring a window forces the window to be raised. This
635: restriction appeared to simplify the server implementation, but also happened
636: to match the basic management interface we expected to build. This restriction
637: will be eliminated in the next version.
638:
639: The windows described above are the usual @i(opaque) windows. X also provides
640: @i(transparent) windows. A transparent window is always invisible on the
641: screen, and does not obscure output to, or visibility of, other windows.
642: Output to a transparent window is clipped to that window, but is actually drawn
643: on the parent window. Thus, for output, a transparent window is simply a
644: clipping rectangle that can be applied to restrict output within a (parent)
645: window. Input processing for transparent and opaque windows is identical, as
646: described in Section 8. In Section 10 we will argue that most uses of
647: transparent windows are better satisfied with other mechanisms. Therefore, for
648: simplicity, we will ignore transparent windows in the rest of this paper.
649:
650: The X server is designed explicitly to make windows inexpensive. Our goal was
651: to make it reasonable to use windows for such things as individual menu items,
652: buttons, even individual items in forms and spreadsheets. As such, the server
653: must deal efficiently with hundreds (though not necessarily thousands) of
654: windows on the screen simultaneously. Experience with X has shown that many
655: implementors find this capability extremely useful.
656:
657: @section(Color)
658:
659: The screen is viewed as two dimensional, with an N-bit @i(pixel) value stored
660: at each coordinate. The number of bits in a pixel value, and how a value
661: translates into a color, depends on the hardware. X is designed to support two
662: types of hardware: monochrome and pseudo-color. A monochrome display has one
663: bit per pixel, and the two values translate into black and white. Pseudo-color
664: displays typically have between four and twelve bits per pixel; the pixel value
665: is used as an index into a color map, yielding red, green, and blue
666: intensities. The color map can be changed dynamically, so that a given pixel
667: value can represent different colors over time. Gray-scale is viewed as a
668: degenerate case of pseudo-color.
669:
670: We desire a design matching most display hardware, while abstracting
671: differences in such a way that programmers do not have to double or triple-code
672: their applications to cover the spectrum. We also want multiple applications
673: to coexist within a single color map, so that applications always show true
674: color on the screen. To allow this, and to keep applications device
675: independent, pixel values should not be coded explicitly into applications.
676: Instead, the server must be responsible for managing the color map, and color
677: map allocation must be expressed in hardware-independent terms.
678:
679: All graphics operations in X are expressed in terms of pixel values. For
680: example, to draw a line, one specifies not only the coordinates of the
681: end-points but the pixel value with which to draw the line. (Logic functions
682: and plane-select masks are also specified, as described in Section 6.) On a
683: monochrome display, the only two pixel values are zero and one, which are
684: (somewhat arbitrarily) defined to be black and white, respectively. On a
685: pseudo-color display, pixel values zero and one are pre-allocated by the
686: server, for use as "black" and "white", so that monochrome applications display
687: correctly on color displays. Of course, the actual colors need not be black
688: and white, but can be set by the user.
689:
690: There are two ways for a client to obtain pixel values. In the simplest
691: request, the client specifies red, green, and blue color values, and the server
692: allocates an arbitrary pixel value and sets the color map so the pixel value
693: represents the closest color the hardware can provide. The color map entry for
694: this pixel value cannot be changed by the client, so if some other client
695: requests an equivalent color, the server is free to respond with the same pixel
696: value. Such sharing is important in maximizing use of the color map. To
697: isolate applications from variations in color representation among displays
698: (due, for example, to the standard of illumination used for calibration), the
699: server provides a color database which clients can use to translate string
700: names of colors into red, green, and blue values tailored for the particular
701: display.
702:
703: The second request allocates writable map entries. This mechanism was designed
704: explicitly for X; we are not aware of a comparable mechanism in any other
705: window system. The client specifies two numbers, @i(C) and @i(P), with @i(C)
706: positive and @i(P) non-negative; the request can be expressed as "allocate
707: @i(C) colors and @i(P) planes". The total number of pixel values allocated by
708: the server is @i(C*2@+(P)). The values passed back to the client consist of
709: @i(C) base pixel values, and a plane mask containing @i(P) bits. None of the
710: base pixel values have any one bits in common with the plane mask, and the
711: complete set of allocated pixel values is obtained by combining all possible
712: combinations of one bits from the plane mask with each of the base pixel
713: values. The client can optionally require the @i(P) planes to be contiguous,
714: in which case all @i(P) bits in the plane mask will be contiguous.
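
To make the arithmetic concrete, the following sketch enumerates the pixel values implied by a reply of @i(C) base pixel values and a @i(P)-bit plane mask. The function names and the example values are hypothetical; only the combination rule comes from the description above.

```python
from itertools import combinations


def mask_bits(plane_mask):
    """Return the individual one bits of plane_mask, lowest first."""
    bits = []
    bit = 1
    while bit <= plane_mask:
        if plane_mask & bit:
            bits.append(bit)
        bit <<= 1
    return bits


def allocated_pixels(base_pixels, plane_mask):
    """Combine every subset of mask bits with every base pixel value."""
    bits = mask_bits(plane_mask)
    result = []
    for base in base_pixels:
        if base & plane_mask:
            raise ValueError("base pixel shares a one bit with the plane mask")
        for count in range(len(bits) + 1):
            for subset in combinations(bits, count):
                result.append(base | sum(subset))
    return sorted(result)


# C = 2 base values and P = 2 planes give 2 * 2**2 = 8 pixel values.
pixels = allocated_pixels([2, 3], 0b1100)
```

With a plane mask of zero (@i(P) = 0) the result is simply the list of base pixel values.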
715:
716: There are three common uses of this second request. One is simply to allocate
717: a number of "unrelated" pixel values; in this case, @i(P) will be zero. A
718: second use is in imaging applications, where it is convenient to be able to
719: perform simple arithmetic on pixel values. In this case, a contiguous block of
720: pixel values is allocated by setting @i(C) to one and @i(P) to the log (base 2)
721: of the number of pixel values required, and requesting contiguous allocation.
722: Arithmetic on the pixel values then requires at most some additional shift and
723: mask operations.
724:
725: A third form of allocation arises in applications that want some form of
726: overlay graphics, such as highlighting or outlining regions. Here the
727: requirement is to be able to draw and then erase graphics without disturbing
728: existing window contents. For example, suppose an application typically uses
729: four colors, but needs to be able to overlay a rectangle outline in a fifth
730: color. An allocation request with C set to four and P set to one results in
731: two groups of four pixel values. The four base pixel values are assigned the
732: four normal colors, and the four alternate pixel values are all assigned the
733: fifth color. Overlay graphics can then be drawn by restricting output (see the
734: next section) to the single bit plane specified in the mask returned by the
735: color allocation. Turning bits in this plane on (to ones) changes the image to
736: the fifth color, and turning them off reverts the image to its original color.
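
The overlay example can be simulated in a few lines. The specific base pixel values, the plane bit, and the color names below are invented for the illustration; a real server would choose the values.

```python
PLANE = 0b1000                      # hypothetical plane mask (P = 1)
BASES = [4, 5, 6, 7]                # hypothetical base pixel values (C = 4)
NORMAL = ["red", "green", "blue", "yellow"]

# Base values get the four normal colors; alternates all get the fifth.
color_map = {}
for base, color in zip(BASES, NORMAL):
    color_map[base] = color
    color_map[base | PLANE] = "orange"


def draw_overlay(image, positions):
    """Turn the overlay plane on at the given pixel positions."""
    for i in positions:
        image[i] |= PLANE


def erase_overlay(image, positions):
    """Turn the overlay plane off, restoring the original pixel values."""
    for i in positions:
        image[i] &= ~PLANE


def colors(image):
    return [color_map[p] for p in image]


image = [4, 5, 6, 7, 4]
before = colors(image)
draw_overlay(image, [1, 2])
during = colors(image)
erase_overlay(image, [1, 2])
after = colors(image)
```

Because only the overlay plane is written, the other bits of each pixel survive, which is why erasing needs no saved copy of the window contents.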
737:
738: @section(Graphics and Text)
739:
740: Graphics operations are often the most complex part of any window system,
741: simply because so many different effects and variations are required to satisfy
742: a wide range of applications. In this section we sketch the operations
743: provided in X, so that the basic level of graphics support can be understood.
744: The operations are essentially a subset of the Digital Workstation Graphics
745: Architecture; the VS100 display@cite(vs100) implements this architecture for
746: 1-bit pixel values. The set of operations purposely was kept simple, in order
747: to maximize portability.
748:
749: Graphics operations in X are expressed in terms of relatively high-level
750: concepts, such as lines, rectangles, curves, and fonts. This is in contrast to
751: systems in which the basic primitives are to read and write individual pixels.
752: Basing applications on pixel-level primitives works well when display memory
753: can be mapped into the application's address space for direct manipulation.
754: However, both display hardware and operating systems exist for which such
755: direct access is not possible, and emulating pixel-level manipulations in such
756: an environment results in extremely poor performance. Expressing operations at
757: a higher level avoids such device dependencies, and also avoids potential
758: problems with network bandwidth. With high-level operations, a protocol
759: request transmitted as a small number of bits over the network typically
760: affects ten to one hundred times as many pixels on the screen.
761:
762: @subsection(Images)
763:
764: Two forms of off-screen images are supported in X: bitmaps and pixmaps. A
765: bitmap is a single plane (bit) rectangle. A pixmap is an N-plane (pixel)
766: rectangle, where @i(N) is the number of bits per pixel used by the particular
767: display. A bitmap or pixmap can be created by transmitting all of the bits to
768: the server; a pixmap can also be created by copying a rectangular region of a
769: window. Bitmaps and pixmaps of arbitrary size can be created. Transmitting
770: very large (or deep) images over a network connection can be quite slow;
771: however, the ability to make use of shared memory in conjunction with the IPC
772: mechanism would help enormously when the client and server are on the same
773: machine.
774:
775: The primary use of bitmaps is as masks (clipping regions). Several graphics
776: requests allow a bitmap to be used as a clipping region@cite(warnock). Bitmaps
777: are also used to construct cursors, as described in Section 8. Pixmaps are
778: used to store frequently drawn images, and as temporary backing-store for
779: pop-up menus (as described in Section 8). However, the principal use of
780: pixmaps is as tiles, that is, as patterns which are replicated in two
781: dimensions to cover a region. Since there are often hardware restrictions as
782: to what tile shapes can be replicated efficiently, guaranteed shapes are not
783: defined by the X protocol. An application can query the server to determine
784: what shapes are supported, although to date most applications simply assume 16
785: by 16 tiles are supported. A better semantics would be to support arbitrary shapes,
786: but allow applications to query as to which shapes are most efficient.
787:
788: The tiling origin used in X is almost always the origin of the destination
789: window. That is, if enough tiles were laid out, one tile would have its upper
790: left corner at the upper left corner of the window. In this way, the contents
791: of the window are independent of the window's position on the screen, and the
792: window can be moved transparently to the application.
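
A small sketch of tiling with the origin at the window's upper left corner. The tile is represented as a list of rows, which is an assumption of the example rather than the server's representation. Since only window-relative coordinates enter the computation, the result cannot depend on where the window sits on the screen.

```python
def tiled_pixel(tile, x, y):
    """Pixel value at window coordinate (x, y) when the tile is replicated
    with one copy's upper left corner at the window origin."""
    height = len(tile)
    width = len(tile[0])
    return tile[y % height][x % width]


def fill_region(tile, x, y, width, height):
    """Return the rows of a tiled rectangular region of the window."""
    return [
        [tiled_pixel(tile, cx, cy) for cx in range(x, x + width)]
        for cy in range(y, y + height)
    ]


checker = [[1, 0],
           [0, 1]]
region = fill_region(checker, 1, 0, 3, 2)
```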
793:
794: Servers vary widely in the amount of off-screen memory provided. For example,
795: some servers limit off-screen memory to that accessible directly to the
796: graphics processor (typically one to three times the size of screen memory),
797: and fonts and other resources are allocated from this same pool. Other servers
798: utilize their entire virtual address space for off-screen memory. Since
799: off-screen memory for images is finite, an explicit part of the X protocol is
800: the possibility that bitmap or pixmap creation can fail. Depending on the
801: intended use of the image, the application may or may not be able to cope with
802: the failure. For example, if the image was being stored simply to speed up
803: redisplay, the application can always transmit the image directly each time
804: (see below). If the image was to be a temporary backing-store for a window,
805: the application can fall back on normal exposure processing (as described in
806: Section 7). Servers should be constructed in such a way as to virtually
807: guarantee sufficient memory (e.g., by caching images) for creating at least
808: small tiles and cursors, although this is not true in current implementations.
809:
810: @subsection(Graphics)
811:
812: All graphics and text requests include a logic function and a plane-select mask
813: (an integer with the same number of bits as a pixel value) to modify the
814: operation. All sixteen logic functions are provided. Given a source and
815: destination pixel, the function is computed bitwise on corresponding bits of
816: the pixels, but only on bits specified in the plane-select mask. Thus the
817: result pixel is computed as
818: @begin(format, leftmargin +5)
819: ((source FUNC destination) AND mask) OR (destination AND (NOT mask))
820: @end(format)
821: The most common operation is simply replacing the destination with the source in
822: all planes.
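
The formula above can be exercised directly. In this sketch a logic function is encoded as a 4-bit truth table, which is a hypothetical encoding chosen for the example; the `combine` function is a literal transcription of the formula, with the pixel depth passed explicitly so that NOT stays within the pixel width.

```python
def make_function(code):
    """Build one of the sixteen two-input bitwise functions.

    Bit (2*s + d) of `code` is the result for source bit s and
    destination bit d (an encoding assumed for this example).
    """
    def func(src, dst, depth):
        result = 0
        for bit in range(depth):
            s = (src >> bit) & 1
            d = (dst >> bit) & 1
            result |= ((code >> (2 * s + d)) & 1) << bit
        return result
    return func


COPY = make_function(0b1100)   # result = source
XOR = make_function(0b0110)    # result = source XOR destination


def combine(src, dst, mask, func, depth):
    """((src FUNC dst) AND mask) OR (dst AND (NOT mask))."""
    all_planes = (1 << depth) - 1
    return (func(src, dst, depth) & mask) | (dst & ~mask & all_planes)


replaced = combine(0b1001, 0b0110, 0b1111, COPY, 4)   # all planes
partial = combine(0b1001, 0b0110, 0b0011, COPY, 4)    # low two planes only
toggled = combine(0b1111, 0b0101, 0b1111, XOR, 4)
```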
823:
824: The simplest graphics request takes a single source pixel value and combines it
825: with every pixel in a rectangular region of a window. Typically this is used
826: to fill a region with a color, but by varying the logic function or masks,
827: other effects can be achieved. A second request takes a tile, effectively
828: constructs a tiled rectangular source with it, and then combines the source
829: with a rectangular region of a window.
830:
831: An arbitrary image can be displayed directly, without first being stored
832: off-screen. For monochrome images, the full contents of a bitmap are
833: transmitted, along with a pair of pixel values; the image is displayed in a
834: region of a window with those two colors. For color images, the full contents
835: of a pixmap can be transmitted and displayed. In order to avoid inordinate
836: buffer space in the server, very large images must be broken into sections on
837: the client side and displayed in separate requests.
838:
839: The CopyArea request allows one region of a window to be moved to (or combined
840: with) another region of the same window. This is the usual @i(bitblt), or "bit
841: block transfer" operation. The source and destination are given as rectangular
842: regions of the window; the two regions have the same dimensions. The operation
843: is such that overlap of the source and destination does not affect the result.
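
One common way to make an overlapping copy safe is to choose the traversal order from the direction of the move. The sketch below does this on a list-of-rows pixel array; it illustrates the stated property and is not a description of how any particular X server implements CopyArea.

```python
def copy_area(pixels, src_x, src_y, width, height, dst_x, dst_y):
    """Copy a rectangle within `pixels`, correct even when regions overlap."""
    # Moving down: copy bottom rows first. Moving right: copy right
    # columns first. Otherwise copy in the natural order.
    rows = range(height - 1, -1, -1) if dst_y > src_y else range(height)
    cols = range(width - 1, -1, -1) if dst_x > src_x else range(width)
    for dy in rows:
        for dx in cols:
            pixels[dst_y + dy][dst_x + dx] = pixels[src_y + dy][src_x + dx]


row = [[1, 2, 3, 4, 5]]
copy_area(row, 0, 0, 3, 1, 1, 0)        # shift three pixels right by one

grid = [[1], [2], [3], [4]]
copy_area(grid, 0, 0, 1, 3, 0, 1)       # shift three rows down by one
```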
844:
845: X provides a complex primitive for line drawing. It provides for arbitrary
846: combinations of straight and curved segments, defining both open and closed
847: shapes. Lines can be @i(solid), by drawing with a single source pixel value,
848: @i(dashed), by alternately drawing with a single source pixel value and not
849: drawing, and @i(patterned), by alternately drawing with two source pixel
850: values. Lines are drawn with a rectangular brush. Clients can query the
851: server to determine what brush shapes are supported; a better semantics would
852: be to support arbitrary shapes, but allow applications to query as to which
853: shapes are most efficient.
854:
855: A final request allows an arbitrary closed shape (such as could be specified in
856: the line drawing request) to be filled with either a single source pixel value
857: or a tile. For self-intersecting shapes, the even-odd rule is used: a point is
858: inside the shape if an infinite ray with the point as origin crosses the path
859: an odd number of times.
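
The even-odd rule can be sketched for a polygonal path as follows. The example handles straight segments only (curved segments are outside its scope) and casts the ray in the positive X direction; both are choices made for the illustration. The five-pointed star is self-intersecting, so its central pentagon is outside under this rule.

```python
def inside_even_odd(point, path):
    """Even-odd test: count crossings of a ray from `point` toward +X."""
    px, py = point
    crossings = 0
    n = len(path)
    for i in range(n):
        (x1, y1), (x2, y2) = path[i], path[(i + 1) % n]
        # The edge crosses the ray's line only if its ends straddle py.
        if (y1 > py) != (y2 > py):
            x_at = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if x_at > px:
                crossings += 1
    return crossings % 2 == 1


# A star drawn as one closed, self-intersecting path (approximate vertices).
star = [(0, 10), (-6, -8), (10, 3), (-10, 3), (6, -8)]
center_inside = inside_even_odd((0, 0), star)   # two crossings
tip_inside = inside_even_odd((0, 8), star)      # one crossing
```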
860:
861: @subsection(Text)
862:
863: For high-performance text, X provides direct support for bitmap fonts. A font
864: consists of up to 256 bitmaps; each bitmap in a font has the same height but
865: can vary in width. To allow server-specific font representations, clients
866: "create" fonts by specifying a name rather than by downloading bitmap images
867: into the server. An application can use an arbitrary number of fonts, but (as
868: with all resources) font allocation can fail for lack of memory. A reasonably
869: implemented server should support an essentially unbounded number of fonts
870: (e.g., by caching), but some existing server implementations are deficient in
871: this respect. Unlike Andrew@cite(wm), no heuristics are applied by the server
872: when resolving a name to a font; specific communities or applications may
873: demand a variety of heuristics, and as such they belong outside the base window
874: system. Also unlike Andrew, the X server is not free to dynamically substitute
875: one font for another; we do not believe such behavior is necessary or
876: appropriate.
877:
878: A string of text can be displayed using a font either as a mask or as a source.
879: Using a font as a mask, the foreground (the one bits in the bitmap) of each
880: character is drawn with a single source pixel value. Using a font as a source,
881: the entire image of each character is drawn, using a pair of pixel values.
882: Source font output is provided specifically for applications using fixed-width
883: fonts in emulating traditional terminals.
884:
885: To support "cut and paste" operations between applications, the server provides
886: a number of buffers into which a client can read and write an arbitrary string
887: of bytes. (This mechanism was adopted from Andrew.) Although these buffers
888: are used principally for text strings, the server imposes no interpretation on
889: the data, so cooperating applications can use the buffers to exchange such
890: things as resource identifiers and images.
891:
892: @section(Exposures)
893:
894: Given that output to obscured windows is possible, the issue of @i(exposure)
895: must be addressed. When all (or a piece) of an obscured window again becomes
896: visible (for example, as the result of the window being raised), is the client
897: or the server responsible for restoring the contents of the window? In X, it
898: is the responsibility of the client. When a region of a window becomes
899: exposed, the server sends an asynchronous event to the client, specifying the
900: window and the region that has been exposed; the rest is up to the application.
901: A trivial application might simply redraw the entire window; a more
902: sophisticated application would only redraw the exposed region.
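
As a sketch of the more sophisticated strategy, consider a hypothetical text application with fixed-height lines. Given the vertical extent of an exposed region, it can compute which lines intersect that region and redraw only those. The function name and parameters are invented for the example.

```python
def lines_to_redraw(exposed_y, exposed_height, line_height, line_count):
    """Indices of the text lines that intersect the exposed region."""
    first = max(exposed_y // line_height, 0)
    last = min((exposed_y + exposed_height - 1) // line_height, line_count - 1)
    return range(first, last + 1)


# An exposure covering y = 25..54 with 10-pixel lines touches lines 2..5.
partial = list(lines_to_redraw(25, 30, 10, 100))
# A trivial application would instead redraw every line.
everything = list(range(100))
```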
903:
904: Why is the client responsible? Because X imposes no structure on, or
905: relationships between, graphics operations from a client, there are only two
906: basic mechanisms by which the server might restore window contents: by
907: maintaining display lists, and by maintaining off-screen images. In the first
908: approach, the server essentially retains a list of all output requests
909: performed on the window. When a region of the window becomes exposed, the
910: server either re-executes all requests to the entire window, or only
911: re-executes requests that affect the region while clipping the output to that
912: region. In the alternative approach, when a window becomes obscured the server
913: saves the obscured region (or perhaps the entire window) in off-screen memory.
914: All subsequent output requests are executed not only to the visible regions of
915: the window, but to the off-screen image as well. When an obscured region
916: becomes visible again, the off-screen copy is simply restored.
917:
918: We believe neither server-based approach is acceptable. With display lists,
919: the server is unlikely to have any reasonable notion of when later output
920: requests nullify earlier ones. Either the display list becomes unmanageably
921: long, and a refresh that should appear nearly instantaneous instead appears as
922: a slow-motion replay, or the server spends a significant length of time pruning
923: the display list, and normal-case performance is considerably reduced. One
924: problem with the off-screen image approach is (virtual) memory consumption: on
925: a 1024 by 1024 8-plane display, just one full-screen image requires one
926: megabyte of storage, and multiple overlapping windows could easily require many
927: times that amount. Another problem is that the cost of the implementation can
928: be prohibitive. Consider, for example, the QDSS display@cite(qdss), which has
929: a graphics co-processor. In the QDSS, display memory is inaccessible to the
930: host processor. In addition, the co-processor cannot perform operations in
931: host memory, and has relatively little off-screen memory of its own. The only
932: viable way to maintain off-screen images for displays like the QDSS may be to
933: emulate the co-processor in software. It can easily take tens of thousands of
934: lines of code to emulate a co-processor, and such emulation may execute orders
935: of magnitude slower than the co-processor.
936:
937: Our belief is that many applications can take advantage of their own
938: information structures to facilitate rapid redisplay, without the expense of
939: maintaining a distinct display structure or backing-store in the client or the
940: server, and often with even better performance. (Sapphire@cite(sapphire)
941: permits client refresh for this reason.) For example, a text editor can
942: redisplay directly from the source, and a VLSI editor can redisplay directly
943: from the layout and component definitions. Many applications will be built on
944: top of high-level graphics libraries that automatically maintain the data
945: structures necessary to implement rapid redisplay. For example, the structured
946: display file mechanism in VGTS could be supported in a client library. Of
947: course, pushing the responsibility back on the application may not simplify
948: matters, particularly when retrofitting old systems to a new environment. For
949: example, the current GKS design does not provide adequate hooks for automatic,
950: system-generated refresh of application windows, nor does it provide an
951: adequate mechanism for forcing refresh back on the application.
952:
953: Relying on client-controlled refresh also derives from window management
954: philosophy. Our belief is that applications cannot be written with fixed
955: top-level window sizes built in. Rather, they must function correctly with
956: almost any size, and continue to function correctly as windows are dynamically
957: resized. This is necessary if applications are to be usable on a variety of
958: displays under a variety of window management policies. (Of course, an
959: application may need a minimum size to function reasonably, and may prefer the
960: width or height to be a multiple of some number; X allows the client to attach
961: a resize hint to each window to inform window managers of this.) Our belief is
962: that most applications, for one reason or another, will already have code for
963: performing a complete redisplay of the window, and that it is usually
964: straightforward to modify this code to deal with partial exposures. Similar
965: arguments were used in the design of both Andrew and Mex, and experience has
966: confirmed their decision@cite(wm,mex).
967:
968: This is not to argue that the server should never maintain window contents,
969: only that it should not be @i(required) to maintain contents. For complex
970: imaging and graphics applications, efficient maintenance by the server may be
971: critical for acceptable performance of window management functions. There is
972: nothing inherent in the X protocol that precludes the server from maintaining
973: window contents and not generating exposure events. In the next version of X,
974: windows will have several attributes to advise the server as to when and how
975: contents should be maintained.
976:
977: In X, clients are never informed of what regions are obscured, only of what
978: regions have become visible. Thus, clients have insufficient information to
979: try to optimize output by drawing only to visible regions. However, we feel
980: this is justified on two grounds. First, realistically, users seldom stack
981: windows such that the active ones are obscured, so there is little point in
982: complicating applications to optimize this case. More importantly, allowing
983: applications to restrict output to only visible regions would conflict with the
984: desire to have the server maintain obscured regions automatically when
985: possible.
986:
987: Client refresh introduces an interesting complication with the CopyArea
988: request (described in Section 6). If part of the source region of the
989: CopyArea is obscured, then not all of the destination region can be updated
990: properly, and the client must be notified (with an exposure event) so that it
991: can correct the problem. Since output requests are asynchronous, care must be
992: taken by the application to handle exposure events when using CopyArea. In
993: particular, if a region is exposed and an event sent by the server, a
994: subsequent CopyArea may move all or part of the region before the event is
995: actually received by the application. Several simple algorithms have been
996: designed to deal with this situation, but we will not present them here.
997:
998: Client refresh raises a visual problem in a network environment. When a region
999: of a window becomes exposed, what contents should the server initially place in
1000: that window? In a local, tightly-coupled environment, it might be perfectly
1001: reasonable to leave the contents unaltered, because the client can almost
1002: instantaneously begin to refresh the region. In a network environment, however
1003: (and even in a local system where processes can get "swapped out" and take
1004: considerable time to swap back in), inevitable delays can lead to visually
1005: confusing results. For example, the user may move a window, and see two images
1006: of the window on the screen for a significant length of time, or resize a
1007: window and see no immediate change in the appearance of the screen.
1008:
1009: To avoid such anomalies in X, clients must define a @i(background) for every
1010: window. The background can be a single color, or it can be a tiling pattern.
1011: Whenever a region of a window is exposed, the server immediately paints the
1012: region with the background. Users therefore see window shapes immediately,
1013: even if the "contents" are slow to arrive. Of course, many application windows
1014: have some notion of a background anyway, so having the server initialize with a
1015: background seldom results in extraneous redisplay. In fact, many non-leaf
1016: windows typically contain nothing but a background, and having the server paint
1017: that background frees the applications from performing any redisplay at all to
1018: those windows.
1019:
1020: Although we believe client-generated refresh is acceptable most of the time, it
1021: does not always perform well with momentary pop-up menus, where speed is at a
1022: premium. To avoid potentially expensive refresh when a menu is removed from
1023: the screen, a client can explicitly copy the region to be covered by the menu
1024: into off-screen memory (within the server) before mapping the menu window. A
1025: special unmap request is used to remove the menu: it unmaps the window without
1026: affecting the contents of the screen or generating exposure events. The
1027: original contents are then copied back onto the screen. In addition, the
1028: client usually @i(grabs) the server for the entire sequence, using a request
1029: which freezes all other clients until a corresponding ungrab request is issued
1030: (or the grabbing client terminates). Without this, concurrent output from
1031: other clients to regions obscured by the menu would be lost. Although freezing
1032: other clients is in general a poor idea, it seems acceptable for momentary
1033: menus.
1034:
1035: @section(Input)
1036:
1037: We now turn to a discussion of input events, but first we briefly describe the
1038: support for mouse cursors. Clients can define arbitrary shapes for use as
1039: mouse cursors. A cursor is defined by a source bitmap, a pair of pixel values
1040: with which to display the bitmap, a mask bitmap which defines the precise shape
1041: of the image, and a coordinate within the source bitmap which defines the
1042: "center" or "hot spot" of the cursor. Cursors of arbitrary size can be
1043: constructed, although only a portion of the cursor may be displayed on some
1044: hardware. Clients can query the server to determine what cursor sizes are
1045: supported, but existing applications typically just assume a 16 by 16 image can
1046: always be displayed. Cursors also can be constructed from character images in
1047: fonts; this provides a simple form of named indirection, allowing custom
1048: tailoring to each display without having to modify the applications.
1049:
1050: A window is said to @i(contain) the mouse if the hot spot of the cursor is
1051: within a visible portion of the window or one of its subwindows. The mouse is
1052: said to be @i(in) a window if the window contains the mouse but no subwindow
1053: contains the mouse. Every window can have a mouse cursor defined for it. The
1054: server automatically displays the cursor of whatever window the mouse is
1055: currently in; if the window has no cursor defined, the server displays the
1056: cursor of the closest ancestor with a cursor defined.
1057:
1058: Input is associated with windows. Input to a given window is controlled by a
1059: single client, which need not be the client that created the window. Events
1060: are classified into various types, and the controlling client selects which
1061: types are of interest to it. Only events matching in type with this selection
1062: are sent to the client. When an input event is generated for a window and the
1063: controlling client has not selected that type, the server @i(propagates) the
1064: event to the closest ancestor window for which some client has selected the
1065: type, and sends the event to that client instead. Every event includes the
1066: window that had the event type selected; this window is called the @i(event
1067: window). If the event has been propagated, the event also includes the next
1068: window down in the hierarchy between the event window and the original window
1069: on which the event was generated.
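
The propagation rule just described can be sketched in a few lines of Python.
This is an illustration only, not the server's implementation: the Window
class, its "selected" set, and the deliver function are hypothetical names,
and each window is assumed to record nothing more than the event types that
its controlling client has selected.

```python
from dataclasses import dataclass, field
from typing import Optional, Set, Tuple


@dataclass(eq=False)
class Window:
    """Hypothetical window record: a name, a parent, and selected event types."""
    name: str
    parent: Optional["Window"] = None
    selected: Set[str] = field(default_factory=set)


def deliver(origin: Window, event_type: str) -> Tuple[Optional[Window], Optional[Window]]:
    """Return (event_window, child) for an event generated on `origin`.

    event_window is the closest window, starting at `origin` and walking up,
    on which `event_type` is selected. child is the next window down from the
    event window toward `origin`, or None if the event was not propagated.
    (None, None) means no window on the path selected the type.
    """
    child = None
    window = origin
    while window is not None:
        if event_type in window.selected:
            return window, child
        child = window          # remember the window just below the next candidate
        window = window.parent
    return None, None


root = Window("root", selected={"ButtonPress"})
top = Window("top", parent=root)
button = Window("button", parent=top, selected={"KeyPress"})

event_window, child = deliver(button, "ButtonPress")
print(event_window.name, child.name)
```

In this example the button press is generated on "button" but reported with
"root" as the event window and "top" as the next window down, which is the
information a window manager needs to identify a top-level window.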
1070:
1071: @subsection(The Keyboard)
1072:
1073: For the keyboard, a client can selectively receive events on the press or
1074: release of a key. Keyboard events are not reported in terms of ASCII character
1075: codes; instead, each key is assigned a unique code, and client software must
1076: translate these codes into the appropriate characters. The mapping from
1077: keycaps to keycodes is intended to be "universal" and predefined; a given
1078: keycap has the same keycode on all keyboards. Applications generally have been
1079: written to read a "keymap file" from the user's home directory, so that users
1080: can remap the keyboard as they see fit.
1081:
1082: The use of coded keys is secondary to the ability to detect both up and down
1083: transitions on the keyboard. For example, a common trick in window systems is
1084: for mouse button operations to be affected by keyboard @i(modifiers) such as
1085: the Shift, Control, and Meta keys. A useful feature of the Genera@cite(genera)
1086: system is the use of a "mouse documentation line", which changes dynamically as
1087: modifiers are pressed and released, indicating the function of the mouse
1088: buttons. A base window system must provide this capability. Transitions are
1089: useful not only on modifiers; various applications for systems other than X
1090: have been designed to use "chords" (groups of keys pressed simultaneously), and
1091: again the window system should support them.
1092:
1093: The keyboard is always @i(attached) to some window (typically the root window
1094: or a top-level window); we call this window the @i(focus) window. A request
1095: can be used (usually by the input manager) to attach the keyboard to any
1096: window. The window that receives keyboard input depends on both the mouse
1097: position and the focus window. If the mouse is in some descendant of the focus
1098: window, that descendant receives the input. If the mouse is not in a
1099: descendant of the focus window, then the focus window receives the input, even
1100: if the mouse is outside the focus window. For applications that wish to have
1101: the mouse state modify the effect of keyboard input, a keyboard event contains
1102: the mouse coordinates, both relative to the event window and global to the
1103: screen, as well as the state of the mouse buttons.
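
The rule for choosing the window that receives keyboard input can be sketched
as follows. This is a minimal sketch under stated assumptions: the window
hierarchy is represented by a hypothetical "parent" dictionary, and the
function names are ours rather than protocol requests.

```python
def is_descendant(parent, window, ancestor):
    """True if `ancestor` is a proper ancestor of `window` in the hierarchy."""
    w = parent[window]
    while w is not None:
        if w == ancestor:
            return True
        w = parent[w]
    return False


def keyboard_target(parent, focus, mouse_in):
    """Pick the window that receives keyboard input.

    `focus` is the focus window and `mouse_in` is the window the mouse is
    currently in. A descendant of the focus window receives the input if the
    mouse is in it; otherwise the focus window does, even when the mouse is
    outside the focus window.
    """
    if is_descendant(parent, mouse_in, focus):
        return mouse_in
    return focus


# Hypothetical hierarchy: child -> parent (the root has no parent).
parent = {"root": None, "editor": "root", "text": "editor", "clock": "root"}

print(keyboard_target(parent, "editor", "text"))   # mouse inside the focus subtree
print(keyboard_target(parent, "editor", "clock"))  # mouse elsewhere on the screen
print(keyboard_target(parent, "root", "clock"))    # root as focus window
```

The sketch stops at choosing the target window; the event selection and
propagation rules described earlier would then apply to that window.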
1104:
1105: To provide a reasonable user interface, keyboard events also contain the state
1106: of the most common modifier keys: Shift, ShiftLock, Control, and Meta.
1107: Without this information, anomalous behavior can result. If the user switches
1108: windows while modifier keys are down, the new client must somehow determine
1109: which modifiers are down. Placing the modifier state in the keyboard events
1110: solves such problems, and also has another benefit: most clients do not have
1111: to maintain their own shadow of the modifier state, and so often can completely
1112: ignore key release events. However, there is a conflict between this
1113: server-maintained state and client-maintained keyboard mappings. In
1114: particular, clients cannot use non-standard keys as modifiers, or use chords
1115: without the possibility of anomalies such as described above. We believe the
1116: correct solution (not yet supported in X) is for the server to maintain a bit
1117: mask reflecting the full state of the keyboard, and to allow clients to read
1118: this mask. An application using chords or non-standard modifiers would request
1119: the server to send this mask automatically whenever the mouse entered the
1120: application's window.
1121:
1122: @subsection(The Mouse)
1123:
1124: The X protocol is (somewhat arbitrarily) designed for mice with up to three
1125: buttons. An application can selectively receive events on the press or release
1126: of each button. Each event contains the current mouse coordinates (both local
1127: to the window and global to the screen), the current state of all buttons and
1128: modifier keys, and a timestamp which can be used, for example, to decide when a
1129: succession of clicks constitutes a double or triple click. An application can
1130: also choose to receive mouse motion events, either whenever the mouse is in the
1131: window, or only when particular buttons have also been pressed. The
1132: application cannot control the granularity of the reporting, nor is any minimum
1133: granularity guaranteed. In fact, typical server implementations make an effort
1134: to compact motion events, to minimize system overhead and wired memory in
1135: device drivers. As such, X may not serve adequately for fine-grained tracking,
1136: such as in fast moving free-hand drawing applications.
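
As an illustration of how a client might use the event timestamp mentioned
above, the following sketch counts successive clicks. The function name, the
assumption that timestamps are in milliseconds, and the 300-unit threshold are
all hypothetical choices made for the example; the protocol does not define a
double-click interval.

```python
def click_counts(timestamps, threshold=300):
    """Label each button press with its position in a run of rapid clicks.

    `timestamps` are press times in increasing order. A press that follows
    the previous one within `threshold` time units extends the current run
    (2 = double click, 3 = triple click); otherwise a new run starts at 1.
    """
    counts = []
    run = 0
    last = None
    for t in timestamps:
        if last is not None and t - last <= threshold:
            run += 1
        else:
            run = 1
        counts.append(run)
        last = t
    return counts


print(click_counts([0, 200, 350, 2000]))
```

Because the timestamp is generated at the server, the decision does not
depend on how long the events took to reach the client.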
1137:
1138: Even with motion compaction, servers can generate considerable numbers of
1139: motion events. If an application attempts to respond in real time to every
1140: event, it can easily get far behind relative to the actual position of the
1141: mouse. Instead, many applications simply treat motion events as hints. When a
1142: motion event is received, the event is simply discarded, and the client then
1143: explicitly queries the server for the current mouse position. In waiting for
1144: the reply, more motion events may be received; these are also discarded. The
1145: client then reacts based on the queried mouse position. The advantage of this
1146: scheme over continuously polling the mouse position is that no CPU time is
1147: consumed while the mouse is stationary.
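
The "motion events as hints" technique can be sketched with a toy event loop.
FakeServer is a hypothetical stand-in for a server connection, with a queue of
pending events and a query_pointer method; none of these names come from X.
In a real client the inner loop would discard events that arrive while the
client waits for the reply, whereas here they are simply already queued.

```python
from collections import deque


class FakeServer:
    """Hypothetical stand-in for a connection: pending events plus pointer state."""

    def __init__(self, events, pointer):
        self.queue = deque(events)
        self.pointer = pointer

    def query_pointer(self):
        # Stands in for an explicit round-trip query of the mouse position.
        return self.pointer


def process_events(server, on_position, on_other):
    """Treat motion events as hints; hand every other event to `on_other`."""
    while server.queue:
        kind, payload = server.queue.popleft()
        if kind != "motion":
            on_other(kind, payload)
            continue
        # Ignore the coordinates in the event and ask where the mouse is now.
        position = server.query_pointer()
        # Discard motion events queued immediately behind the hint.
        while server.queue and server.queue[0][0] == "motion":
            server.queue.popleft()
        on_position(position)


events = [("motion", (1, 1)), ("motion", (2, 2)), ("motion", (3, 3)),
          ("key", "a"), ("motion", (4, 4))]
server = FakeServer(events, pointer=(9, 9))
positions, others = [], []
process_events(server, positions.append, lambda kind, payload: others.append((kind, payload)))
print(positions, others)
```

Four motion events lead to only two reactions, both based on the queried
position rather than on the stale coordinates carried in the events.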
1148:
1149: Clients can also receive an event each time the mouse enters or leaves a
1150: window. This can be particularly useful in implementing menus. For example,
1151: each menu item can be placed in a separate subwindow of the overall menu
1152: window. When the mouse enters a subwindow, the item is highlighted in some
1153: fashion (e.g., by inverting the video sense), and when the mouse leaves the
1154: window the item is restored to normal. Implementing a menu in this manner
1155: requires considerably less CPU overhead than continuous polling of the mouse,
1156: and also less overhead than using motion events, since most motion events would
1157: be within a single window and thus uninteresting.
1158:
1159: Due to the nature of overlapping windows, and because continuous tracking by
1160: the server is not guaranteed, the mouse may appear to move instantaneously
1161: between any pair of windows on the screen. Certainly the window the mouse was
1162: in should be notified of the mouse leaving, and the window the mouse is now in
1163: should be notified of the mouse entering. However, all of the windows "in
1164: between" in the hierarchy may also be interested in the transition. This is
1165: useful in simplifying the structure of some applications, and is necessary in
1166: implementing certain kinds of window managers and input managers. Thus, when
1167: the mouse moves from window A to window B, with window W as their closest
1168: (least) common ancestor, all ancestors of A below W also receive leave events,
1169: and all ancestors of B below W receive enter events.
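
The set of windows notified on such a transition can be computed from the two
ancestor chains. The sketch below is illustrative only: the "parent"
dictionary and function names are hypothetical, and the order in which events
are listed (leave events from A upward, then enter events downward to B) is
an assumption of the example, not something stated above. The case in which
one window is an ancestor of the other is also not covered by the text; the
sketch simply generates no event for W itself.

```python
def ancestor_chain(parent, window):
    """Return [window, parent(window), ..., root]."""
    chain = []
    while window is not None:
        chain.append(window)
        window = parent[window]
    return chain


def crossing_events(parent, a, b):
    """List the leave/enter events when the mouse moves from window a to b."""
    chain_a = ancestor_chain(parent, a)
    chain_b = ancestor_chain(parent, b)
    on_b = set(chain_b)
    # W is the first window on A's chain that also lies on B's chain.
    w = next(x for x in chain_a if x in on_b)
    leaves = chain_a[:chain_a.index(w)]   # A and its ancestors below W
    enters = chain_b[:chain_b.index(w)]   # B and its ancestors below W
    return ([("leave", x) for x in leaves] +
            [("enter", x) for x in reversed(enters)])


# Hypothetical hierarchy: child -> parent.
parent = {"root": None, "W": "root", "P": "W", "A": "P", "Q": "W", "B": "Q"}
print(crossing_events(parent, "A", "B"))
```

Here P and Q, the windows "in between", are notified along with A and B,
while W and the root are not.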
1170:
1171: Except for mouse motion events, it might be argued that events are infrequent
1172: enough that the server should always send all events to the client, and
1173: eliminate the complexity of selecting events. However, some applications are
1174: written with interrupt-driven input; events are received asynchronously, and
1175: cause the current computation to be suspended so that the input can be
1176: processed. For example, a text editor might use interrupt-driven input, with
1177: the normal computation being redisplay of the window. The receipt of
1178: extraneous input events (for example, key release events) can cause noticeable
1179: "hiccups" in such redisplay.
1180:
1181: @section(Input and Window Management)
1182:
1183: There are two basic modes of keyboard management: @i(real-estate) and
1184: @i(listener). In real-estate mode, the keyboard "follows" the mouse; keyboard
1185: input is directed to whatever window the mouse is in. In listener mode,
1186: keyboard input is directed to a specific window, independent of the mouse
1187: position. Some systems provide only real-estate mode@cite(apollo,sunwin), some
1188: only listener mode@cite(lucasfilm,sapphire,pnx,mex,mg1,genera), and
1189: Andrew@cite(wm) provides both, although the mode cannot be changed during a
1190: session. Both modes are supported in X, and the mode can be changed
1191: dynamically. Real-estate mode is the default behavior, with the root window as
1192: the focus window, as described in the previous section. An input manager can
1193: also make some other (typically top-level) window the focus window, yielding
1194: listener mode. Note, however, that in listener mode in X, the client
1195: controlling the focus window can still get real-estate behavior for subwindows,
1196: if desired; this capability has proven useful in several applications.
1197:
1198: The primary function of a window manager is reconfiguration: restacking,
1199: resizing, and repositioning top-level windows. The configuration of nested
1200: windows is assumed to be application-specific, and under control of the
1201: applications. There are two broad categories of window managers: manual and
1202: automatic. A manual window manager is "passive", and simply provides an
1203: interface to allow the user to manipulate the desktop; windows can be resized
1204: and reorganized at will. The initial size and position of a window typically
1205: (but not always) is under user or application control. Automatic window
1206: managers are "active", and operate for the most part without human interaction;
1207: size and position at window creation, and reconfiguration at window
1208: destruction, are chosen by the system. Automatic managers typically tile the
1209: screen with windows, such that no two windows overlap, automatically adjusting
1210: the layout as windows are created and destroyed. Andrew@cite(wm),
1211: Star@cite(star2), and Cedar@cite(cedar) provide automatic management, plus
1212: limited manual reconfiguration capability.
1213:
1214: Existing window managers for X are manual. Automatic management that is
1215: transparent to applications cannot be accomplished reasonably in X; future
1216: support for automatic management is discussed in Section 10. In the current X
1217: design, clients are responsible for initially sizing and placing their
1218: top-level windows, not window managers. In this way, applications continue to
1219: work when no window manager is present. Typically, the user either specifies
1220: geometry information in the application command line, or uses the mouse to
1221: sweep out a rectangle on the screen. (For the latter, the application grabs
1222: the mouse, as described below.)
1223:
1224: @subsection(Mouse-Driven Management)
1225:
1226: Existing managers are primarily mouse-driven, and are based on the ability to
1227: "steal" events. Specifically, a manager (or any other client) can @i(grab) a
1228: mouse button in combination with a set of modifier keys, with the following
1229: effect. Whenever the modifier keys are down and the button is pressed, the
1230: event is reported to the grabbing client, regardless of what window the mouse
1231: is in. All mouse-related events continue to be sent to that client until the
1232: button is released. As part of the grab, the client also specifies a mouse
1233: cursor to be used for the duration of the grab, and a window to be used as the
1234: event window. A manager specifies the root window as the event window when
1235: grabbing buttons; with the event propagation semantics described in Section 8,
1236: the grabbed events contain not only the global mouse coordinates, but also the
1237: top-level application window (if any) containing the mouse. This is sufficient
1238: information to manipulate top-level windows.
1239:
1240: Using this button-grab mechanism, several different management interfaces have
1241: been built, including a "programmable" interface@cite(uwm) allowing the user to
1242: assign individual commands or user-defined menus of commands to any number of
1243: button/modifier combinations. For example, a button click (press and release
1244: without intervening motion) might be interpreted as a command to raise or lower
1245: a window, or to attach the keyboard; a press/motion/release sequence might be
1246: interpreted as a command to move a window to a new position; or a button press
1247: might cause a menu to pop up, with the selection indicated by the mouse
1248: position at the release of the button. By allowing both specific commands and
1249: menus to be bound to buttons, a range of interfaces can be constructed to
1250: satisfy both "expert" and "novice" users.
1251:
1252: Another form of manager simply displays a static menu bar along the top of the
1253: screen, with items for such operations as moving a window and attaching the
1254: keyboard. The menu is used in combination with a mouse-grab primitive, with
1255: which a client can unilaterally grab the mouse and then later explicitly
1256: release it; during such a mouse-grab, events are redirected to the grabbing
1257: client, just as for button-grabs. When the user clicks on a menu bar item with
1258: any button, the manager unilaterally grabs the mouse. The user then uses the
1259: mouse to execute the specific command. For example, having clicked on the
1260: "move" item, the user indicates the window to move by placing the mouse in the
1261: window and pressing a button, then indicates the new position by moving the
1262: mouse and releasing the button. The manager then releases the mouse.
1263:
1264: @subsection(Icons)
1265:
1266: One important "resizing" operation performed by a window manager is
1267: transforming a window into a small icon and back again. In X, icons are merely
1268: windows. Transforming a window into an icon simply involves unmapping the
1269: window and mapping its associated icon. The association between a window and
1270: its icon is maintained in the server, rather than the window manager, and
1271: either the application or the manager can provide the icon. In this way, the
1272: manager can provide a default icon form for most clients, but clients can
1273: provide their own if desired, possibly with dynamic rather than static
1274: contents. The client is still insulated from management policy, even if it
1275: provides the icon: the manager is responsible for positioning, mapping, and
1276: unmapping the icon, and the client is responsible only for displaying the
1277: contents.
1278:
1279: The icon state is maintained in the server not only to allow clients to provide
1280: icons, but to avoid the loss of state if the window manager should terminate
1281: abnormally. When a window manager terminates, any windows it has created are
1282: destroyed, including icon windows. With knowledge of icons, the server can
1283: detect when an icon is destroyed, and automatically remap the associated client
1284: window. Without this, abnormal termination of the window manager would result
1285: in "lost" windows.
1286:
1287: @subsection(Race Conditions)
1288:
1289: There are many race conditions that must be dealt with in input and window
1290: management, due to the asynchronous nature of event handling. For example, if
1291: a manager attempts to grab the mouse in response to a press of a button, the
1292: mouse-grab request might not reach the server until after the button is
1293: released, and intervening mouse events would be missed. Or, if the user clicks
1294: on a window to attach the keyboard there, and then immediately begins typing,
1295: the first few keystrokes might occur before the manager actually responds to
1296: the click and the server actually moves the keyboard focus. A final example is
1297: a simple interface in which clicking on a window lowers it. Given a stack of
1298: three windows, the user might rapidly click twice in the same spot, expecting
1299: the top two windows to be lowered. Unless the first click is sent to the
1300: manager and the resulting request to lower is processed by the server before
1301: the second click takes place, the event window for the second click will be the
1302: same as for the first click, and the manager will lower the first window twice.
1303:
1304: A work-around for the last example, used by existing managers, is to ignore the
1305: event window reported in most events. Instead, the global mouse coordinates
1306: reported in the event are used in a follow-up query request to determine which
1307: top-level window now contains that coordinate. However, not all race
1308: conditions have acceptable solutions within the current X design. For a
1309: general solution, it must be possible for the manager to synchronize operations
1310: explicitly with event processing in the server. For example, a manager might
1311: specify that, at the press of a button, event processing in the server should
1312: cease until an explicit acknowledgment is received from the manager.
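
The click-to-lower race and its work-around can be reproduced with a small
simulation. This is a toy model under stated assumptions: the window stack is
a Python list with the top window first, all windows are assumed to cover the
clicked spot, and the function names are hypothetical.

```python
def lower(stack, window):
    """Move `window` to the bottom of the stack (index 0 is the top)."""
    stack.remove(window)
    stack.append(window)


def lower_using_event_window(stack, clicks):
    """Both clicks happen before the manager reacts, so the server reports
    the same event window (the window on top at click time) for each."""
    event_windows = [stack[0] for _ in range(clicks)]
    for window in event_windows:
        lower(stack, window)
    return stack


def lower_using_query(stack, clicks):
    """Work-around: ignore the event window and query, at handling time,
    which window now contains the click coordinates."""
    for _ in range(clicks):
        lower(stack, stack[0])
    return stack


print(lower_using_event_window(["A", "B", "C"], 2))
print(lower_using_query(["A", "B", "C"], 2))
```

In the first case window A is lowered twice and B is left on top, which is
the anomaly described above; in the second case A and then B are lowered, as
the user intended.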
1313:
1314: @section(Future)
1315:
1316: Based on critiques from numerous universities and commercial firms, a fairly
1317: extensive evaluation and redesign of the X protocol has been underway since May
1318: 1986. Our desire is to define a "core" protocol that can serve as a standard
1319: for window system construction over the next several years. We expect to
1320: present the rationale for this new design in the very near future, once it has
1321: been validated by at least a preliminary implementation. In this section, we
1322: highlight the major protocol changes.
1323:
1324: @subsection(Resource Allocation)
1325:
1326: Since the server is responsible for assigning identifiers to resources, each
1327: resource allocation currently requires a round-trip time to perform. For
1328: applications that allocate many resources, this causes a considerable start-up
1329: delay. For example, a multi-pane menu might consist of dozens of windows,
1330: numerous fonts, and several different mouse cursors, leading to a delay of one
1331: second or longer.
1332:
1333: In retrospect, this is the most significant defect in the design of X. To get
1334: around these delays, programming interfaces have been augmented to provide
1335: "batch mode" operations. If several resources must be created, but there are
1336: no inter-dependencies among the allocation requests, all of the requests are
1337: sent in a batch, and then all of the replies are received. This effectively
1338: reduces the delay to a single round-trip time.
1339:
1340: A better solution to this problem is to make clients generate the identifiers.
1341: When the client establishes a connection to the server, it is given a specific
1342: subrange from which it can allocate. This change will significantly improve
1343: start-up times without affecting applications, as identifiers can be generated
1344: inside low-level libraries without changing programming interfaces.
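
A client-side allocator of this kind might look like the following sketch.
The class name, the base value, and the size of the subrange are hypothetical;
the text says only that the client is given a specific subrange when the
connection is established.

```python
class IdAllocator:
    """Hands out resource identifiers from a subrange granted at connection time."""

    def __init__(self, base, count):
        self.next_id = base          # first identifier in the granted subrange
        self.limit = base + count    # one past the last usable identifier

    def allocate(self):
        """Return a fresh identifier without any round trip to the server."""
        if self.next_id >= self.limit:
            raise RuntimeError("identifier subrange exhausted")
        resource_id = self.next_id
        self.next_id += 1
        return resource_id


# Hypothetical subrange: 4 identifiers starting at 0x200000.
ids = IdAllocator(base=0x200000, count=4)
window_id = ids.allocate()
font_id = ids.allocate()
print(hex(window_id), hex(font_id))
```

Since the identifier is known before the creation request is sent, a library
can issue the request and continue immediately, hiding the allocator behind
the existing programming interface.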
1345:
1346: @subsection(Transparent Windows)
1347:
1348: One use of transparent windows is as clipping regions. However, they are
1349: unsatisfactory for this purpose because every coordinate in a graphics request
1350: must be translated by the client from the "real" window's origin to the
1351: transparent window's origin. A better approach to clipping regions is to allow
1352: clients to create clipping regions and attach them to all graphics requests.
1353: As noted in Section 6, X currently allows a clipping region in the form of a
1354: bitmap to be attached to a few graphics requests. Allowing a clipping region,
1355: specified either as a bitmap or a list of rectangles, to be attached to all
1356: graphics requests provides a more uniform mechanism.
1357:
1358: The major use of transparent windows to date is actually as inexpensive opaque
1359: windows. In the current server implementation, transparent windows can be
1360: created and transformed significantly faster than opaque windows. Because of
1361: this, transparent windows are often used when opaque windows would otherwise be
1362: adequate. We believe a new implementation of the server will improve the
1363: performance of opaque windows to the point that this will no longer be
1364: necessary.
1365:
1366: With explicit clipping regions added for graphics, and the performance
1367: advantages of transparent windows reduced, the only remaining use of
1368: transparent windows is for input (and cursor) control. Various applications
1369: want relatively fine-grained input control, and such control must not affect
1370: graphics output. Close control of cursor images and mouse motion events seems
1371: particularly important. However, the vast majority of the time, control is
1372: naturally associated with normal window boundaries, so it would be unwise to
1373: divorce input control completely from windows. As such, the new protocol
1374: provides "input-only" windows, which act like normal windows for the purposes
1375: of input and cursor control, but which cannot be used as a source or
1376: destination in graphics requests, and which are completely invisible as far as
1377: output is concerned.
1378:
1379: @subsection(Color)
1380:
1381: X originally was not designed to deal with direct-color displays. Direct-color
1382: displays typically have between 12 and 36 bits per pixel; the pixel value
1383: consists of three subfields, which are used as indexes into three independent
1384: color maps: one for red intensities, one for green, and one for blue. Some
1385: direct-color displays also have a fourth subfield, sometimes referred to as
1386: "z-channel" information, used to control attributes such as blending or chroma
1387: keying. We now understand how to incorporate direct-color displays without
1388: z-channel information into X, in such a way that the differences between
1389: direct-color and pseudo-color color maps need not be apparent to the
1390: application, yet still allowing all of the usual color map tricks to be played.
1391:
1392: At present there is only one color map for all applications, and color
1393: applications fail when this map gets full. Although dozens of applications
1394: typically can be run under X within a single 8-bit pseudo-color map, a single
1395: map is clearly unacceptable when dealing with small color maps, or with
1396: multiple applications (e.g., CAD tools) that need large portions of the color
1397: map. The solution is to support multiple virtual color maps, still permitting
1398: applications to coexist within any map, but allowing the possibility that not
1399: all applications show true color simultaneously. This also matches
1400: next-generation displays, which actually support multiple color maps in
1401: hardware@cite(rainbow).
1402:
1403: @subsection(Graphics)
1404:
1405: Perhaps the biggest mistake in the graphics area was failing to support fonts
1406: with kerning (side bearings). For example, a relatively complete emulation of
1407: the Andrew programming interface was built for X, but Andrew applications
1408: depend heavily on kerned fonts. There are other deficiencies that will be
1409: corrected. For example, large glyph-sets (e.g., Japanese) will be supported,
1410: as well as stippling (using a clip mask constructed by tiling a region with a
1411: bitmap). The notions of line width, join style, and end style found in
1412: PostScript@cite(postscript) are usually preferred to brush shapes for line
1413: drawing, and will be supported.
1414:
1415: In an attempt to support a wide range of devices, the exact path followed for
1416: lines and filled shapes was originally left undefined in X (the class of curve
1417: was not even specified). Different devices use slightly different algorithms
1418: to draw straight lines, and it seemed better to have high performance with
1419: minor variation than to have uniformity with poor performance. Relatively few
1420: devices support curve drawing in hardware, but some support it in firmware, and
1421: again performance seemed more important than accuracy. In retrospect, however,
1422: allowing such device-dependent behavior was a poor decision. The vast majority
1423: of applications draw lines aligned on an axis, and speed and precision are not
1424: an issue. The applications that do require complex shapes also require
1425: predictable results, so precise specifications are important.
1426:
1427: A notable feature missing in X is the ability to perform graphics operations
1428: off screen. The reasons for this are essentially the same as those presented
1429: when discussing exposures in Section 7. In particular, not all graphics
1430: co-processors can operate on host memory, and emulating such processors can be
1431: expensive. However, application builders have demanded this capability, and
1432: the demand appears to be sufficient leverage to convince server implementors to
1433: provide the capability. Off-screen graphics will be possible in the new
1434: protocol, although the amount of off-screen memory and its performance
1435: characteristics may vary widely. In addition, the protocol is being extended
1436: to allow the manipulation of both images and windows of varying depths. For
1437: example, a server might support depths of 1, 4, 8, 12, and 24 bits. This
1438: allows imaging applications to transmit data more compactly, allows for more
1439: efficient memory utilization in the server, and provides a match with
1440: next-generation display hardware.
1441:
1442: A common debate in graphics systems is whether and where to have state. Should
1443: parameters such as logic function, plane mask, source pixel value or tile,
1444: tiling origin, font, line width and style, and clipping region be explicit in
1445: every request or collected into a state object? The current X protocol is
1446: stateless, for the following reasons: both state-based and stateless programming
1447: interfaces can be built easily on top of the protocol; the currently supported
1448: graphics requests have just few enough parameters that they can be represented
1449: compactly; and the initial set of displays we were interested in (and the
1450: implementations we had in mind for them) would not benefit from the addition of
1451: state. However, we now believe that a state-based protocol is generally
1452: superior, as it handles complex graphics gracefully and allows significantly
1453: faster implementations on some displays.
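
The trade-off between the two styles can be made concrete with a toy encoding. The sketch below is not the X wire format: the field widths, the parameter subset, and the names encode_stateless_line, encode_state_based_line, and Server are all assumptions chosen for illustration.

```python
import struct


def encode_stateless_line(window, x1, y1, x2, y2,
                          function, plane_mask, pixel, line_width):
    # Stateless style: every drawing parameter travels with every request.
    return struct.pack("<I4hBIIH", window, x1, y1, x2, y2,
                       function, plane_mask, pixel, line_width)


def encode_state_based_line(window, state_id, x1, y1, x2, y2):
    # State-based style: only the target, a state object id, and geometry.
    return struct.pack("<II4h", window, state_id, x1, y1, x2, y2)


class Server:
    """Toy server that keeps drawing parameters in state objects."""

    def __init__(self):
        self.states = {}

    def create_state(self, state_id, **params):
        self.states[state_id] = dict(params)

    def change_state(self, state_id, **params):
        # Only the parameters that differ need to be sent again.
        self.states[state_id].update(params)

    def draw_line(self, window, state_id, x1, y1, x2, y2):
        state = self.states[state_id]
        return {"window": window, "from": (x1, y1), "to": (x2, y2), **state}


server = Server()
server.create_state(1, function=3, plane_mask=0xFF, pixel=5, line_width=1)
server.change_state(1, line_width=4)
drawn = server.draw_line(10, 1, 0, 0, 100, 0)

stateless_size = len(encode_stateless_line(10, 0, 0, 100, 0, 3, 0xFF, 5, 1))
state_based_size = len(encode_state_based_line(10, 1, 0, 0, 100, 0))
```

With only four parameters the stateless request is still compact, but each parameter added to the graphics model (tile, tiling origin, font, join style, clipping region) enlarges every stateless request, while the state-based request stays the same size.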
1454:
1455: @subsection(Management)
1456:
1457: An obvious interface style presently not supported in X is the ability to use
1458: the keyboard for management commands. To allow this, a key-grab mechanism,
1459: akin to the button-grab mechanism described in Section 9, will be provided. To
1460: allow such styles as using the first button click in a window to attach the
1461: keyboard, both button-grabs and key-grabs have been extended to apply to
1462: specific sub-hierarchies, rather than always to the entire screen. To handle
1463: the kinds of race conditions described in Section 9, a general event
1464: synchronization mechanism has been incorporated into the grab mechanisms.
1465:
1466: To support automatic window management, a manager must be able to intercept
1467: certain management requests from clients (such as mapping or moving a window)
1468: before they are executed by the server, and to be notified about others (such
1469: as unmapping a window) after they are executed. In addition, some managers
1470: want to provide uniform title bars and border decorations automatically. To
1471: allow this, it is useful to be able to "splice" hierarchies: to move a window
1472: from one parent to another. To allow input managers and window managers to be
1473: implemented as separate applications, the ability for multiple clients to
1474: select events on the same window is being added. For example, both a window
1475: manager and an input manager might be interested in the unmapping or
1476: destruction of a window.
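
The distinction between requests that are intercepted before execution and those that are reported afterwards can be sketched as follows. This is a simplified model, not the protocol's mechanism; the Server class, its method names, and the string client identifiers are hypothetical.

```python
class Server:
    """Toy server: map requests may be intercepted, unmaps are reported."""

    def __init__(self):
        self.mapped = set()
        self.redirect_client = None   # at most one manager intercepts
        self.redirect_handler = None
        self.notify_handlers = []     # any number of clients are notified

    def set_redirect(self, client, handler):
        self.redirect_client = client
        self.redirect_handler = handler

    def select_notify(self, handler):
        self.notify_handlers.append(handler)

    def map_window(self, client, window):
        # Requests from ordinary clients go to the manager instead of
        # being executed; the manager's own requests are executed.
        if self.redirect_handler and client != self.redirect_client:
            self.redirect_handler(window)
            return
        self.mapped.add(window)

    def unmap_window(self, client, window):
        # Executed immediately, then reported to every interested client.
        self.mapped.discard(window)
        for handler in self.notify_handlers:
            handler("unmap", window)


server = Server()
log = []


def manager_on_map(window):
    log.append(("manager saw map request", window))
    server.map_window("manager", window)  # manager decides to allow it


server.set_redirect("manager", manager_on_map)
server.select_notify(lambda kind, w: log.append(("window manager", kind, w)))
server.select_notify(lambda kind, w: log.append(("input manager", kind, w)))

server.map_window("app", "w1")
was_mapped = "w1" in server.mapped
server.unmap_window("app", "w1")
```

Here the window manager and the input manager are separate listeners on the same window, which is the situation that multiple-client event selection is meant to allow.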
1477:
1478: @subsection(Extensibility)
1479:
1480: The information that input and window managers might desire from applications
1481: is quite varied, and it would be a mistake to try to define a fixed set.
1482: Similarly, the information paths between applications (e.g., in support of "cut
1483: and paste") need to be flexible. To this end, we are adding a Lisp-ish
1484: property list@cite(CLtL) mechanism to windows, and the event mechanism is being
1485: augmented to provide a simple form of inter-client communication.
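
As a rough illustration of how a property list on a window could carry a "cut and paste" exchange, consider the sketch below. The Window class, its method names, and the property name CUT_BUFFER are assumptions made for this example and are not taken from the protocol.

```python
class Window:
    """Toy window with a property list and change notification."""

    def __init__(self):
        self.properties = {}
        self.watchers = []

    def select_property_events(self, callback):
        # Any number of clients may ask to hear about property changes.
        self.watchers.append(callback)

    def change_property(self, name, value):
        self.properties[name] = value
        for callback in self.watchers:
            callback(name)

    def get_property(self, name, default=None):
        return self.properties.get(name, default)


root = Window()
pasted = []


def paste_client(name):
    # The receiving client reacts only to the property it cares about.
    if name == "CUT_BUFFER":
        pasted.append(root.get_property(name))


root.select_property_events(paste_client)
root.change_property("CUT_BUFFER", "hello")     # the "cut" side
root.change_property("UNRELATED", 42)           # ignored by paste_client
```

Because property names and values are uninterpreted by the window itself, clients can agree on new conventions without any change to the base system.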
1486:
1487: The new X protocol explicitly continues to avoid certain areas, such as 3-D
1488: graphics and anti-aliasing. However, a general mechanism has been designed to
1489: allow extension libraries to be included in a server. The intention is that
1490: all servers implement the "core" protocol, but each server can provide
1491: arbitrary extensions. If an extension becomes widely accepted by the X
1492: community, it can be adopted as part of the core. Each extension library is
1493: assigned a global name, and an application can query the server at run-time to
1494: determine if a particular extension is present. Request opcodes and event
1495: types are allocated dynamically, so that applications need not be modified to
1496: execute in each new environment.
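
The benefit of naming extensions globally while allocating opcodes dynamically can be seen in a small model. The Server class, the starting opcode value of 128, and the extension names below are hypothetical and serve only to illustrate the idea.

```python
class Server:
    """Toy server that assigns request opcodes to extensions at load time."""

    FIRST_EXTENSION_OPCODE = 128  # assumed boundary above the core requests

    def __init__(self):
        self._next_opcode = self.FIRST_EXTENSION_OPCODE
        self._extensions = {}

    def load_extension(self, name, num_requests):
        # Reserve a contiguous block of opcodes for this extension.
        base = self._next_opcode
        self._next_opcode += num_requests
        self._extensions[name] = base
        return base

    def query_extension(self, name):
        # Returns the base opcode, or None if the extension is absent.
        return self._extensions.get(name)


# Two servers offering different sets of extensions.
server_a = Server()
server_a.load_extension("EXAMPLE-3D", num_requests=4)
server_a.load_extension("EXAMPLE-ANTIALIAS", num_requests=2)

server_b = Server()
server_b.load_extension("EXAMPLE-ANTIALIAS", num_requests=2)

# The same client code resolves the extension by name on each server.
opcode_on_a = server_a.query_extension("EXAMPLE-ANTIALIAS")
opcode_on_b = server_b.query_extension("EXAMPLE-ANTIALIAS")
missing_on_b = server_b.query_extension("EXAMPLE-3D")
```

The application never hard-codes an opcode; it asks by name at run-time and adapts to whatever the server reports, including the case where the extension is not present.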
1497:
1498: @section(Summary)
1499:
1500: The X Window System provides high-performance, high-level, device-independent
1501: graphics. A hierarchy of resizable, overlapping windows allows a wide variety
1502: of application and user interfaces to be built easily. Network-transparent
1503: access to the display provides an important degree of functional separation,
1504: without significantly affecting performance, that is crucial to building
1505: applications for a distributed environment. To a reasonable extent, desktop
1506: management can be custom tailored to individual environments, without modifying
1507: the base system and typically without affecting applications.
1508:
1509: To date, the X design and implementation effort has focused on the base window
1510: system, as described in this paper, and on essential applications and
1511: programming interfaces. The design of the network protocol, the design and
1512: implementation of the device-independent layer of the server, and the implementation of
1513: several applications and a prototype window manager, were carried out by the
1514: first author. The design and implementation of the C programming interface,
1515: the implementation of major portions of several applications, and the
1516: coordination of efforts within Project Athena and Digital, were carried out by
1517: the second author. In addition, many other persons from Project Athena, the
1518: Laboratory for Computer Science, and institutions outside MIT have contributed
1519: software.
1520:
1521: Necessary applications such as window managers and VT100 and Tektronix 4014
1522: terminal emulators have been created, and numerous existing applications, such
1523: as text editors and VLSI layout systems, have been ported to the X environment.
1524: Although several different menu packages have been implemented, we are only now
1525: beginning to see a rich library of tools (scroll bars, frames, panels, more
1526: menus, etc.) to facilitate the rapid construction of high-quality user
1527: interfaces. Tool building is taking place at many sites, and several
1528: universities are now attempting to unify window systems work with X as a base,
1529: so that such tools can be shared.
1530:
1531: The use of X has grown far beyond anything we had imagined. Digital has
1532: incorporated X into a commercial product, and other manufacturers are following
1533: suit. With the appearance of such products, and the release of complete X
1534: sources on the Berkeley 4.3 Unix distribution tapes, it is no longer feasible
1535: to track all X use and development. Existing applications written in C are
1536: known to have been ported to seven machine architectures from more than twelve
1537: manufacturers, and the C server to six machine architectures and more than
1538: sixteen display architectures. In most cases the code is running under Unix,
1539: but other operating systems are also involved. In addition, relatively
1540: complete server implementations exist in two Lisp dialects. Apart from
1541: designing the system to be portable, a large part of this success is due to
1542: MIT's decision to distribute X sources without any licensing restrictions, and
1543: the willingness of people in both educational and commercial institutions to
1544: contribute code without restrictions.
1545:
1546: @b(Acknowledgments)
1547:
1548: Our thanks go to the many people who have contributed to the success of X.
1549: Particular thanks go to those who have made significant contributions to the
1550: non-proprietary implementation: Paul Asente (Stanford University), Scott Bates
1551: (Brown University), Mike Braca (Brown), Dave Bundy (Brown), Dave Carver
1552: (Digital), Tony Della Fera (Digital), Mike Gancarz (Digital), James Gosling
1553: (Sun Microsystems), Doug Mink (Smithsonian Astrophysical Observatory), Bob
1554: McNamara (Digital), Ron Newman (MIT), Ram Rao (Digital), Dave Rosenthal (Sun),
1555: Dan Stone (Brown), Stephen Sutphen (University of Alberta), and Mark
1556: Vandevoorde (MIT).
1557:
1558: Special thanks go to Digital Equipment Corporation. A redesign of the protocol
1559: and a reimplementation of the server to deal with color and to increase
1560: performance were made possible with funding (in the form of hardware) from
1561: Digital. To their credit, all of the resulting device-independent code
1562: remained the property of MIT.