@device(postscript)
@make(article)
@style(references=cacm)
@set(page=+1)

@majorheading(The X Window System)
@center(Robert W. Scheifler@footnote( 545 Technology Square, Cambridge, MA 02139.)
MIT Laboratory for Computer Science

Jim Gettys@footnote( Project Athena, MIT, Cambridge, MA 02139.)
Digital Equipment Corporation
MIT Project Athena

July 1986
Revised October 1986@footnote( To appear in Transactions on Graphics #63,
Special Issue on User Interface Software, Copyright 1986,
Association for Computing Machinery. Permission to copy without fee all or
part of this material is granted provided that the copies are not made or
distributed for direct commercial advantage, the ACM copyright notice and the
title of the publication and its date appear,
and notice is given that copying is by permission of the Association for
Computing Machinery.
To copy otherwise, or to republish requires a fee and/or specific permission.)

@blankspace(2 lines)

@begin(abstract)

An overview of the X Window System is presented, focusing on the system
substrate and the low-level facilities provided to build applications and to
manage the desktop. The system provides high-performance, high-level,
device-independent graphics. A hierarchy of resizable, overlapping windows
allows a wide variety of application and user interfaces to be built easily.
Network-transparent access to the display provides an important degree of
functional separation, without significantly affecting performance, that is
crucial to building applications for a distributed environment. To a
reasonable extent, desktop management can be custom tailored to individual
environments, without modifying the base system and typically without affecting
applications.

Categories and Subject Descriptors: C.2.2 [@b(Computer-Communication Networks)]:
Network Protocols - @i(protocol architecture); C.2.4 [@b(Computer-Communication
Networks)]: Distributed Systems - @i(distributed applications); D.4.4 [@b(Operating
Systems)]: Communication Management - @i(network communication, terminal management);
H.1.2 [@b(Information Systems)]: User/Machine Systems - @i(human factors); I.3.2
[@b(Computer Graphics)]: Graphic Systems - @i(distributed/network graphics);
I.3.4 [@b(Computer Graphics)]: Graphics Utilities - @i(graphics packages, software
support); I.3.6 [@b(Computer Graphics)]: Methodology and Techniques - @i(device
independence, interaction techniques)

General terms: Design, Experimentation, Human Factors, Standardization

Additional Key Words and Phrases: window systems, window managers, virtual terminals

@end(abstract)

@section(Introduction)

The X Window System (or simply X) developed at MIT has achieved fairly
widespread popularity recently, particularly in the Unix@footnote( Unix is a
trademark of AT&T Bell Laboratories.) community. In this paper, we present an
overview of X, focusing on the system substrate and the low-level facilities
provided to build applications and to manage the desktop. In X, this base
window system provides high-performance graphics to a hierarchy of resizable
windows. Rather than mandating a particular user interface, X provides
primitives to support several policies and styles. Unlike most window systems,
the base system in X is defined by a @i(network protocol): asynchronous
stream-based inter-process communication replaces the traditional procedure
call or kernel call interface. An application can utilize windows on any
display in a network in a device-independent, network-transparent fashion.
Interposing a network connection greatly enhances the utility of the window
system, without significantly affecting performance. The performance of
existing X implementations is comparable to contemporary window systems, and in
general is limited by display hardware rather than network communication. For
example, 19500 characters per second and 3500 short vectors per second are
possible on Digital Equipment Corporation's VAXStation-II/GPX, both locally and
over a local area network, and these figures are very close to the limits of
the display hardware.

X is the result of the simultaneous need for a window system from two separate
groups at MIT. In the summer of 1984, the Argus system@cite(argus) at the
Laboratory for Computer Science needed a debugging environment for
multi-process distributed applications, and a window system seemed the only
viable solution. Project Athena@cite(athena) was faced with dozens, and
eventually thousands, of workstations with bitmap displays, and needed a window
system to make the displays useful. Both groups were starting with the Digital
VS100 display@cite(vs100) and VAX hardware, but it was clear at the outset that
other architectures and displays had to be supported. In particular, equal
numbers of IBM workstations with bitmap displays of unknown type were expected
eventually within Project Athena. Portability was therefore a goal from the
start. Although all of the initial implementation work was for Berkeley Unix,
it was clear that the network protocol should not depend on aspects of the
operating system.

The name X derives from the lineage of the system. At Stanford University,
Paul Asente and Brian Reid had begun work on the W window system@cite(w), as an
alternative to VGTS@cite(vgts1,vgts2) for the V system@cite(v). Both VGTS and
W allow network-transparent access to the display, using the synchronous V
communication mechanism. Both systems provide "text" windows for ASCII
terminal emulation. VGTS provides graphics windows driven by fairly high-level
object definitions from a structured display file; W provides graphics windows
based on a simple display-list mechanism, with limited functionality. We
acquired a Unix-based version of W for the VS100 (with synchronous
communication over TCP@cite(tcp)) done by Asente and Chris Kent at Digital's
Western Research Laboratory. From just a few days of experimentation, it was
clear that a network-transparent hierarchical window system was desirable, but
that restricting the system to any fixed set of application-specific modes was
completely inadequate. It was also clear that, although synchronous
communication was perhaps acceptable in the V system (due to very fast
networking primitives), it was completely inadequate in most other operating
environments. X is our "reaction" to W. The X window hierarchy comes directly
from W, although numerous systems have been built with hierarchy in at least
some form@cite(lucasfilm,star1,lispm,sunwin,mg1,genera,cedar,metheus,tajo).
The asynchronous communication protocol used in X is a significant improvement
over the synchronous protocol used in W, but is very similar to that used in
Andrew@cite(wm,andrew). X differs from all of these systems in the degree to
which both graphics functions and "system" functions are pushed back (across
the network) as application functions, and in the ability to transparently
tailor desktop management.

The next section presents several high-level requirements that we believe a
window system must satisfy to be a viable standard in a network environment,
and indicates where the design of X fails to meet some of these requirements.
In Section 3 we describe the overall X system model, and the effect of
network-based communication on that model. Section 4 describes the structure
of windows, and the primitives for manipulating that structure. Section 5
explains the color model used in X, and Section 6 presents the text and
graphics facilities. Section 7 discusses the issues of window exposure and
refresh, and their resolution in X. Section 8 deals with input event handling.
In Section 9, we describe the mechanisms for desktop management.

This paper describes the version@footnote( Version 10.) of X that is currently
in widespread use. The design of this version is inadequate in several
respects. With our experience to date, and encouraged by the number of
universities and manufacturers taking a serious interest in X, we have designed
a new version that should satisfy a significantly wider community. Section 10
discusses a number of problems with the current X design, and gives a general
idea of what changes are contemplated.

@section(Requirements)

A window system contains many interfaces. A @i(programming) interface is a
library of routines and types provided in a programming language for
interacting with the window system. Both low-level (e.g., line drawing) and
high-level (e.g., menus) interfaces are typically provided. An @i(application)
interface is the mechanical interaction with the user and the visual appearance
that is specific to the application. A @i(management) interface is the
mechanical interaction with the user dealing with overall control of the
desktop and the input devices. The management interface defines how
applications are arranged and rearranged on the screen, and how the user
switches between applications; an individual application interface defines how
information is presented and manipulated within that application. The @i(user)
interface is the sum total of all application and management interfaces.

Besides applications, we distinguish three major components of a window system.
The @i(window manager)@footnote( Some people use this term for what we call the
base window system; that is not the meaning here.) implements the desktop
portion of the management interface; it controls the size and placement of
application windows, and also may control application window attributes such as
titles and borders. The @i(input manager) implements the remainder of the
management interface; it controls which applications see input from which
devices (e.g., keyboard and mouse). The @i(base window system) is the
substrate on which applications, window managers, and input managers are built.

In this paper we are concerned with the base window system of X, with the
facilities it provides to build applications and managers. The following
requirements on the base window system crystallized during the design of X (a
few were not formulated until late in the design process):

@begin(enumerate)

@begin(multiple)

The system should be implementable on a variety of displays.

The system should work with nearly any bitmap display, and a variety of input
devices. Our design focused on workstation-class display technology likely to
be available in a university environment over the next few years. At one end
of the spectrum is a simple frame buffer and monochrome monitor, driven
directly by the host CPU with no additional hardware support. At the other end
of the spectrum is a multi-plane display with color monitor, driven by a
high-performance graphics co-processor. Input devices such as keyboards, mice,
tablets, joysticks, light pens, and touch screens should be supported.

@end(multiple)
@begin(multiple)

Applications must be device independent.

There are several aspects to device independence. Most importantly, it must
not be necessary to rewrite, recompile, or even relink an application for each
new hardware display. Nearly as important, every graphics function defined by
the system should work on virtually every supported display; the alternative,
which is to use GKS-style inquire operations@cite(gks) to determine the set of
implemented functions at run-time, leads to tedious case analysis in every
application, and to inconsistent user interfaces. A third aspect of device
independence is that, as far as possible, applications should not need dual
control paths to work on both monochrome and color displays.

@end(multiple)
@begin(multiple)

The system must be network transparent: an application running on one
machine must be able to utilize a display on some other machine. The two
machines should not have to have the same architecture or operating system.

There are numerous examples of why this is important: a compute-intensive VLSI
design program executing on a mainframe, but displaying results on a
workstation; an application distributed over several stand-alone processors,
but interacting with a user at a workstation; a professor running a program on
one workstation, presenting results simultaneously on all student workstations.

In a network environment, there are certain to be applications that must run on
particular machines or architectures. Examples include proprietary software,
applications depending on specific architectural properties, and programs
manipulating large databases. Such applications still should be accessible to
all users. In a truly heterogeneous environment, not all programming languages
and programming systems are supported on all machines, and it is very
undesirable to have to write an interactive front end in multiple languages in
order to make the application generally available. With network-transparent
access, this is not necessary; a single front end written in the same language
as the application suffices.

One might think that remote display will be extremely infrequent, and that
performance therefore is much less important than for local display.
Experience at MIT, however, indicates that many users routinely make use of the
remote display capabilities in X, and that the performance of remote display is
quite important. The desktop display, although physically connected to a
single computer, is used as a true @i(network virtual terminal); indeed, the
idea of an X server (see the next section) built into a Blit-like
terminal@cite(blit) is an intriguing one.

@end(multiple)
@begin(multiple)

The system must support multiple applications displaying concurrently.

For example, it should be possible to display a clock with a sweep second hand
in one window, while simultaneously editing a file in another window.

@end(multiple)
@begin(multiple)

The system should be capable of supporting many different application and
management interfaces.

No single user interface is "best"; different communities have radically
different ideas about user interfaces. Even within a single community,
"experts" and "novices" place different demands on an interface. Rather than
mandating a particular user interface, the base window system should support a
wide range of interfaces.

To achieve this, the system must provide @i(hooks) (mechanism) rather than
@i(religion) (policy). For example, since menu styles and semantics vary
dramatically among different user interfaces, the base window system must
provide primitives from which menus can be built, rather than just providing a
fixed menu facility.

The system should be designed in such a way that it is possible to implement
management policy both external to the base window system and external to
applications. Applications should be largely independent of management policy
and mechanism; applications should @i(react to) management decisions, rather
than @i(directing) those decisions. For example, an application needs to be
informed when one of its windows is resized, and should react by reformatting
the information displayed, but involvement of the application should not be
required in order for the user to change the size. Making applications
management-independent, as well as device-independent, facilitates the sharing
of applications between diverse cultures.

@end(multiple)
@begin(multiple)

The system must support overlapping windows, including output to partially
obscured windows.

This is in some sense a by-product of the previous requirement, but is
important enough to merit explicit statement. Not all user interfaces allow
windows to overlap arbitrarily. However, even interfaces that do not allow
application windows to overlap typically provide some form of pop-up menu that
overlaps application windows. If such menus are built from windows, then
support for overlapping windows must exist.

@end(multiple)
@begin(multiple)

The system should support a hierarchy of resizable windows, and an application
should be able to use many windows at once.

Subwindows provide a clean, powerful mechanism for exporting much of the basic
system machinery back to the application for direct use. Many applications
make use of their own window-like abstractions; some even implement what is
essentially another window system, nested within the "real" window system. It
is important to support arbitrary levels of nesting. What is viewed as a
single window at one abstraction level may well require multiple subwindows at
a lower level. By providing a true window hierarchy, application windows can
be implemented as true windows within the system, freeing the application from
duplicating machinery such as clipping and input control.

@end(multiple)
@begin(multiple)

The system should provide high-performance, high-quality support for text,
2-D synthetic graphics, and imaging.

The base window system must provide "immediate" or "transparent" graphics: the
application describes the image precisely, and the system does not attempt to
second-guess the application. The use of high-level models, whereby the
application describes @i(what) it wants in terms of fairly abstract objects and
the system determines @i(how) best to render the image, cannot be imposed as
the only form of graphics interface. Such models generally fail to provide
adequate support for some important class of applications, and different user
communities tend to have strong opinions about which model is "best".
High-level models are extremely important to provide, but they should be built
in layers on top of the base window system.

Support for 3-D graphics is not listed as a requirement, but this is not to say
it is unimportant.
We simply have not considered 3-D graphics, due to lack of
expertise and lack of time.

@end(multiple)
@begin(multiple)
The system should be extensible.

For example, the core system may not support 3-D graphics, but it should be
possible to extend the system with such support. The extension mechanism
should allow communities to extend the system non-cooperatively, yet allow such
independent extensions to be merged gracefully.

@end(multiple)
@end(enumerate)

We believe that a window system must satisfy these requirements to be a viable
standard in an environment of high-performance workstations and mainframes
connected via high-performance local area networks. X satisfies most of these
requirements, but currently fails to satisfy a few due to practical
considerations of staffing and time constraints: the design and much of the
implementation of the base window system were to be handled solely by the first
author; it was important to get a working system up fairly quickly; and the
immediate applications only required relatively simple text and graphics
support. As a result, X is not designed to handle high-end color displays or
to deal with input devices other than a keyboard and mouse; some support for
high-quality text and graphics is missing; X only provides support for one
class of management policy; and no provision has been made for extensions. As
discussed in Section 10, these and other problems are being addressed in a
redesign of X.

@begin(fullpagefigure)
@blankspace(7 inches)
@caption(System Structure)
@end(fullpagefigure)

@section(System Model)

The X window system is based on a client-server model; this model follows
naturally from requirements two and three in the previous section. For each
physical display, there is a controlling server. A client application and a
server communicate over a reliable duplex (8-bit) byte stream. A simple block
stream protocol is layered on top of the byte stream. If the client and server
are on the same machine, the stream is typically based on a local inter-process
communication (IPC) mechanism, and otherwise a network connection is
established between the pair. Requiring nothing more than a reliable duplex
byte stream (without urgent data) for communication makes X usable in many
environments. For example, the X protocol can be used over TCP@cite(tcp),
DECnet@cite(decnet), and Chaos@cite(chaos).

Multiple clients can have connections open to a server simultaneously, and a
client can have connections open to multiple servers simultaneously. The
essential tasks of the server are to multiplex requests from clients to the
display, and demultiplex keyboard and mouse input back to the appropriate
clients. Typically, the server is implemented as a single sequential process,
using round-robin scheduling among the clients, and this centralized control
trivially solves many synchronization problems; however, a multi-process server
has also been implemented. Although one might place the server in the kernel
of the operating system in an attempt to increase performance, a user-level
server process is vastly easier to debug and maintain, and performance under
Unix in fact does not seem to suffer. Similar performance results have been
obtained in Andrew@cite(wm). Various tricks are used in both clients and
server to optimize performance, principally by minimizing the number of
operating system calls@cite(hacks).

The server encapsulates the base window system. It provides the fundamental
resources and mechanisms, and the hooks required to implement various user
interfaces. All device dependencies are encapsulated by the server; the
communication protocol between clients and the server is device independent.
By placing all device dependencies on one end of a network connection,
applications are truly device independent. The addition of a new display type
simply requires the addition of a new server implementation; no application
changes are required. Of course, the server itself is designed as
device-independent code layered on top of a device-dependent core, so only the "back
end" of the server need be reimplemented for each new display.@footnote( A back
end has been implemented using a programming interface to X itself, such that a
complete "recursive" X server executes inside a window of another X server.)

@subsection(Network Considerations)

It is extremely important for the server to be robust with respect to client
failures. The server, and the network protocol, must be designed so that the
server never trusts clients to provide correct data. As a corollary, the
protocol must be designed in such a way that, if the server ever has to wait
for a response from a client, it must be possible to continue servicing other
clients. Without this property, a buggy client or a network failure could
easily cause the entire display to freeze up.

Byte ordering is a standard problem in network communication: when a 16-bit or
32-bit quantity is transmitted over an 8-bit byte stream, is the most
significant byte transmitted first (big-endian byte order) or is the least
significant byte transmitted first (little-endian byte order)? Some machines
with byte-addressable memory use big-endian order internally, and others use
little-endian order.
If a single order is chosen for network communication,
some machines will suffer the overhead of swapping bytes, even when
communicating with a machine using the same internal byte order. Such an
approach also means that both parties in the communication must worry about
byte order.

The X protocol uses a different approach. The server is designed to accept
both big-endian and little-endian connections. For example, using TCP this is
accomplished by having the server listen on two distinct ports; little-endian
clients connect to the server on one port, and big-endian clients connect on
the other. Clients always transmit and receive in their native byte order.
The server alone is responsible for byte swapping, and byte swapping only
occurs between dissimilar architectures. This eliminates the byte swapping
overhead in the most common situations, and greatly simplifies the building of
client-side interface libraries in various programming languages. X is not
unique in its use of this trick; the current VGTS implementation uses the same
trick, and similar protocol optimizations have been used in various
network-based applications.

Another potential problem in protocol design is word alignment. In particular,
some architectures require 16-bit quantities to be aligned on 16-bit boundaries
and 32-bit quantities to be aligned on 32-bit boundaries in memory. To allow
efficient implementations of the protocol across a spectrum of 16-bit and
32-bit architectures, the protocol is defined to consist of blocks that are
always multiples of 32 bits, and each 16-bit and 32-bit quantity within a block
is aligned on 16-bit and 32-bit boundaries, respectively.

X is designed to operate in an environment where the inter-process
communication round-trip time is between 5 and 50 milliseconds, both for local
and for network communication. We also assume that data transmission rates are
comparable to display rates; for example, to transmit and display 5000
characters per second, a data rate of approximately 50Kb (kilobits per second)
will be needed, and to transmit and display 20000 characters per second, a data
rate of approximately 200Kb will be needed. Networks and protocol
implementations with these characteristics are now quite commonplace. For
example, workstations running Berkeley Unix, connected via 10Mb (megabits per
second) local area networks, typically have round-trip times of 15 to 30
milliseconds, and data rates of 500Kb to 1Mb.

The round-trip time is important in determining the form of the communication
protocol. The most common communication will be text and graphics requests
sent from a client to the server. Examples of individual requests might be to
draw a string of text or to draw a line. Such requests could be sent either
synchronously, in which case the client sends a request only after receiving a
reply from the server to the previous request, or they could be sent
asynchronously, without the server generating any replies. However, since the
requests are sent over a reliable stream, they are guaranteed to arrive, and
arrive in order, so replies from the server to graphics requests serve no
useful purpose. Moreover, with round-trip times over 5 milliseconds, output to
the display must be asynchronous, or it will be impossible to drive high-speed
displays adequately. For example, at 80 characters per request and a 25
millisecond round-trip time, only 3200 characters per second can be drawn
synchronously, whereas many hardware devices are capable of displaying between
5000 and 30000 characters per second.

Similarly, polling the server for keyboard and mouse input would be
unacceptable in many applications, particularly those written in sequential
languages. For example, an application attempting to provide real-time
response to input has to poll periodically for input during screen updates.
For an application with a single thread of control, this effectively results in
synchronous output, and consequent performance loss. Hence, input must be
generated asynchronously by the server, so that applications need at most
perform local polling.

The round-trip time is also important in determining what user interfaces can
be supported without embedding them directly in the server. The most important
concern is whether remote, application-level mouse tracking is feasible. By
@i(tracking), we do not mean maintaining the cursor image on the screen as the
user moves the mouse; that function is performed autonomously by the X server,
often directly in hardware. Rather, applications track the mouse by animating
some other image on the screen in real time as the mouse moves. For round-trip
times under 50 milliseconds, tracking is perfectly reasonable, driven either by
motion events generated by the server or by continuous polling from the
application. With a refresh occurring up to 30 times every second, remote
tracking is demonstrably "instantaneous" with mouse motion.

For tracking to be effective, however, relatively little time can be spent
updating the display at each movement, so typically only relatively small
changes can be made to the screen while tracking. This is certainly the case
for common operations, such as rubber banding window outlines and highlighting
menu items. It might be argued that the ability to run application-specific
code in the server is required for acceptable hand-eye coordination during
complex tracking. For example, NeWS@cite(news) provides such a mechanism in a
novel way. However, we are not convinced there are sufficient benefits to
justify such complexity. Complex tracking typically is bound up intimately
with application-specific data structures and knowledge representations, and
such information is used by the "back end" of the application as well as the
"front end". In a distributed system it is folly to believe that applications
will download large front ends into a server; communication round-trip times
are a reality that cannot be escaped.

@subsection(Resources)

The basic resources provided by the server are windows, fonts, mouse cursors,
and off-screen images; later sections describe each of these. Clients request
creation of a resource by supplying appropriate parameters (such as the name of
the font); the server allocates the resource and returns a 31-bit unique
identifier used to represent it. The use and interpretation of a resource
identifier is independent of any network connection. Any client that knows (or
guesses) the identifier for a resource can use and manipulate the resource
freely, even if it was created by another client. This capability is required
to allow window managers to be written independently of applications, and to
allow multi-process applications to manipulate shared resources. However, to
avoid problems associated with clients that fail to clean up their resources at
termination (which is all too common in operating systems where users can
unilaterally abort processes), the maximum lifetime of a resource is always
tied to the connection over which it was created. Thus, when a client
terminates, all of the resources it created are destroyed automatically.

Access control is performed only when a client attempts to establish a
connection to the server; once the connection is established the client can
freely manipulate any resource. Since accidental manipulation of some other
client's resource is extremely unlikely (both in theory and in practice), we
believe introducing access control on a per-resource basis would only serve to
decrease performance, not to significantly increase security or robustness.
The current access control mechanism is based simply on host network addresses,
as this information is provided by most network stream protocols, and there
seems to be no widely used or even widely available user-level authentication
mechanism. Host-based access control has proven to be marginally acceptable in
a workstation environment, but is rather unacceptable for time-shared
machines.@footnote( It is interesting that @i(professors) at MIT have argued
vociferously to disable all access control.)

Each client-generated protocol request is a simple data block consisting of an
opcode, some number of fixed-length parameters, and possibly a variable-length
parameter. For example, to display text in a window, the fixed-length
parameters include the drawing color and the identifiers for the window and the
font, and the variable-length parameter is the string of characters. All
operations on a resource explicitly contain the identifier of the resource as a
parameter. In this way, an application can multiplex use of many windows over
a single network connection. This multiplexing makes it easy for the client to
control the time-order of updates to multiple windows. Similarly, each input
event generated by the server contains the identifier of the window in which
the event occurred. Multiplexing over a single stream allows the client to act
on events from multiple windows in correct time order; timestamps alone are
inadequate without strong guarantees from the stream mechanism.

Numerous Unix-based window
systems@cite(masscomp,andrew,sapphire,pnx,sunwin,mg1,metheus) use file or
channel descriptors to represent windows; window creation involves an
interaction with the operating system, which results in the creation of such a
descriptor. Typically, this means the window cannot be named (and hence cannot
be shared) by programs running on different machines, and perhaps not even by
programs running on the same machine. More serious, there is often a severe
restriction on the number of active descriptors a process may have: 20 on
older systems and usually 64 on newer systems. The use of 50 or more windows
(albeit nested inside a single top-level window) is quite common in X
applications. The use of a single connection, over which an arbitrary number
of windows can be multiplexed, is clearly a better approach.

@section(Window Hierarchy)

The server supports an arbitrarily branching hierarchy of rectangular windows.
At the top is the @i(root) window, which covers the entire screen. The
@i(top-level) windows of applications are created as subwindows of the root
window. The window hierarchy models the now-familiar "stacks of papers"
desktop. For a given window, its subwindows can be stacked in any order, with
arbitrary overlaps. When window W1 partially or completely covers window W2,
we say that W1 @i(obscures) W2. This relationship is not restricted to
siblings; if W1 obscures W2, then W1 may also obscure subwindows of W2. A
window also obscures its parent. Window hierarchies never interleave; if
window W1 obscures sibling window W2, then subwindows of W2 never obscure W1 or
subwindows of W1. A window is not restricted in size or placement by the
boundaries of its parent, but a window is always visibly clipped by its parent:
portions of the window that extend outside the boundaries of the parent are
never displayed, and do not obscure other windows. Finally, a window can be
either @i(mapped) or @i(unmapped). An unmapped window is never visible on the
screen; a mapped window can only be visible if all of its ancestors are also
mapped.

Output to a leaf window (one with no subwindows) is always clipped to the
visible portions of the window; drawing on such a window never draws into
obscuring windows. Output to a window that contains subwindows can be
performed in two modes. In @i(clipped) mode the output is clipped normally by
all obscuring windows (including subwindows), but in @i(draw-through) mode the
output is not clipped by subwindows. For example, draw-through mode is used on
the root window during window management, tracking the mouse with the outline
of a window to indicate how the window is to be moved or resized. If clipped
mode were used instead, the entire outline would not be visible.

The coordinate system is defined with the X axis horizontal and the Y axis
vertical. Each window has its own coordinate system, with the origin at the
upper left corner of the window. Having per-window coordinate systems is
crucial, particularly for top-level windows; applications are almost always
designed to be insensitive to their position on the screen, and having to worry
about race conditions when moving windows would be a disaster. The coordinate
system is discrete: each pixel in the window corresponds to a single unit in
the coordinate system, with coordinates centered on the pixels, and all
coordinates are expressed as integers in the protocol. We believe fractional
coordinates are not required at the protocol level for the raster graphics
provided in X (see section 6), although they may be required for high-end color
graphics, such as anti-aliasing. The aspect ratio of the screen is not masked
by the protocol, since we believe that most displays have a one to one aspect
ratio; in this regard X is arguably device dependent.

Although the coordinate system is discrete at the protocol level, continuous or
alternate-origin coordinate systems certainly can be used at the application
level, but client-side libraries must eventually translate to the discrete
coordinates defined by the protocol. In this way, we can ignore the many
variations in floating-point (or even fixed-point) formats among architectures.
Further, the coordinates can be expressed in the protocol as 16-bit quantities,
which can be manipulated efficiently in virtually every machine/display
architecture, and which minimize the number of data bytes transmitted over the
network. The use of 16-bit quantities does have a drawback, in that some
applications (particularly CAD tools) like to perform zoom operations simply by
scaling coordinates and redrawing, relying on the window system to clip
appropriately. Since scaling quickly overflows 16 bits, additional clipping
must be performed explicitly by such applications.

A window can optionally have a @i(border), a shaded outer frame maintained
explicitly by the X server. The origin of the window's coordinate system is
inside the border, and output to the window is clipped automatically so as not
to extend into the border. The presence of borders slightly complicates the
semantics of the window system; for simplicity we will ignore them in the
remainder of this paper.

The basic operations on window structure are straightforward. An unmapped
window is created by specifying the parent window, the position within the
parent of the upper left corner of the new window, and the width and height (in
coordinate units) of the new window. A window can be destroyed, in which case
all windows below it in the hierarchy are also destroyed. A window can be
mapped and unmapped, without changing its position. A window can be moved and
resized, including being moved and resized simultaneously. A window can also
be "depthwise" raised to the top or lowered to the bottom of the stack with
respect to its siblings, without changing its coordinate position. Currently,
mapping or configuring a window forces the window to be raised. This
restriction appeared to simplify the server implementation, but also happened
to match the basic management interface we expected to build. This restriction
will be eliminated in the next version.

The windows described above are the usual @i(opaque) windows. X also provides
@i(transparent) windows. A transparent window is always invisible on the
screen, and does not obscure output to, or visibility of, other windows.
Output to a transparent window is clipped to that window, but is actually drawn
on the parent window. Thus, for output, a transparent window is simply a
clipping rectangle that can be applied to restrict output within a (parent)
window. Input processing for transparent and opaque windows is identical, as
described in Section 8. In Section 10 we will argue that most uses of
transparent windows are better satisfied with other mechanisms. Therefore, for
simplicity, we will ignore transparent windows in the rest of this paper.

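The structural rules described in this section (windows are created unmapped,
a mapped window can be visible only if all of its ancestors are mapped,
destroying a window destroys everything below it, and each window has its own
coordinate origin) can be illustrated with a minimal sketch. This is not the X
protocol or any X library interface: the class @i(Window) and all of its
method names are hypothetical, the stacking order is modelled as a simple
bottom-to-top list, and the sketch assumes (arbitrarily) that a new window
starts on top of its siblings. Borders, transparent windows, and the current
rule that mapping forces a raise are deliberately left out.

```python
class Window:
    """Hypothetical model of one window in the hierarchy (not an X API)."""

    def __init__(self, parent, x, y, width, height):
        self.parent = parent
        self.x, self.y = x, y                  # upper left corner, in parent coordinates
        self.width, self.height = width, height
        self.mapped = False                    # windows are created unmapped
        self.destroyed = False
        self.children = []                     # siblings, bottom-to-top
        if parent is not None:
            parent.children.append(self)       # assumption: new windows start on top

    def viewable(self):
        """True only if this window and every ancestor are mapped."""
        w = self
        while w is not None:
            if not w.mapped:
                return False
            w = w.parent
        return True

    def raise_to_top(self):
        """Move to the top of the sibling stack; position is unchanged."""
        if self.parent is not None:
            self.parent.children.remove(self)
            self.parent.children.append(self)

    def lower_to_bottom(self):
        """Move to the bottom of the sibling stack; position is unchanged."""
        if self.parent is not None:
            self.parent.children.remove(self)
            self.parent.children.insert(0, self)

    def destroy(self):
        """Destroy this window and all windows below it in the hierarchy."""
        for child in list(self.children):
            child.destroy()
        self.destroyed = True
        if self.parent is not None and self in self.parent.children:
            self.parent.children.remove(self)

    def origin_on_screen(self):
        """Origin of this window's coordinate system, in root coordinates."""
        ox, oy, w = 0, 0, self
        while w is not None:
            ox, oy = ox + w.x, oy + w.y
            w = w.parent
        return ox, oy
```

In this sketch, moving a top-level window changes only its own @i(x) and
@i(y); coordinates used inside the window and its subwindows are unaffected,
which is the property the per-window coordinate systems are meant to provide.
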
The X server is designed explicitly to make windows inexpensive. Our goal was
to make it reasonable to use windows for such things as individual menu items,
buttons, even individual items in forms and spreadsheets. As such, the server
must deal efficiently with hundreds (though not necessarily thousands) of
windows on the screen simultaneously. Experience with X has shown that many
implementors find this capability extremely useful.

@section(Color)

The screen is viewed as two dimensional, with an N-bit @i(pixel) value stored
at each coordinate. The number of bits in a pixel value, and how a value
translates into a color, depends on the hardware. X is designed to support two
types of hardware: monochrome and pseudo-color. A monochrome display has one
bit per pixel, and the two values translate into black and white. Pseudo-color
displays typically have between four and twelve bits per pixel; the pixel value
is used as an index into a color map, yielding red, green, and blue
intensities. The color map can be changed dynamically, so that a given pixel
value can represent different colors over time. Gray-scale is viewed as a
degenerate case of pseudo-color.

We desire a design matching most display hardware, while abstracting
differences in such a way that programmers do not have to double or triple-code
their applications to cover the spectrum. We also want multiple applications
to coexist within a single color map, so that applications always show true
color on the screen. To allow this, and to keep applications device
independent, pixel values should not be coded explicitly into applications.
Instead, the server must be responsible for managing the color map, and color
map allocation must be expressed in hardware-independent terms.

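The pseudo-color model described above, in which a pixel value is only an index
into a color map, can be sketched as follows. The names @i(ColorMap) and
@i(displayed_colors) are hypothetical, and the 0 to 255 intensity range is an
arbitrary choice made for the illustration rather than anything defined by X.
The point of the sketch is that storing a new entry in the map changes the
color shown for every pixel holding that value, without touching the stored
pixel values themselves.

```python
class ColorMap:
    """Hypothetical pseudo-color map: pixel value -> (red, green, blue)."""

    def __init__(self, bits_per_pixel):
        # One entry per possible pixel value; all entries start out black.
        self.entries = [(0, 0, 0)] * (1 << bits_per_pixel)

    def store(self, pixel, rgb):
        """Change the color that a pixel value represents."""
        self.entries[pixel] = rgb

    def lookup(self, pixel):
        """Return the intensities currently assigned to a pixel value."""
        return self.entries[pixel]


def displayed_colors(frame, cmap):
    """Translate a 2-D array of pixel values into the colors shown on screen."""
    return [[cmap.lookup(pixel) for pixel in row] for row in frame]


# Usage: the frame holds pixel values only; the map decides what they look like.
frame = [[3, 0],
         [0, 3]]
cmap = ColorMap(bits_per_pixel=4)
cmap.store(3, (255, 0, 0))            # pixel value 3 now displays as red
before = displayed_colors(frame, cmap)
cmap.store(3, (0, 0, 255))            # same pixel values, now displayed as blue
after = displayed_colors(frame, cmap)
```
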
All graphics operations in X are expressed in terms of pixel values. For
example, to draw a line, one specifies not only the coordinates of the
end-points but the pixel value with which to draw the line. (Logic functions
and plane-select masks are also specified, as described in Section 6.) On a
monochrome display, the only two pixel values are zero and one, which are
(somewhat arbitrarily) defined to be black and white, respectively. On a
pseudo-color display, pixel values zero and one are pre-allocated by the
server, for use as "black" and "white", so that monochrome applications display
correctly on color displays. Of course, the actual colors need not be black
and white, but can be set by the user.

There are two ways for a client to obtain pixel values. In the simplest
request, the client specifies red, green, and blue color values, and the server
allocates an arbitrary pixel value and sets the color map so the pixel value
represents the closest color the hardware can provide. The color map entry for
this pixel value cannot be changed by the client, so if some other client
requests an equivalent color, the server is free to respond with the same pixel
value. Such sharing is important in maximizing use of the color map. To
isolate applications from variations in color representation among displays
(due, for example, to the standard of illumination used for calibration), the
server provides a color database which clients can use to translate string
names of colors into red, green, and blue values tailored for the particular
display.

The second request allocates writable map entries. This mechanism was designed
explicitly for X; we are not aware of a comparable mechanism in any other
window system. The client specifies two numbers, @i(C) and @i(P), with @i(C)
positive and @i(P) non-negative; the request can be expressed as "allocate
@i(C) colors and @i(P) planes". The total number of pixel values allocated by
the server is @i(C*2@+(P)). The values passed back to the client consist of
@i(C) base pixel values, and a plane mask containing @i(P) bits. None of the
base pixel values have any one bits in common with the plane mask, and the
complete set of allocated pixel values is obtained by combining all possible
combinations of one bits from the plane mask with each of the base pixel
values. The client can optionally require the @i(P) planes to be contiguous,
in which case all @i(P) bits in the plane mask will be contiguous.

There are three common uses of this second request. One is simply to allocate
a number of "unrelated" pixel values; in this case, @i(P) will be zero. A
second use is in imaging applications, where it is convenient to be able to
perform simple arithmetic on pixel values. In this case, a contiguous block of
pixel values is allocated by setting @i(C) to one and @i(P) to the log (base 2)
of the number of pixel values required, and requesting contiguous allocation.
Arithmetic on the pixel values then requires at most some additional shift and
mask operations.

A third form of allocation arises in applications that want some form of
overlay graphics, such as highlighting or outlining regions. Here the
requirement is to be able to draw and then erase graphics without disturbing
existing window contents. For example, suppose an application typically uses
four colors, but needs to be able to overlay a rectangle outline in a fifth
color. An allocation request with @i(C) set to four and @i(P) set to one results in
two groups of four pixel values. The four base pixel values are assigned the
four normal colors, and the four alternate pixel values are all assigned the
fifth color. Overlay graphics can then be drawn by restricting output (see the
next section) to the single bit plane specified in the mask returned by the
color allocation. Turning bits in this plane on (to ones) changes the image to
the fifth color, and turning them off reverts the image to its original color.

@section(Graphics and Text)

Graphics operations are often the most complex part of any window system,
simply because so many different effects and variations are required to satisfy
a wide range of applications. In this section we sketch the operations
provided in X, so that the basic level of graphics support can be understood.
The operations are essentially a subset of the Digital Workstation Graphics
Architecture; the VS100 display@cite(vs100) implements this architecture for
1-bit pixel values. The set of operations purposely was kept simple, in order
to maximize portability.

Graphics operations in X are expressed in terms of relatively high-level
concepts, such as lines, rectangles, curves, and fonts. This is in contrast to
systems in which the basic primitives are to read and write individual pixels.
Basing applications on pixel-level primitives works well when display memory
can be mapped into the application's address space for direct manipulation.
However, both display hardware and operating systems exist for which such
direct access is not possible, and emulating pixel-level manipulations in such
an environment results in extremely poor performance. Expressing operations at
a higher level avoids such device dependencies, and also avoids potential
problems with network bandwidth. With high-level operations, a protocol
request transmitted as a small number of bits over the network typically
affects ten to one hundred times as many pixels on the screen.

@subsection(Images)

Two forms of off-screen images are supported in X: bitmaps and pixmaps. A
bitmap is a single plane (bit) rectangle. A pixmap is an N-plane (pixel)
rectangle, where @i(N) is the number of bits per pixel used by the particular
display. A bitmap or pixmap can be created by transmitting all of the bits to
the server; a pixmap can also be created by copying a rectangular region of a
window. Bitmaps and pixmaps of arbitrary size can be created. Transmitting
very large (or deep) images over a network connection can be quite slow;
however, the ability to make use of shared memory in conjunction with the IPC
mechanism would help enormously when the client and server are on the same
machine.

The primary use of bitmaps is as masks (clipping regions). Several graphics
requests allow a bitmap to be used as a clipping region@cite(warnock). Bitmaps
are also used to construct cursors, as described in Section 8. Pixmaps are
used to store frequently drawn images, and as temporary backing-store for
pop-up menus (as described in Section 8). However, the principal use of
pixmaps is as tiles, that is, as patterns which are replicated in two
dimensions to cover a region. Since there are often hardware restrictions as
to what tile shapes can be replicated efficiently, guaranteed shapes are not
defined by the X protocol. An application can query the server to determine
what shapes are supported, although to date most applications simply assume 16
by 16 tiles are supported. A better semantics is to support arbitrary shapes,
but allow applications to query as to which shapes are most efficient.

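Replicating a tile in two dimensions amounts to indexing the tile with
coordinates taken modulo its width and height. The sketch below illustrates
only that idea; the function names and the @i(origin) parameter (the point at
which one copy of the tile has its upper left corner) are hypothetical and are
not part of the X protocol, and a tile is represented simply as a list of rows
of pixel values.

```python
def tile_pixel(tile, x, y, origin=(0, 0)):
    """Pixel value that the replicated tile supplies at coordinate (x, y)."""
    height, width = len(tile), len(tile[0])
    ox, oy = origin
    # Python's % always yields a non-negative result for a positive modulus,
    # so coordinates to the left of or above the origin also work.
    return tile[(y - oy) % height][(x - ox) % width]


def fill_with_tile(tile, x, y, width, height, origin=(0, 0)):
    """Rows of the width-by-height region at (x, y), covered by the tile."""
    return [[tile_pixel(tile, cx, cy, origin) for cx in range(x, x + width)]
            for cy in range(y, y + height)]


# Usage: a 2 by 2 checkerboard tile covering a region 4 wide and 2 high.
checker = [[1, 0],
           [0, 1]]
region = fill_with_tile(checker, 0, 0, 4, 2)
```

Because the pattern depends only on the coordinates relative to the origin,
filling two adjacent regions separately produces the same result as filling
their union in one step.
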
The tiling origin used in X is almost always the origin of the destination
window. That is, if enough tiles were laid out, one tile would have its upper
left corner at the upper left corner of the window. In this way, the contents
of the window are independent of the window's position on the screen, and the
window can be moved transparently to the application.

Servers vary widely in the amount of off-screen memory provided. For example,
some servers limit off-screen memory to that accessible directly to the
graphics processor (typically one to three times the size of screen memory),
and fonts and other resources are allocated from this same pool. Other servers
utilize their entire virtual address space for off-screen memory. Since
off-screen memory for images is finite, an explicit part of the X protocol is
the possibility that bitmap or pixmap creation can fail. Depending on the
intended use of the image, the application may or may not be able to cope with
the failure. For example, if the image was being stored simply to speed up
redisplay, the application can always transmit the image directly each time
(see below). If the image was to be a temporary backing-store for a window,
the application can fall back on normal exposure processing (as described in
Section 7). Servers should be constructed in such a way as to virtually
guarantee sufficient memory (e.g., by caching images) for creating at least
small tiles and cursors, although this is not true in current implementations.

@subsection(Graphics)

All graphics and text requests include a logic function and a plane-select mask
(an integer with the same number of bits as a pixel value) to modify the
operation. All sixteen logic functions are provided. Given a source and
destination pixel, the function is computed bitwise on corresponding bits of
the pixels, but only on bits specified in the plane-select mask. Thus the
result pixel is computed as
@begin(format, leftmargin +5)
((source FUNC destination) AND mask) OR (destination AND (NOT mask))
@end(format)
The most common operation is simply replacing the destination with the source in
all planes.

The simplest graphics request takes a single source pixel value and combines it
with every pixel in a rectangular region of a window. Typically this is used
to fill a region with a color, but by varying the logic function or masks,
other effects can be achieved. A second request takes a tile, effectively
constructs a tiled rectangular source with it, and then combines the source
with a rectangular region of a window.

An arbitrary image can be displayed directly, without first being stored
off-screen. For monochrome images, the full contents of a bitmap are
transmitted, along with a pair of pixel values; the image is displayed in a
region of a window with those two colors. For color images, the full contents
of a pixmap can be transmitted and displayed. In order to avoid inordinate
buffer space in the server, very large images must be broken into sections on
the client side and displayed in separate requests.

The CopyArea request allows one region of a window to be moved to (or combined
with) another region of the same window. This is the usual @i(bitblt), or "bit
block transfer" operation. The source and destination are given as rectangular
regions of the window; the two regions have the same dimensions. The operation
is such that overlap of the source and destination does not affect the result.

X provides a complex primitive for line drawing. It provides for arbitrary
combinations of straight and curved segments, defining both open and closed
shapes. Lines can be @i(solid), by drawing with a single source pixel value,
@i(dashed), by alternately drawing with a single source pixel value and not
drawing, and @i(patterned), by alternately drawing with two source pixel
values. Lines are drawn with a rectangular brush. Clients can query the
server to determine what brush shapes are supported; a better semantics would
be to support arbitrary shapes, but allow applications to query as to which
shapes are most efficient.

A final request allows an arbitrary closed shape (such as could be specified in
the line drawing request) to be filled with either a single source pixel value
or a tile. For self-intersecting shapes, the even-odd rule is used: a point is
inside the shape if an infinite ray with the point as origin crosses the path
an odd number of times.

@subsection(Text)

For high-performance text, X provides direct support for bitmap fonts. A font
consists of up to 256 bitmaps; each bitmap in a font has the same height but
can vary in width. To allow server-specific font representations, clients
"create" fonts by specifying a name rather than by downloading bitmap images
into the server. An application can use an arbitrary number of fonts, but (as
with all resources) font allocation can fail for lack of memory. A reasonably
implemented server should support an essentially unbounded number of fonts
(e.g., by caching), but some existing server implementations are deficient in
this respect. Unlike Andrew@cite(wm), no heuristics are applied by the server
when resolving a name to a font; specific communities or applications may
demand a variety of heuristics, and as such they belong outside the base window
system.
Also unlike Andrew, the X server is not free to dynamically substitute
one font for another; we do not believe such behavior is necessary or
appropriate.

A string of text can be displayed using a font either as a mask or as a source.
Using a font as a mask, the foreground (the one bits in the bitmap) of each
character is drawn with a single source pixel value. Using a font as a source,
the entire image of each character is drawn, using a pair of pixel values.
Source font output is provided specifically for applications using fixed-width
fonts in emulating traditional terminals.

To support "cut and paste" operations between applications, the server provides
a number of buffers into which a client can read and write an arbitrary string
of bytes. (This mechanism was adopted from Andrew.) Although these buffers
are used principally for text strings, the server imposes no interpretation on
the data, so cooperating applications can use the buffers to exchange such
things as resource identifiers and images.

@section(Exposures)

Given that output to obscured windows is possible, the issue of @i(exposure)
must be addressed. When all (or a piece) of an obscured window again becomes
visible (for example, as the result of the window being raised), is the client
or the server responsible for restoring the contents of the window? In X, it
is the responsibility of the client. When a region of a window becomes
exposed, the server sends an asynchronous event to the client, specifying the
window and the region that has been exposed; the rest is up to the application.
A trivial application might simply redraw the entire window; a more
sophisticated application would only redraw the exposed region.

Why is the client responsible? Because X imposes no structure on, or
relationships between, graphics operations from a client, there are only two
basic mechanisms by which the server might restore window contents: by
maintaining display lists, and by maintaining off-screen images. In the first
approach, the server essentially retains a list of all output requests
performed on the window. When a region of the window becomes exposed, the
server either re-executes all requests to the entire window, or only
re-executes requests that affect the region while clipping the output to that
region. In the alternative approach, when a window becomes obscured the server
saves the obscured region (or perhaps the entire window) in off-screen memory.
All subsequent output requests are executed not only to the visible regions of
the window, but to the off-screen image as well. When an obscured region
becomes visible again, the off-screen copy is simply restored.

We believe neither server-based approach is acceptable. With display lists,
the server is unlikely to have any reasonable notion of when later output
requests nullify earlier ones. Either the display list becomes unmanageably
long, and a refresh that should appear nearly instantaneous instead appears as
a slow-motion replay, or the server spends a significant length of time pruning
the display list, and normal-case performance is considerably reduced. One
problem with the off-screen image approach is (virtual) memory consumption: on
a 1024 by 1024 8-plane display, just one full-screen image requires one
megabyte of storage, and multiple overlapping windows could easily require many
times that amount. Another problem is that the cost of the implementation can
be prohibitive. Consider, for example, the QDSS display@cite(qdss), which has
a graphics co-processor. In the QDSS, display memory is inaccessible to the
host processor. In addition, the co-processor cannot perform operations in
host memory, and has relatively little off-screen memory of its own. The only
viable way to maintain off-screen images for displays like the QDSS may be to
emulate the co-processor in software. It can easily take tens of thousands of
lines of code to emulate a co-processor, and such emulation may execute orders
of magnitude slower than the co-processor.

Our belief is that many applications can take advantage of their own
information structures to facilitate rapid redisplay, without the expense of
maintaining a distinct display structure or backing-store in the client or the
server, and often with even better performance. (Sapphire@cite(sapphire)
permits client refresh for this reason.) For example, a text editor can
redisplay directly from the source, and a VLSI editor can redisplay directly
from the layout and component definitions. Many applications will be built on
top of high-level graphics libraries that automatically maintain the data
structures necessary to implement rapid redisplay. For example, the structured
display file mechanism in VGTS could be supported in a client library. Of
course, pushing the responsibility back on the application may not simplify
matters, particularly when retrofitting old systems to a new environment. For
example, the current GKS design does not provide adequate hooks for automatic,
system-generated refresh of application windows, nor does it provide an
adequate mechanism for forcing refresh back on the application.

Relying on client-controlled refresh also derives from window management
philosophy. Our belief is that applications cannot be written with fixed
top-level window sizes built in. Rather, they must function correctly with
almost any size, and continue to function correctly as windows are dynamically
resized. This is necessary if applications are to be usable on a variety of
displays under a variety of window management policies. (Of course, an
application may need a minimum size to function reasonably, and may prefer the
width or height to be a multiple of some number; X allows the client to attach
a resize hint to each window to inform window managers of this.) Our belief is
that most applications, for one reason or another, will already have code for
performing a complete redisplay of the window, and that it is usually
straightforward to modify this code to deal with partial exposures. Similar
arguments were used in the design of both Andrew and Mex, and experience has
confirmed their decision@cite(wm,mex).

This is not to argue that the server should never maintain window contents,
only that it should not be @i(required) to maintain contents. For complex
imaging and graphics applications, efficient maintenance by the server may be
critical for acceptable performance of window management functions. There is
nothing inherent in the X protocol that precludes the server from maintaining
window contents and not generating exposure events. In the next version of X,
windows will have several attributes to advise the server as to when and how
contents should be maintained.

In X, clients are never informed of what regions are obscured, only of what
regions have become visible. Thus, clients have insufficient information to
try to optimize output by only drawing to visible regions. However, we feel
this is justified on two grounds. First, realistically, users seldom stack
windows such that the active ones are obscured, so there is little point in
complicating applications to optimize this case.
More importantly, allowing
applications to restrict output to only visible regions would conflict with the
desire to have the server maintain obscured regions automatically when
possible.

Client refresh introduces an interesting complication with the CopyArea request
(described in Section 6). If part of the source region of the
CopyArea is obscured, then not all of the destination region can be updated
properly, and the client must be notified (with an exposure event) so that it
can correct the problem. Since output requests are asynchronous, care must be
taken by the application to handle exposure events when using CopyArea. In
particular, if a region is exposed and an event sent by the server, a
subsequent CopyArea may move all or part of the region before the event is
actually received by the application. Several simple algorithms have been
designed to deal with this situation, but we will not present them here.

Client refresh raises a visual problem in a network environment. When a region
of a window becomes exposed, what contents should the server initially place in
that window? In a local, tightly-coupled environment, it might be perfectly
reasonable to leave the contents unaltered, because the client can almost
instantaneously begin to refresh the region. In a network environment, however
(and even in a local system where processes can get "swapped out" and take
considerable time to swap back in), inevitable delays can lead to visually
confusing results. For example, the user may move a window, and see two images
of the window on the screen for a significant length of time, or resize a
window and see no immediate change in the appearance of the screen.

To avoid such anomalies in X, clients must define a @i(background) for every
window.
The background can be a single color, or it can be a tiling pattern.
Whenever a region of a window is exposed, the server immediately paints the
region with the background. Users therefore see window shapes immediately,
even if the "contents" are slow to arrive. Of course, many application windows
have some notion of a background anyway, so having the server initialize with a
background seldom results in extraneous redisplay. In fact, many non-leaf
windows typically contain nothing but a background, and having the server paint
that background frees the applications from performing any redisplay at all to
those windows.

Although we believe client-generated refresh is acceptable most of the time, it
does not always perform well with momentary pop-up menus, where speed is at a
premium. To avoid potentially expensive refresh when a menu is removed from
the screen, a client can explicitly copy the region to be covered by the menu
into off-screen memory (within the server) before mapping the menu window. A
special unmap request is used to remove the menu: it unmaps the window without
affecting the contents of the screen or generating exposure events. The
original contents are then copied back onto the screen. In addition, the
client usually @i(grabs) the server for the entire sequence, using a request
which freezes all other clients until a corresponding ungrab request is issued
(or the grabbing client terminates). Without this, concurrent output from
other clients to regions obscured by the menu would be lost. Although freezing
other clients is in general a poor idea, it seems acceptable for momentary
menus.

@section(Input)

We now turn to a discussion of input events, but first we briefly describe the
support for mouse cursors.
Clients can define arbitrary shapes for use as
mouse cursors. A cursor is defined by a source bitmap, a pair of pixel values
with which to display the bitmap, a mask bitmap which defines the precise shape
of the image, and a coordinate within the source bitmap which defines the
"center" or "hot spot" of the cursor. Cursors of arbitrary size can be
constructed, although only a portion of the cursor may be displayed on some
hardware. Clients can query the server to determine what cursor sizes are
supported, but existing applications typically just assume a 16 by 16 image can
always be displayed. Cursors also can be constructed from character images in
fonts; this provides a simple form of named indirection, allowing custom
tailoring to each display without having to modify the applications.

A window is said to @i(contain) the mouse if the hot spot of the cursor is
within a visible portion of the window or one of its subwindows. The mouse is
said to be @i(in) a window if the window contains the mouse but no subwindow
contains the mouse. Every window can have a mouse cursor defined for it. The
server automatically displays the cursor of whatever window the mouse is
currently in; if the window has no cursor defined, the server displays the
cursor of the closest ancestor with a cursor defined.

Input is associated with windows. Input to a given window is controlled by a
single client, which need not be the client that created the window. Events
are classified into various types, and the controlling client selects which
types are of interest to it. Only events matching in type with this selection
are sent to the client. When an input event is generated for a window and the
controlling client has not selected that type, the server @i(propagates) the
event to the closest ancestor window for which some client has selected the
type, and sends the event to that client instead. Every event includes the
window that had the event type selected; this window is called the @i(event
window). If the event has been propagated, the event also includes the next
window down in the hierarchy between the event window and the original window
on which the event was generated.

@subsection(The Keyboard)

For the keyboard, a client can selectively receive events on the press or
release of a key. Keyboard events are not reported in terms of ASCII character
codes; instead, each key is assigned a unique code, and client software must
translate these codes into the appropriate characters. The mapping from
keycaps to keycodes is intended to be "universal" and predefined; a given
keycap has the same keycode on all keyboards. Applications generally have been
written to read a "keymap file" from the user's home directory, so that users
can remap the keyboard as they see fit.

The use of coded keys is secondary to the ability to detect both up and down
transitions on the keyboard. For example, a common trick in window systems is
for mouse button operations to be affected by keyboard @i(modifiers) such as
the Shift, Control, and Meta keys. A useful feature of the Genera@cite(genera)
system is the use of a "mouse documentation line", which changes dynamically as
modifiers are pressed and released, indicating the function of the mouse
buttons. A base window system must provide this capability. Transitions are
not only useful on modifiers; various applications for systems other than X
have been designed to use "chords" (groups of keys pressed simultaneously), and
again the window system should support them.

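To make the role of press and release transitions concrete, the following
minimal sketch shows how a client might translate coded keys through its own
keymap and track modifiers and chords. It is an illustration only: the key
codes, the keymap table, and the KeyboardState class are hypothetical names
invented here, not actual X keycode assignments or part of any X library.

```python
# Illustrative sketch only: the codes below are hypothetical,
# not the actual X keycode assignments.
SHIFT, CONTROL, META = 0x01, 0x02, 0x03   # hypothetical modifier keycodes
KEY_A, KEY_S = 0x10, 0x11                 # hypothetical letter keycodes

MODIFIER_NAMES = {SHIFT: "Shift", CONTROL: "Control", META: "Meta"}

# Client-side keymap: (keycode, Shift held?) -> character.
KEYMAP = {
    (KEY_A, False): "a", (KEY_A, True): "A",
    (KEY_S, False): "s", (KEY_S, True): "S",
}


class KeyboardState:
    """Tracks which keys are down, using both press and release transitions."""

    def __init__(self):
        self.down = set()

    def handle(self, kind, keycode):
        """Apply one transition; return a character for a printable press, else None."""
        if kind == "release":
            self.down.discard(keycode)
            return None
        self.down.add(keycode)
        if keycode in MODIFIER_NAMES:
            return None
        return KEYMAP.get((keycode, SHIFT in self.down))

    def held_modifiers(self):
        """Names of the modifiers currently down (e.g. to update a documentation line)."""
        return sorted(MODIFIER_NAMES[k] for k in self.down if k in MODIFIER_NAMES)

    def chord_down(self, keys):
        """True if every key of the chord is currently pressed."""
        return set(keys) <= self.down
```

Without release events, neither held_modifiers nor chord_down could be kept
accurate, which is the point made above.
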
The keyboard is always @i(attached) to some window (typically the root window
or a top-level window); we call this window the @i(focus) window. A request
can be used (usually by the input manager) to attach the keyboard to any
window. The window that receives keyboard input depends on both the mouse
position and the focus window. If the mouse is in some descendant of the focus
window, that descendant receives the input. If the mouse is not in a
descendant of the focus window, then the focus window receives the input, even
if the mouse is outside the focus window. For applications that wish to have
the mouse state modify the effect of keyboard input, a keyboard event contains
the mouse coordinates, both relative to the event window and global to the
screen, as well as the state of the mouse buttons.

To provide a reasonable user interface, keyboard events also contain the state
of the most common modifier keys: Shift, ShiftLock, Control, and Meta.
Without this information, anomalous behavior can result. If the user switches
windows while modifier keys are down, the new client must somehow determine
which modifiers are down. Placing the modifier state in the keyboard events
solves such problems, and also has another benefit: most clients do not have
to maintain their own shadow of the modifier state, and so often can completely
ignore key release events. However, there is a conflict between this
server-maintained state and client-maintained keyboard mappings. In
particular, clients cannot use non-standard keys as modifiers, or use chords
without the possibility of anomalies such as those described above. We believe the
correct solution (not yet supported in X) is for the server to maintain a bit
mask reflecting the full state of the keyboard, and to allow clients to read
this mask. An application using chords or non-standard modifiers would request
the server to send this mask automatically whenever the mouse entered the
application's window.

@subsection(The Mouse)

The X protocol is (somewhat arbitrarily) designed for mice with up to three
buttons. An application can selectively receive events on the press or release
of each button. Each event contains the current mouse coordinates (both local
to the window and global to the screen), the current state of all buttons and
modifier keys, and a timestamp which can be used, for example, to decide when a
succession of clicks constitutes a double or triple click. An application can
also choose to receive mouse motion events, either whenever the mouse is in the
window, or only when particular buttons have also been pressed. The
application cannot control the granularity of the reporting, nor is any minimum
granularity guaranteed. In fact, typical server implementations make an effort
to compact motion events, to minimize system overhead and wired memory in
device drivers. As such, X may not serve adequately for fine-grained tracking,
such as in fast moving free-hand drawing applications.

Even with motion compaction, servers can generate considerable numbers of
motion events. If an application attempts to respond in real time to every
event, it can easily get far behind relative to the actual position of the
mouse. Instead, many applications simply treat motion events as hints. When a
motion event is received, the event is simply discarded, and the client then
explicitly queries the server for the current mouse position. In waiting for
the reply, more motion events may be received; these are also discarded. The
client then reacts based on the queried mouse position.
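
The hint technique just described can be sketched as follows. This is a
minimal simulation under stated assumptions: FakeServer, move_mouse,
query_pointer, and react_to_motion are hypothetical names standing in for a
server connection and a client routine; they are not part of any X
programming interface.

```python
from collections import deque


class FakeServer:
    """Hypothetical stand-in for a server connection (not a real X binding)."""

    def __init__(self):
        self.position = (0, 0)
        self.events = deque()

    def move_mouse(self, x, y):
        # The server records the new position and queues a motion event.
        self.position = (x, y)
        self.events.append(("motion", x, y))

    def query_pointer(self):
        # Models the explicit round-trip query for the current position.
        return self.position


def discard_motion(server):
    """Drop queued motion events; return how many were dropped."""
    dropped = 0
    while server.events and server.events[0][0] == "motion":
        server.events.popleft()
        dropped += 1
    return dropped


def react_to_motion(server, react):
    """Treat pending motion events as a hint, then act once on the queried position."""
    if discard_motion(server) == 0:
        return False                  # no hint: nothing moved, so do no work
    position = server.query_pointer()
    discard_motion(server)            # events that arrived while awaiting the reply
    react(position)
    return True
```

In this sketch, five queued motion events lead to a single call of react with
the final position, and no work is done when the event queue is empty.
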
The advantage of this
scheme over continuously polling the mouse position is that no CPU time is
consumed while the mouse is stationary.

Clients can also receive an event each time the mouse enters or leaves a
window. This can be particularly useful in implementing menus. For example,
each menu item can be placed in a separate subwindow of the overall menu
window. When the mouse enters a subwindow, the item is highlighted in some
fashion (e.g., by inverting the video sense), and when the mouse leaves the
window the item is restored to normal. Implementing a menu in this manner
requires considerably less CPU overhead than continuous polling of the mouse,
and also less overhead than using motion events, since most motion events would
be within windows and thus uninteresting.

Due to the nature of overlapping windows, and because continuous tracking by
the server is not guaranteed, the mouse may appear to move instantaneously
between any pair of windows on the screen. Certainly the window the mouse was
in should be notified of the mouse leaving, and the window the mouse is now in
should be notified of the mouse entering. However, all of the windows "in
between" in the hierarchy may also be interested in the transition. This is
useful in simplifying the structure of some applications, and is necessary in
implementing certain kinds of window managers and input managers. Thus, when
the mouse moves from window A to window B, with window W as their closest
(least) common ancestor, all ancestors of A below W also receive leave events,
and all ancestors of B below W receive enter events.

Except for mouse motion events, it might be argued that events are infrequent
enough that the server should always send all events to the client, and
eliminate the complexity of selecting events. However, some applications are
written with interrupt-driven input; events are received asynchronously, and
cause the current computation to be suspended so that the input can be
processed. For example, a text editor might use interrupt-driven input, with
the normal computation being redisplay of the window. The receipt of
extraneous input events (for example, key release events) can cause noticeable
"hiccups" in such redisplay.

@section(Input and Window Management)

There are two basic modes of keyboard management: @i(real-estate) and
@i(listener). In real-estate mode, the keyboard "follows" the mouse; keyboard
input is directed to whatever window the mouse is in. In listener mode,
keyboard input is directed to a specific window, independent of the mouse
position. Some systems provide only real-estate mode@cite(apollo,sunwin), some
only listener mode@cite(lucasfilm,sapphire,pnx,mex,mg1,genera), and
Andrew@cite(wm) provides both, although the mode cannot be changed during a
session. Both modes are supported in X, and the mode can be changed
dynamically. Real-estate mode is the default behavior, with the root window as
the focus window, as described in the previous section. An input manager can
also make some other (typically top-level) window the focus window, yielding
listener mode. Note, however, that in listener mode in X, the client
controlling the focus window can still get real-estate behavior for subwindows,
if desired; this capability has proven useful in several applications.

The primary function of a window manager is reconfiguration: restacking,
resizing, and repositioning top-level windows. The configuration of nested
windows is assumed to be application-specific, and under control of the
applications. There are two broad categories of window managers: manual and
automatic. A manual window manager is "passive", and simply provides an
interface to allow the user to manipulate the desktop; windows can be resized
and reorganized at will. The initial size and position of a window typically
(but not always) is under user or application control. Automatic window
managers are "active", and operate for the most part without human interaction;
size and position at window creation, and reconfiguration at window
destruction, are chosen by the system. Automatic managers typically tile the
screen with windows, such that no two windows overlap, automatically adjusting
the layout as windows are created and destroyed. Andrew@cite(wm),
Star@cite(star2), and Cedar@cite(cedar) provide automatic management, plus
limited manual reconfiguration capability.

Existing window managers for X are manual. Automatic management that is
transparent to applications cannot be accomplished reasonably in X; future
support for automatic management is discussed in Section 10. In the current X
design, clients are responsible for initially sizing and placing their
top-level windows, not window managers. In this way, applications continue to
work when no window manager is present. Typically, the user either specifies
geometry information in the application command line, or uses the mouse to
sweep out a rectangle on the screen. (For the latter, the application grabs
the mouse, as described below.)

@subsection(Mouse-Driven Management)

Existing managers are primarily mouse-driven, and are based on the ability to
"steal" events. Specifically, a manager (or any other client) can @i(grab) a
mouse button in combination with a set of modifier keys, with the following
effect. Whenever the modifier keys are down and the button is pressed, the
event is reported to the grabbing client, regardless of what window the mouse
is in. All mouse-related events continue to be sent to that client until the
button is released. As part of the grab, the client also specifies a mouse
cursor to be used for the duration of the grab, and a window to be used as the
event window. A manager specifies the root window as the event window when
grabbing buttons; with the event propagation semantics described in Section 8,
the grabbed events contain not only the global mouse coordinates, but also the
top-level application window (if any) containing the mouse. This is sufficient
information to manipulate top-level windows.

Using this button-grab mechanism, several different management interfaces have
been built, including a "programmable" interface@cite(uwm) allowing the user to
assign individual commands or user-defined menus of commands to any number of
button/modifier combinations. For example, a button click (press and release
without intervening motion) might be interpreted as a command to raise or lower
a window, or to attach the keyboard; a press/motion/release sequence might be
interpreted as a command to move a window to a new position; or a button press
might cause a menu to pop up, with the selection indicated by the mouse
position at the release of the button. By allowing both specific commands and
menus to be bound to buttons, a range of interfaces can be constructed to
satisfy both "expert" and "novice" users.

Another form of manager simply displays a static menu bar along the top of the
screen, with items for such operations as moving a window and attaching the
keyboard. The menu is used in combination with a mouse-grab primitive, with
which a client can unilaterally grab the mouse and then later explicitly
release it; during such a mouse-grab, events are redirected to the grabbing
client, just as for button-grabs. When the user clicks on a menu bar item with
any button, the manager unilaterally grabs the mouse. The user then uses the
mouse to execute the specific command. For example, having clicked on the
"move" item, the user indicates the window to move by placing the mouse in the
window and pressing a button, then indicates the new position by moving the
mouse and releasing the button. The manager then releases the mouse.

@subsection(Icons)

One important "resizing" operation performed by a window manager is
transforming a window into a small icon and back again. In X, icons are merely
windows. Transforming a window into an icon simply involves unmapping the
window and mapping its associated icon. The association between a window and
its icon is maintained in the server, rather than the window manager, and
either the application or the manager can provide the icon. In this way, the
manager can provide a default icon form for most clients, but clients can
provide their own if desired, possibly with dynamic rather than static
contents. The client is still insulated from management policy, even if it
provides the icon: the manager is responsible for positioning, mapping, and
unmapping the icon, and the client is responsible only for displaying the
contents.

The icon state is maintained in the server not only to allow clients to provide
icons, but to avoid the loss of state if the window manager should terminate
abnormally. When a window manager terminates, any windows it has created are
destroyed, including icon windows. With knowledge of icons, the server can
detect when an icon is destroyed, and automatically remap the associated client
window. Without this, abnormal termination of the window manager would result
in "lost" windows.

@subsection(Race Conditions)

There are many race conditions that must be dealt with in input and window
management, due to the asynchronous nature of event handling. For example, if
a manager attempts to grab the mouse in response to a press of a button, the
mouse-grab request might not reach the server until after the button is
released, and intervening mouse events would be missed. Or, if the user clicks
on a window to attach the keyboard there, and then immediately begins typing,
the first few keystrokes might occur before the manager actually responds to
the click and the server actually moves the keyboard focus. A final example is
a simple interface in which clicking on a window lowers it. Given a stack of
three windows, the user might rapidly click twice in the same spot, expecting
the top two windows to be lowered. Unless the first click is sent to the
manager and the resulting request to lower is processed by the server before
the second click takes place, the event window for the second click will be the
same as for the first click, and the manager will lower the first window twice.

A work-around for the last example, used by existing managers, is to ignore the
event window reported in most events. Instead, the global mouse coordinates
reported in the event are used in a follow-up query request to determine which
top-level window now contains that coordinate. However, not all race
conditions have acceptable solutions within the current X design. For a
general solution, it must be possible for the manager to synchronize operations
explicitly with event processing in the server. For example, a manager might
specify that, at the press of a button, event processing in the server should
cease until an explicit acknowledgment is received from the manager.

@section(Future)

Based on critiques from numerous universities and commercial firms, a fairly
extensive evaluation and redesign of the X protocol has been underway since May
1986. Our desire is to define a "core" protocol that can serve as a standard
for window system construction over the next several years. We expect to
present the rationale for this new design in the very near future, once it has
been validated by at least a preliminary implementation. In this section, we
highlight the major protocol changes.

@subsection(Resource Allocation)

Since the server is responsible for assigning identifiers to resources, each
resource allocation currently requires a round-trip time to perform. For
applications that allocate many resources, this causes a considerable start-up
delay. For example, a multi-pane menu might consist of dozens of windows,
numerous fonts, and several different mouse cursors, leading to a delay of one
second or longer.

In retrospect, this is the most significant defect in the design of X. To get
around these delays, programming interfaces have been augmented to provide
"batch mode" operations. If several resources must be created, but there are
no inter-dependencies among the allocation requests, all of the requests are
sent in a batch, and then all of the replies are received. This effectively
reduces the delay to a single round-trip time.

A better solution to this problem is to make clients generate the identifiers.
When the client establishes a connection to the server, it is given a specific
subrange from which it can allocate. This change will significantly improve
start-up times without affecting applications, as identifiers can be generated
inside low-level libraries without changing programming interfaces.

@subsection(Transparent Windows)

One use of transparent windows is as clipping regions. However, they are
unsatisfactory for this purpose because every coordinate in a graphics request
must be translated by the client from the "real" window's origin to the
transparent window's origin. A better approach to clipping regions is to allow
clients to create clipping regions and attach them to all graphics requests.
As noted in Section 6, X currently allows a clipping region in the form of a
bitmap to be attached to a few graphics requests. Allowing a clipping region,
specified either as a bitmap or a list of rectangles, to be attached to all
graphics requests provides a more uniform mechanism.

The major use of transparent windows to date is actually as inexpensive opaque
windows. In the current server implementation, transparent windows can be
created and transformed significantly faster than opaque windows. Because of
this, transparent windows are often used when opaque windows would otherwise be
adequate. We believe a new implementation of the server will improve the
performance of opaque windows to the point that this will no longer be
necessary.

With explicit clipping regions added for graphics, and the performance
advantages of transparent windows reduced, the only remaining use of
transparent windows is for input (and cursor) control. Various applications
want relatively fine-grained input control, and such control must not affect
graphics output. Close control of cursor images and mouse motion events seems
particularly important. However, the vast majority of the time, control
naturally is associated with normal window boundaries, so it would be unwise to
divorce input control completely from windows. As such, the new protocol
provides "input-only" windows, which act like normal windows for the purposes
of input and cursor control, but which cannot be used as a source or
destination in graphics requests, and which are completely invisible as far as
output is concerned.

@subsection(Color)

X originally was not designed to deal with direct-color displays. Direct-color
displays typically have between 12 and 36 bits per pixel; the pixel value
consists of three subfields, which are used as indexes into three independent
color maps: one for red intensities, one for green, and one for blue. Some
direct-color displays also have a fourth subfield, sometimes referred to as
"z-channel" information, used to control attributes such as blending or chroma
keying. We now understand how to incorporate direct-color displays without
z-channel information into X, in such a way that the differences between
direct-color and pseudo-color color maps need not be apparent to the
application, yet still allowing all of the usual color map tricks to be played.

At present there is only one color map for all applications, and color
applications fail when this map gets full. Although dozens of applications
typically can be run under X within a single 8-bit pseudo-color map, a single
map is clearly unacceptable when dealing with small color maps, or with
multiple applications (e.g., CAD tools) that need large portions of the color
map. The solution is to support multiple virtual color maps, still permitting
applications to coexist within any map, but allowing the possibility that not
all applications show true color simultaneously. This also matches
next-generation displays, which actually support multiple color maps in
hardware@cite(rainbow).

@subsection(Graphics)

Perhaps the biggest mistake in the graphics area was failing to support fonts
with kerning (side bearings). For example, a relatively complete emulation of
the Andrew programming interface was built for X, but Andrew applications
depend heavily on kerned fonts. There are other deficiencies that will be
corrected. For example, large glyph-sets (e.g., Japanese) will be supported,
as well as stippling (using a clip mask constructed by tiling a region with a
bitmap). The notions of line width, join style, and end style found in
PostScript@cite(postscript) are usually preferred to brush shapes for line
drawing, and will be supported.

In an attempt to support a wide range of devices, the exact path followed for
lines and filled shapes was originally left undefined in X (the class of curve
was not even specified). Different devices use slightly different algorithms
to draw straight lines, and it seemed better to have high performance with
minor variation than to have uniformity with poor performance. Relatively few
devices support curve drawing in hardware, but some support it in firmware, and
again performance seemed more important than accuracy. In retrospect, however,
allowing such device dependent behavior was a poor decision. The vast majority
of applications draw lines aligned on an axis, and speed and precision are not
an issue. The applications that do require complex shapes also require
predictable results, so precise specifications are important.

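As a small illustration of how an unspecified line path leads to device
dependent results, the sketch below rasterizes the same line under two
tie-breaking rules. It is not the algorithm of any particular device or of the
X server; the function line_pixels and its ties_up parameter are hypothetical,
and the sketch handles only lines with 0 <= dy <= dx.

```python
def line_pixels(x0, y0, x1, y1, ties_up=True):
    """Rasterize a line with 0 <= dy <= dx and dx > 0, using integer arithmetic.

    For each x step the ideal y is rounded to the nearest pixel row;
    ties_up selects which row wins when the ideal y is exactly halfway.
    """
    dx, dy = x1 - x0, y1 - y0
    if not (dx > 0 and 0 <= dy <= dx):
        raise ValueError("sketch handles only lines with 0 <= dy <= dx and dx > 0")
    # floor((2*dy*i + dx) / (2*dx)) rounds halves up; using dx - 1 rounds them down.
    bias = dx if ties_up else dx - 1
    return [(x0 + i, y0 + (2 * dy * i + bias) // (2 * dx)) for i in range(dx + 1)]


# Two hypothetical "devices" that differ only in tie-breaking.
device_a = line_pixels(0, 0, 2, 1, ties_up=True)
device_b = line_pixels(0, 0, 2, 1, ties_up=False)
print(device_a)   # [(0, 0), (1, 1), (2, 1)]
print(device_b)   # [(0, 0), (1, 0), (2, 1)]
```

The two results share endpoints but differ in the middle pixel, while
axis-aligned lines come out identical under either rule, consistent with the
observation that only applications drawing complex shapes are affected.
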
A notable feature missing in X is the ability to perform graphics operations
off screen. The reasons for this are essentially the same as those presented
when discussing exposures in Section 7. In particular, not all graphics
co-processors can operate on host memory, and emulating such processors can be
expensive. However, application builders have demanded this capability, and
the demand appears to be sufficient leverage to convince server implementors to
provide the capability. Off-screen graphics will be possible in the new
protocol, although the amount of off-screen memory and its performance
characteristics may vary widely. In addition, the protocol is being extended
to allow the manipulation of both images and windows of varying depths. For
example, a server might support depths of 1, 4, 8, 12, and 24 bits. This
allows imaging applications to transmit data more compactly, allows for more
efficient memory utilization in the server, and provides a match with
next-generation display hardware.

A common debate in graphics systems is whether and where to have state. Should
parameters such as logic function, plane mask, source pixel value or tile,
tiling origin, font, line width and style, and clipping region be explicit in
every request or collected into a state object? The current X protocol is
stateless, for the following reasons: both state and stateless programming
interfaces can be built easily on top of the protocol; the currently supported
graphics requests have just few enough parameters that they can be represented
compactly; and the initial set of displays we were interested in (and the
implementations we had in mind for them) would not benefit from the addition of
state. However, we now believe that a state-based protocol is generally
superior, as it handles complex graphics gracefully and allows significantly
faster implementations on some displays.

@subsection(Management)

An obvious interface style presently not supported in X is the ability to use
the keyboard for management commands. To allow this, a key-grab mechanism,
akin to the button-grab mechanism described in Section 9, will be provided. To
allow such styles as using the first button click in a window to attach the
keyboard, both button-grabs and key-grabs have been extended to apply to
specific sub-hierarchies, rather than always to the entire screen. To handle
the kinds of race conditions described in Section 9, a general event
synchronization mechanism has been incorporated into the grab mechanisms.

To support automatic window management, a manager must be able to intercept
certain management requests from clients (such as mapping or moving a window)
before they are executed by the server, and to be notified about others (such
as unmapping a window) after they are executed. In addition, some managers
want to provide uniform title bars and border decorations automatically. To
allow this, it is useful to be able to "splice" hierarchies: to move a window
from one parent to another. To allow input managers and window managers to be
implemented as separate applications, the ability for multiple clients to
select events on the same window is being added. For example, both a window
manager and an input manager might be interested in the unmapping or
destruction of a window.

@subsection(Extensibility)

The information that input and window managers might desire from applications
is quite varied, and it would be a mistake to try to define a fixed set.
Similarly, the information paths between applications (e.g., in support of "cut
and paste") need to be flexible. To this end, we are adding a Lisp-ish
property list@cite(CLtL) mechanism to windows, and the event mechanism is being
augmented to provide a simple form of inter-client communication.

The new X protocol explicitly continues to avoid certain areas, such as 3-D
graphics and anti-aliasing. However, a general mechanism has been designed to
allow extension libraries to be included in a server. The intention is that
all servers implement the "core" protocol, but each server can provide
arbitrary extensions. If an extension becomes widely accepted by the X
community, it can be adopted as part of the core. Each extension library is
assigned a global name, and an application can query the server at run-time to
determine if a particular extension is present. Request opcodes and event
types are allocated dynamically, so that applications need not be modified to
execute in each new environment.

@section(Summary)

The X Window System provides high-performance, high-level, device-independent
graphics. A hierarchy of resizable, overlapping windows allows a wide variety
of application and user interfaces to be built easily. Network-transparent
access to the display provides an important degree of functional separation,
without significantly affecting performance, that is crucial to building
applications for a distributed environment. To a reasonable extent, desktop
management can be custom tailored to individual environments, without modifying
the base system and typically without affecting applications.

To date, the X design and implementation effort has focused on the base window
system, as described in this paper, and on essential applications and
programming interfaces. The design of the network protocol, the design and
implementation of the device-independent layer of the server, and the
implementation of several applications and a prototype window manager, were
carried out by the first author. The design and implementation of the C
programming interface, the implementation of major portions of several
applications, and the coordination of efforts within Project Athena and
Digital, were carried out by the second author. In addition, many other
persons from Project Athena, the Laboratory for Computer Science, and
institutions outside MIT have contributed software.

Necessary applications such as window managers and VT100 and Tektronix 4014
terminal emulators have been created, and numerous existing applications, such
as text editors and VLSI layout systems, have been ported to the X environment.
Although several different menu packages have been implemented, we are only now
beginning to see a rich library of tools (scroll bars, frames, panels, more
menus, etc.) to facilitate the rapid construction of high-quality user
interfaces. Tool building is taking place at many sites, and several
universities are now attempting to unify window systems work with X as a base,
so that such tools can be shared.

The use of X has grown far beyond anything we had imagined. Digital has
incorporated X into a commercial product, and other manufacturers are following
suit. With the appearance of such products, and the release of complete X
sources on the Berkeley 4.3 Unix distribution tapes, it is no longer feasible
to track all X use and development. Existing applications written in C are
known to have been ported to seven machine architectures of more than twelve
manufacturers, and the C server to six machine architectures and more than
sixteen display architectures. In most cases the code is running under Unix,
but other operating systems are also involved. In addition, relatively
complete server implementations exist in two Lisp dialects. Apart from
the portable design of the system, a large part of this success is due to
MIT's decision to distribute X sources without any licensing restrictions, and
the willingness of people in both educational and commercial institutions to
contribute code without restrictions.

@b(Acknowledgments)

Our thanks go to the many people who have contributed to the success of X.
Particular thanks go to those who have made significant contributions to the
non-proprietary implementation: Paul Asente (Stanford University), Scott Bates
(Brown University), Mike Braca (Brown), Dave Bundy (Brown), Dave Carver
(Digital), Tony Della Fera (Digital), Mike Gancarz (Digital), James Gosling
(Sun Microsystems), Doug Mink (Smithsonian Astrophysical Observatory), Bob
McNamara (Digital), Ron Newman (MIT), Ram Rao (Digital), Dave Rosenthal (Sun),
Dan Stone (Brown), Stephen Sutphen (University of Alberta), and Mark
Vandevoorde (MIT).

Special thanks go to Digital Equipment Corporation. A redesign of the protocol
and a reimplementation of the server to deal with color and to increase
performance were made possible with funding (in the form of hardware) from
Digital. To their credit, all of the resulting device-independent code
remained the property of MIT.