Annotation of 43BSDTahoe/new/X/doc/Paper/x.mss, revision 1.1

1.1     ! root        1: @device(postscript)
        !             2: @make(article)
        !             3: @style(references=cacm)
        !             4: @set(page=+1)
        !             5: 
        !             6: @majorheading(The X Window System)
        !             7: @center(Robert W. Scheifler@footnote( 545 Technology Square, Cambridge, MA 02139.)
        !             8: MIT Laboratory for Computer Science
        !             9: 
        !            10: Jim Gettys@footnote( Project Athena, MIT, Cambridge, MA 02139.)
        !            11: Digital Equipment Corporation
        !            12: MIT Project Athena
        !            13: 
        !            14: July 1986
        !            15: Revised October 1986@footnote( To appear in Transactions on Graphics #63, 
        !            16: Special Issue on User Interface Software, Copyright 1986, 
        !            17: Association for Computing Machinery. Permission to copy without fee all or
        !            18: part of this material is granted provided that the copies are not made or 
        !            19: distributed for direct commercial advantage, the ACM copyright notice and the 
        !            20: title of the publication and its date appear, 
        !            21: and notice is given that copying is by permission of the Association for
        !            22: Computing Machinery.
        !            23: To copy otherwise, or to republish, requires a fee and/or specific permission.))
        !            24: 
        !            25: @blankspace(2 lines)
        !            26: 
        !            27: @begin(abstract)
        !            28: 
        !            29: An overview of the X Window System is presented, focusing on the system
        !            30: substrate and the low-level facilities provided to build applications and to
        !            31: manage the desktop.  The system provides high-performance, high-level,
        !            32: device-independent graphics.  A hierarchy of resizable, overlapping windows
        !            33: allows a wide variety of application and user interfaces to be built easily.
        !            34: Network-transparent access to the display provides an important degree of
        !            35: functional separation, without significantly affecting performance, that is
        !            36: crucial to building applications for a distributed environment.  To a
        !            37: reasonable extent, desktop management can be custom tailored to individual
        !            38: environments, without modifying the base system and typically without affecting
        !            39: applications.
        !            40: 
        !            41: Categories and Subject Descriptors:  C.2.2 [@b(Computer-Communication Networks)]:
        !            42: Network Protocols - @i(protocol architecture); C.2.4 [@b(Computer-Communication
        !            43: Networks)]: Distributed Systems - @i(distributed applications); D.4.4 [@b(Operating
        !            44: Systems)]: Communication Management - @i(network communication, terminal management);
        !            45: H.1.2 [@b(Information Systems)]: User/Machine Systems - @i(human factors); I.3.2
        !            46: [@b(Computer Graphics)]: Graphic Systems - @i(distributed/network graphics);
        !            47: I.3.4 [@b(Computer Graphics)]: Graphics Utilities - @i(graphics packages, software
        !            48: support); I.3.6 [@b(Computer Graphics)]: Methodology and Techniques - @i(device
        !            49: independence, interaction techniques)
        !            50: 
        !            51: General terms:  Design, Experimentation, Human Factors, Standardization
        !            52: 
        !            53: Additional Key Words and Phrases:  window systems, window managers, virtual terminals
        !            54: 
        !            55: @end(abstract)
        !            56: 
        !            57: @section(Introduction)
        !            58: 
        !            59: The X Window System (or simply X) developed at MIT has achieved fairly
        !            60: widespread popularity recently, particularly in the Unix@footnote( Unix is a
        !            61: trademark of AT&T Bell Laboratories.) community.  In this paper, we present an
        !            62: overview of X, focusing on the system substrate and the low-level facilities
        !            63: provided to build applications and to manage the desktop.  In X, this base
        !            64: window system provides high-performance graphics to a hierarchy of resizable
        !            65: windows.  Rather than mandating a particular user interface, X provides
        !            66: primitives to support several policies and styles.  Unlike most window systems,
        !            67: the base system in X is defined by a @i(network protocol):  asynchronous
        !            68: stream-based inter-process communication replaces the traditional procedure
        !            69: call or kernel call interface.  An application can utilize windows on any
        !            70: display in a network in a device-independent, network-transparent fashion.
        !            71: Interposing a network connection greatly enhances the utility of the window
        !            72: system, without significantly affecting performance.  The performance of
        !            73: existing X implementations is comparable to contemporary window systems, and in
        !            74: general is limited by display hardware rather than network communication.  For
        !            75: example, 19500 characters per second and 3500 short vectors per second are
        !            76: possible on Digital Equipment Corporation's VAXStation-II/GPX, both locally and
        !            77: over a local area network, and these figures are very close to the limits of
        !            78: the display hardware.
        !            79: 
        !            80: X is the result of the simultaneous need for a window system from two separate
        !            81: groups at MIT.  In the summer of 1984, the Argus system@cite(argus) at the
        !            82: Laboratory for Computer Science needed a debugging environment for
        !            83: multi-process distributed applications, and a window system seemed the only
        !            84: viable solution.  Project Athena@cite(athena) was faced with dozens, and
        !            85: eventually thousands, of workstations with bitmap displays, and needed a window
        !            86: system to make the displays useful.  Both groups were starting with the Digital
        !            87: VS100 display@cite(vs100) and VAX hardware, but it was clear at the outset that
        !            88: other architectures and displays had to be supported.  In particular, equal
        !            89: numbers of IBM workstations with bitmap displays of unknown type were expected
        !            90: eventually within Project Athena.  Portability was therefore a goal from the
        !            91: start.  Although all of the initial implementation work was for Berkeley Unix,
        !            92: it was clear that the network protocol should not depend on aspects of the
        !            93: operating system.
        !            94: 
        !            95: The name X derives from the lineage of the system.  At Stanford University,
        !            96: Paul Asente and Brian Reid had begun work on the W window system@cite(w), as an
        !            97: alternative to VGTS@cite(vgts1,vgts2) for the V system@cite(v).  Both VGTS and
        !            98: W allow network-transparent access to the display, using the synchronous V
        !            99: communication mechanism.  Both systems provide "text" windows for ASCII
        !           100: terminal emulation.  VGTS provides graphics windows driven by fairly high-level
        !           101: object definitions from a structured display file; W provides graphics windows
        !           102: based on a simple display-list mechanism, with limited functionality.  We
        !           103: acquired a Unix-based version of W for the VS100 (with synchronous
        !           104: communication over TCP@cite(tcp)) done by Asente and Chris Kent at Digital's
        !           105: Western Research Laboratory.  From just a few days of experimentation, it was
        !           106: clear that a network-transparent hierarchical window system was desirable, but
        !           107: that restricting the system to any fixed set of application-specific modes was
        !           108: completely inadequate.  It was also clear that, although synchronous
        !           109: communication was perhaps acceptable in the V system (due to very fast
        !           110: networking primitives), it was completely inadequate in most other operating
        !           111: environments.  X is our "reaction" to W.  The X window hierarchy comes directly
        !           112: from W, although numerous systems have been built with hierarchy in at least
        !           113: some form@cite(lucasfilm,star1,lispm,sunwin,mg1,genera,cedar,metheus,tajo).
        !           114: The asynchronous communication protocol used in X is a significant improvement
        !           115: over the synchronous protocol used in W, but is very similar to that used in
        !           116: Andrew@cite(wm,andrew).  X differs from all of these systems in the degree to
        !           117: which both graphics functions and "system" functions are pushed back (across
        !           118: the network) as application functions, and in the ability to transparently
        !           119: tailor desktop management.
        !           120: 
        !           121: The next section presents several high-level requirements that we believe a
        !           122: window system must satisfy to be a viable standard in a network environment,
        !           123: and indicates where the design of X fails to meet some of these requirements.
        !           124: In Section 3 we describe the overall X system model, and the effect of
        !           125: network-based communication on that model.  Section 4 describes the structure
        !           126: of windows, and the primitives for manipulating that structure.  Section 5
        !           127: explains the color model used in X, and Section 6 presents the text and
        !           128: graphics facilities.  Section 7 discusses the issues of window exposure and
        !           129: refresh, and their resolution in X.  Section 8 deals with input event handling.
        !           130: In Section 9, we describe the mechanisms for desktop management.
        !           131: 
        !           132: This paper describes the version@footnote( Version 10.) of X that is currently
        !           133: in widespread use.  The design of this version is inadequate in several
        !           134: respects.  With our experience to date, and encouraged by the number of
        !           135: universities and manufacturers taking a serious interest in X, we have designed
        !           136: a new version that should satisfy a significantly wider community.  Section 10
        !           137: discusses a number of problems with the current X design, and gives a general
        !           138: idea of what changes are contemplated.
        !           139: 
        !           140: @section(Requirements)
        !           141: 
        !           142: A window system contains many interfaces.  A @i(programming) interface is a
        !           143: library of routines and types provided in a programming language for
        !           144: interacting with the window system.  Both low-level (e.g., line drawing) and
        !           145: high-level (e.g., menus) interfaces are typically provided.  An @i(application)
        !           146: interface is the mechanical interaction with the user and the visual appearance
        !           147: that is specific to the application.  A @i(management) interface is the
        !           148: mechanical interaction with the user dealing with overall control of the
        !           149: desktop and the input devices.  The management interface defines how
        !           150: applications are arranged and rearranged on the screen, and how the user
        !           151: switches between applications; an individual application interface defines how
        !           152: information is presented and manipulated within that application.  The @i(user)
        !           153: interface is the sum total of all application and management interfaces.
        !           154: 
        !           155: Besides applications, we distinguish three major components of a window system.
        !           156: The @i(window manager)@footnote( Some people use this term for what we call the
        !           157: base window system; that is not the meaning here.) implements the desktop
        !           158: portion of the management interface; it controls the size and placement of
        !           159: application windows, and also may control application window attributes such as
        !           160: titles and borders.  The @i(input manager) implements the remainder of the
        !           161: management interface; it controls which applications see input from which
        !           162: devices (e.g., keyboard and mouse).  The @i(base window system) is the
        !           163: substrate on which applications, window managers, and input managers are built.
        !           164: 
        !           165: In this paper we are concerned with the base window system of X, with the
        !           166: facilities it provides to build applications and managers.  The following
        !           167: requirements on the base window system crystallized during the design of X (a
        !           168: few were not formulated until late in the design process):
        !           169: 
        !           170: @begin(enumerate)
        !           171: 
        !           172: @begin(multiple)
        !           173: 
        !           174: The system should be implementable on a variety of displays.
        !           175: 
        !           176: The system should work with nearly any bitmap display, and a variety of input
        !           177: devices.  Our design focused on workstation-class display technology likely to
        !           178: be available in a university environment over the next few years.  At one end
        !           179: of the spectrum is a simple frame buffer and monochrome monitor, driven
        !           180: directly by the host CPU with no additional hardware support.  At the other end
        !           181: of the spectrum is a multi-plane display with color monitor, driven by a
        !           182: high-performance graphics co-processor.  Input devices such as keyboards, mice,
        !           183: tablets, joysticks, light pens, and touch screens should be supported.
        !           184: 
        !           185: @end(multiple)
        !           186: @begin(multiple)
        !           187: 
        !           188: Applications must be device independent.
        !           189: 
        !           190: There are several aspects to device independence.  Most importantly, it must
        !           191: not be necessary to rewrite, recompile, or even relink an application for each
        !           192: new hardware display.  Nearly as important, every graphics function defined by
        !           193: the system should work on virtually every supported display; the alternative,
        !           194: which is to use GKS-style inquire operations@cite(gks) to determine the set of
        !           195: implemented functions at run-time, leads to tedious case analysis in every
        !           196: application, and to inconsistent user interfaces.  A third aspect of device
        !           197: independence is that, as far as possible, applications should not need dual
        !           198: control paths to work on both monochrome and color displays.
        !           199: 
        !           200: @end(multiple)
        !           201: @begin(multiple)
        !           202: 
        !           203: The system must be network transparent:  an application running on one
        !           204: machine must be able to utilize a display on some other machine.  The two
        !           205: machines should not have to have the same architecture or operating system.
        !           206: 
        !           207: There are numerous examples of why this is important:  a compute-intensive VLSI
        !           208: design program executing on a mainframe, but displaying results on a
        !           209: workstation; an application distributed over several stand-alone processors,
        !           210: but interacting with a user at a workstation; a professor running a program on
        !           211: one workstation, presenting results simultaneously on all student workstations.
        !           212: 
        !           213: In a network environment, there are certain to be applications that must run on
        !           214: particular machines or architectures.  Examples include proprietary software,
        !           215: applications depending on specific architectural properties, and programs
        !           216: manipulating large databases.  Such applications still should be accessible to
        !           217: all users.  In a truly heterogeneous environment, not all programming languages
        !           218: and programming systems are supported on all machines, and it is very
        !           219: undesirable to have to write an interactive front end in multiple languages in
        !           220: order to make the application generally available.  With network-transparent
        !           221: access, this is not necessary; a single front end written in the same language
        !           222: as the application suffices.
        !           223: 
        !           224: One might think that remote display will be extremely infrequent, and that
        !           225: performance therefore is much less important than for local display.
        !           226: Experience at MIT, however, indicates that many users routinely make use of the
        !           227: remote display capabilities in X, and that the performance of remote display is
        !           228: quite important.  The desktop display, although physically connected to a
        !           229: single computer, is used as a true @i(network virtual terminal); indeed, the
        !           230: idea of an X server (see the next section) built into a Blit-like
        !           231: terminal@cite(blit) is an intriguing one.
        !           232: 
        !           233: @end(multiple)
        !           234: @begin(multiple)
        !           235: 
        !           236: The system must support multiple applications displaying concurrently.
        !           237: 
        !           238: For example, it should be possible to display a clock with a sweep second hand
        !           239: in one window, while simultaneously editing a file in another window.
        !           240: 
        !           241: @end(multiple)
        !           242: @begin(multiple)
        !           243: 
        !           244: The system should be capable of supporting many different application and
        !           245: management interfaces.
        !           246: 
        !           247: No single user interface is "best"; different communities have radically
        !           248: different ideas about user interfaces.  Even within a single community,
        !           249: "experts" and "novices" place different demands on an interface.  Rather than
        !           250: mandating a particular user interface, the base window system should support a
        !           251: wide range of interfaces.
        !           252: 
        !           253: To achieve this, the system must provide @i(hooks) (mechanism) rather than
        !           254: @i(religion) (policy).  For example, since menu styles and semantics vary
        !           255: dramatically among different user interfaces, the base window system must
        !           256: provide primitives from which menus can be built, rather than just providing a
        !           257: fixed menu facility.
        !           258: 
        !           259: The system should be designed in such a way that it is possible to implement
        !           260: management policy both external to the base window system and external to
        !           261: applications.  Applications should be largely independent of management policy
        !           262: and mechanism; applications should @i(react to) management decisions, rather
        !           263: than @i(directing) those decisions.  For example, an application needs to be
        !           264: informed when one of its windows is resized, and should react by reformatting
        !           265: the information displayed, but involvement of the application should not be
        !           266: required in order for the user to change the size.  Making applications
        !           267: management-independent, as well as device-independent, facilitates the sharing
        !           268: of applications between diverse cultures.
        !           269: 
        !           270: @end(multiple)
        !           271: @begin(multiple)
        !           272: 
        !           273: The system must support overlapping windows, including output to partially
        !           274: obscured windows.
        !           275: 
        !           276: This is in some sense a by-product of the previous requirement, but is
        !           277: important enough to merit explicit statement.  Not all user interfaces allow
        !           278: windows to overlap arbitrarily.  However, even interfaces that do not allow
        !           279: application windows to overlap typically provide some form of pop-up menu that
        !           280: overlaps application windows.  If such menus are built from windows, then
        !           281: support for overlapping windows must exist.
        !           282: 
        !           283: @end(multiple)
        !           284: @begin(multiple)
        !           285: 
        !           286: The system should support a hierarchy of resizable windows, and an application
        !           287: should be able to use many windows at once.
        !           288: 
        !           289: Subwindows provide a clean, powerful mechanism for exporting much of the basic
        !           290: system machinery back to the application for direct use.  Many applications
        !           291: make use of their own window-like abstractions; some even implement what is
        !           292: essentially another window system, nested within the "real" window system.  It
        !           293: is important to support arbitrary levels of nesting.  What is viewed as a
        !           294: single window at one abstraction level may well require multiple subwindows at
        !           295: a lower level.  By providing a true window hierarchy, application windows can
        !           296: be implemented as true windows within the system, freeing the application from
        !           297: duplicating machinery such as clipping and input control.
        !           298: 
        !           299: @end(multiple)
        !           300: @begin(multiple)
        !           301: 
        !           302: The system should provide high-performance, high-quality support for text,
        !           303: 2-D synthetic graphics, and imaging.
        !           304: 
        !           305: The base window system must provide "immediate" or "transparent" graphics:  the
        !           306: application describes the image precisely, and the system does not attempt to
        !           307: second-guess the application.  The use of high-level models, whereby the
        !           308: application describes @i(what) it wants in terms of fairly abstract objects and
        !           309: the system determines @i(how) best to render the image, cannot be imposed as
        !           310: the only form of graphics interface.  Such models generally fail to provide
        !           311: adequate support for some important class of applications, and different user
        !           312: communities tend to have strong opinions about which model is "best".
        !           313: High-level models are extremely important to provide, but they should be built
        !           314: in layers on top of the base window system.
        !           315: 
        !           316: Support for 3-D graphics is not listed as a requirement, but this is not to say
        !           317: it is unimportant.  We simply have not considered 3-D graphics, due to lack of
        !           318: expertise and lack of time.
        !           319: 
        !           320: @end(multiple)
        !           321: @begin(multiple)
        !           322: The system should be extensible.
        !           323: 
        !           324: For example, the core system may not support 3-D graphics, but it should be
        !           325: possible to extend the system with such support.  The extension mechanism
        !           326: should allow communities to extend the system non-cooperatively, yet allow such
        !           327: independent extensions to be merged gracefully.
        !           328: 
        !           329: @end(multiple)
        !           330: @end(enumerate)
        !           331: 
        !           332: We believe that a window system must satisfy these requirements to be a viable
        !           333: standard in an environment of high-performance workstations and mainframes
        !           334: connected via high-performance local area networks.  X satisfies most of these
        !           335: requirements, but currently fails to satisfy a few due to practical
        !           336: considerations of staffing and time constraints:  the design and much of the
        !           337: implementation of the base window system were to be handled solely by the first
        !           338: author; it was important to get a working system up fairly quickly; and the
        !           339: immediate applications only required relatively simple text and graphics
        !           340: support.  As a result, X is not designed to handle high-end color displays or
        !           341: to deal with input devices other than a keyboard and mouse; some support for
        !           342: high-quality text and graphics is missing; X only provides support for one
        !           343: class of management policy; and no provision has been made for extensions.  As
        !           344: discussed in Section 10, these and other problems are being addressed in a
        !           345: redesign of X.
        !           346: 
        !           347: @begin(fullpagefigure)
        !           348: @blankspace(7 inches)
        !           349: @caption(System Structure)
        !           350: @end(fullpagefigure)
        !           351: 
        !           352: @section(System Model)
        !           353: 
        !           354: The X window system is based on a client-server model; this model follows
        !           355: naturally from requirements two and three in the previous section.  For each
        !           356: physical display, there is a controlling server.  A client application and a
        !           357: server communicate over a reliable duplex (8-bit) byte stream.  A simple block
        !           358: stream protocol is layered on top of the byte stream.  If the client and server
        !           359: are on the same machine, the stream is typically based on a local inter-process
        !           360: communication (IPC) mechanism, and otherwise a network connection is
        !           361: established between the pair.  Requiring nothing more than a reliable duplex
        !           362: byte stream (without urgent data) for communication makes X usable in many
        !           363: environments.  For example, the X protocol can be used over TCP@cite(tcp),
        !           364: DECnet@cite(decnet), and Chaos@cite(chaos).
        !           365: 
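The layering described above, a block protocol on top of a reliable duplex byte stream, can be sketched roughly as follows.  This is an illustration only:  the two-byte length prefix and the names send_block and recv_block are assumptions made for the example, not the X wire format.  A local socket pair stands in for either the local IPC mechanism or a network connection, since the block layer needs nothing beyond an ordered, reliable byte stream.

```python
import socket
import struct

# Hypothetical framing: each block is preceded by a 2-byte length field.
# This limits a block to 65535 bytes and is NOT the real X encoding.
HEADER = struct.Struct("!H")


def send_block(sock, payload):
    """Write one length-prefixed block to the byte stream."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock, count):
    """Read exactly `count` bytes; a stream may deliver them in pieces."""
    chunks = []
    while count > 0:
        chunk = sock.recv(count)
        if not chunk:
            raise ConnectionError("stream closed in the middle of a block")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)


def recv_block(sock):
    """Read one length-prefixed block from the byte stream."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)


def demo():
    """Send two requests one way and one event back over a duplex stream."""
    client, server = socket.socketpair()
    try:
        send_block(client, b"draw-line")
        send_block(client, b"draw-text")
        first = recv_block(server)
        second = recv_block(server)
        send_block(server, b"key-press")  # the stream is duplex
        event = recv_block(client)
        return first, second, event
    finally:
        client.close()
        server.close()
```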
        !           366: Multiple clients can have connections open to a server simultaneously, and a
        !           367: client can have connections open to multiple servers simultaneously.  The
        !           368: essential tasks of the server are to multiplex requests from clients to the
        !           369: display, and demultiplex keyboard and mouse input back to the appropriate
        !           370: clients.  Typically, the server is implemented as a single sequential process,
        !           371: using round-robin scheduling among the clients, and this centralized control
        !           372: trivially solves many synchronization problems; however, a multi-process server
        !           373: has also been implemented.  Although one might place the server in the kernel
        !           374: of the operating system in an attempt to increase performance, a user-level
        !           375: server process is vastly easier to debug and maintain, and performance under
        !           376: Unix in fact does not seem to suffer.  Similar performance results have been
        !           377: obtained in Andrew@cite(wm).  Various tricks are used in both clients and
        !           378: server to optimize performance, principally by minimizing the number of
        !           379: operating system calls@cite(hacks).
        !           380: 
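The round-robin scheduling mentioned above might look roughly like the following sketch.  It assumes, purely for illustration, that each client has a queue of pending requests and that the server handles one request per client per turn; the paper does not specify the granularity, and the name serve_round_robin is hypothetical.  Because a single sequential loop applies every request, no locking is needed around the display.

```python
from collections import deque


def serve_round_robin(queues):
    """Service pending requests one at a time, rotating among clients.

    `queues` maps a client name to a deque of its pending requests.
    Returns the (client, request) pairs in the order they were serviced.
    """
    serviced = []
    ring = deque(queues)  # client names in a fixed rotation order
    while any(queues[name] for name in ring):
        name = ring[0]
        ring.rotate(-1)  # move this client to the back of the ring
        if queues[name]:
            # One request per turn (an assumption of this sketch).
            serviced.append((name, queues[name].popleft()))
    return serviced
```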
        !           381: The server encapsulates the base window system.  It provides the fundamental
        !           382: resources and mechanisms, and the hooks required to implement various user
        !           383: interfaces.  All device dependencies are encapsulated by the server; the
        !           384: communication protocol between clients and the server is device independent.
        !           385: By placing all device dependencies on one end of a network connection,
        !           386: applications are truly device independent.  The addition of a new display type
        !           387: simply requires the addition of a new server implementation; no application
        !           388: changes are required.  Of course, the server itself is designed as device
        !           389: independent code layered on top of a device dependent core, so only the "back
        !           390: end" of the server need be reimplemented for each new display.@footnote( A back
        !           391: end has been implemented using a programming interface to X itself, such that a
        !           392: complete "recursive" X server executes inside a window of another X server.)
        !           393: 
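The split between device-independent server code and a device-dependent back end can be illustrated with a small sketch.  All names here (BackEnd, MonochromeFrameBuffer, ColorDisplay, draw_horizontal_line) are hypothetical and chosen for the example; they are not taken from any X server source.  The point is only that the drawing routine is written once against a narrow interface, and supporting a new display means supplying a new back end.

```python
from abc import ABC, abstractmethod


class BackEnd(ABC):
    """Device-dependent core: the only part rewritten per display."""

    @abstractmethod
    def set_pixel(self, x, y, value):
        """Store `value` at pixel (x, y) in a device-specific way."""


class MonochromeFrameBuffer(BackEnd):
    """Simple one-bit frame buffer held as rows of 0/1 values."""

    def __init__(self, width, height):
        self.rows = [[0] * width for _ in range(height)]

    def set_pixel(self, x, y, value):
        self.rows[y][x] = 1 if value else 0


class ColorDisplay(BackEnd):
    """Multi-plane display modelled as a sparse map of pixel values."""

    def __init__(self):
        self.pixels = {}

    def set_pixel(self, x, y, value):
        self.pixels[(x, y)] = value


def draw_horizontal_line(back_end, x0, x1, y, value):
    """Device-independent layer: identical code for every back end."""
    for x in range(x0, x1 + 1):
        back_end.set_pixel(x, y, value)
```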
        !           394: @subsection(Network Considerations)
        !           395: 
        !           396: It is extremely important for the server to be robust with respect to client
        !           397: failures.  The server, and the network protocol, must be designed so that the
        !           398: server never trusts clients to provide correct data.  As a corollary, the
        !           399: protocol must be designed so that, if the server ever has to wait for a
        !           400: response from a client, the server can continue servicing other
        !           401: clients.  Without this property, a buggy client or a network failure could
        !           402: easily cause the entire display to freeze up.
        !           403: 
        !           404: Byte ordering is a standard problem in network communication:  when a 16-bit or
        !           405: 32-bit quantity is transmitted over an 8-bit byte stream, is the most
        !           406: significant byte transmitted first (big-endian byte order) or is the least
        !           407: significant byte transmitted first (little-endian byte order)?  Some machines
        !           408: with byte-addressable memory use big-endian order internally, and others use
        !           409: little-endian order.  If a single order is chosen for network communication,
        !           410: some machines will suffer the overhead of swapping bytes, even when
        !           411: communicating with a machine using the same internal byte order.  Such an
        !           412: approach also means that both parties in the communication must worry about
        !           413: byte order.
        !           414: 
        !           415: The X protocol uses a different approach.  The server is designed to accept
        !           416: both big-endian and little-endian connections.  For example, using TCP this is
        !           417: accomplished by having the server listen on two distinct ports; little-endian
        !           418: clients connect to the server on one port, and big-endian clients connect on
        !           419: the other.  Clients always transmit and receive in their native byte order.
        !           420: The server alone is responsible for byte swapping, and byte swapping only
        !           421: occurs between dissimilar architectures.  This eliminates the byte swapping
        !           422: overhead in the most common situations, and greatly simplifies the building of
        !           423: client-side interface libraries in various programming languages.  X is not
        !           424: unique in its use of this trick; the current VGTS implementation does the
        !           425: same, and similar protocol optimizations have been used in various
        !           426: network-based applications.
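
The server side of this scheme can be sketched as follows.  This is only an
illustration: the function name and the two-field layout are assumptions, not
the actual X wire format.  The point is that the server selects a decoding
order per connection, so clients never swap bytes.

```python
import struct

# Hypothetical per-connection byte orders.  In the TCP case described above,
# the server would learn the order from the port the client connected to.
LITTLE_ENDIAN = "<"
BIG_ENDIAN = ">"

def decode_fields(data, byte_order):
    """Decode one 16-bit and one 32-bit unsigned field in the client's order."""
    return struct.unpack(byte_order + "HI", data)

# Two clients send the same logical values, each in its native byte order.
from_little = struct.pack("<HI", 0x1234, 0xDEADBEEF)
from_big = struct.pack(">HI", 0x1234, 0xDEADBEEF)

# The bytes on the wire differ, but the server recovers identical integers.
assert from_little != from_big
assert decode_fields(from_little, LITTLE_ENDIAN) == decode_fields(from_big, BIG_ENDIAN)
```

When the client and server share a byte order, the decode step is a plain
copy, which is the common case the design optimizes for.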
        !           427: 
        !           428: Another potential problem in protocol design is word alignment.  In particular,
        !           429: some architectures require 16-bit quantities to be aligned on 16-bit boundaries
        !           430: and 32-bit quantities to be aligned on 32-bit boundaries in memory.  To allow
        !           431: efficient implementations of the protocol across a spectrum of 16-bit and
        !           432: 32-bit architectures, the protocol is defined to consist of blocks that are
        !           433: always multiples of 32 bits, and each 16-bit and 32-bit quantity within a block
        !           434: is aligned on 16-bit and 32-bit boundaries, respectively.
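
The two alignment rules can be stated as small helper functions.  The names
below are hypothetical and the sketch does not reproduce any real X request
layout; it only shows padding a block to a multiple of 32 bits and checking
that a field starts on its natural boundary.

```python
def pad_to_32_bits(payload):
    """Pad a byte string with zero bytes up to the next multiple of 4 bytes."""
    pad = (-len(payload)) % 4
    return payload + b"\x00" * pad

def is_aligned(offset, size_in_bytes):
    """True if a field of the given size starts on its natural boundary."""
    return offset % size_in_bytes == 0

# A 5-byte string is padded to 8 bytes; a 4-byte string is left unchanged.
assert len(pad_to_32_bits(b"hello")) == 8
assert pad_to_32_bits(b"abcd") == b"abcd"

# A 16-bit field may start at offset 2, but a 32-bit field may not.
assert is_aligned(2, 2)
assert not is_aligned(2, 4)
```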
        !           435: 
        !           436: X is designed to operate in an environment where the inter-process
        !           437: communication round-trip time is between 5 and 50 milliseconds, both for local
        !           438: and for network communication.  We also assume that data transmission rates are
        !           439: comparable to display rates; for example, to transmit and display 5000
        !           440: characters per second, a data rate of approximately 50Kb (kilobits per second)
        !           441: will be needed, and to transmit and display 20000 characters per second, a data
        !           442: rate of approximately 200Kb will be needed.  Networks and protocol
        !           443: implementations with these characteristics are now quite commonplace.  For
        !           444: example, workstations running Berkeley Unix, connected via 10Mb (megabits per
        !           445: second) local area networks, typically have round-trip times of 15 to 30
        !           446: milliseconds, and data rates of 500Kb to 1Mb.
        !           447: 
        !           448: The round-trip time is important in determining the form of the communication
        !           449: protocol.  The most common communication will be text and graphics requests
        !           450: sent from a client to the server.  Examples of individual requests might be to
        !           451: draw a string of text or to draw a line.  Such requests could be sent either
        !           452: synchronously, in which case the client sends a request only after receiving a
        !           453: reply from the server to the previous request, or they could be sent
        !           454: asynchronously, without the server generating any replies.  However, since the
        !           455: requests are sent over a reliable stream, they are guaranteed to arrive, and
        !           456: arrive in order, so replies from the server to graphics requests serve no
        !           457: useful purpose.  Moreover, with round-trip times over 5 milliseconds, output to
        !           458: the display must be asynchronous, or it will be impossible to drive high-speed
        !           459: displays adequately.  For example, at 80 characters per request and a 25
        !           460: millisecond round-trip time, only 3200 characters per second can be drawn
        !           461: synchronously, whereas many hardware devices are capable of displaying between
        !           462: 5000 and 30000 characters per second.
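
The arithmetic behind this example is simple enough to check directly.  The
function below is a sketch with a hypothetical name; it assumes that each
synchronous request costs exactly one round trip and ignores all other
overhead.

```python
def synchronous_chars_per_second(chars_per_request, round_trip_ms):
    """Upper bound on throughput when each request waits for a reply."""
    requests_per_second = 1000 // round_trip_ms  # whole round trips per second
    return chars_per_request * requests_per_second

print(synchronous_chars_per_second(80, 25))  # 3200, the figure quoted above
```

Even at the 5 millisecond lower bound on round-trip time, the same
calculation gives 16000 characters per second, still below the fastest
displays mentioned above.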
        !           463: 
        !           464: Similarly, polling the server for keyboard and mouse input would be
        !           465: unacceptable in many applications, particularly those written in sequential
        !           466: languages.  For example, an application attempting to provide real-time
        !           467: response to input has to poll periodically for input during screen updates.
        !           468: For an application with a single thread of control, this effectively results in
        !           469: synchronous output, and consequent performance loss.  Hence, input must be
        !           470: generated asynchronously by the server, so that applications need at most
        !           471: perform local polling.
        !           472: 
        !           473: The round-trip time is also important in determining what user interfaces can
        !           474: be supported without embedding them directly in the server.  The most important
        !           475: concern is whether remote, application-level mouse tracking is feasible.  By
        !           476: @i(tracking), we do not mean maintaining the cursor image on the screen as the
        !           477: user moves the mouse; that function is performed autonomously by the X server,
        !           478: often directly in hardware.  Rather, applications track the mouse by animating
        !           479: some other image on the screen in real time as the mouse moves.  For round-trip
        !           480: times under 50 milliseconds, tracking is perfectly reasonable, driven either by
        !           481: motion events generated by the server or by continuous polling from the
        !           482: application.  With a refresh occurring up to 30 times every second, remote
        !           483: tracking is demonstrably "instantaneous" with mouse motion.
        !           484: 
        !           485: For tracking to be effective, however, relatively little time can be spent
        !           486: updating the display at each movement, so typically only relatively small
        !           487: changes can be made to the screen while tracking.  This is certainly the case
        !           488: for common operations, such as rubber banding window outlines and highlighting
        !           489: menu items.  It might be argued that the ability to run application-specific
        !           490: code in the server is required for acceptable hand-eye coordination during
        !           491: complex tracking.  For example, NeWS@cite(news) provides such a mechanism in a
        !           492: novel way.  However, we are not convinced there are sufficient benefits to
        !           493: justify such complexity.  Complex tracking typically is bound up intimately
        !           494: with application-specific data structures and knowledge representations, and
        !           495: such information is used by the "back end" of the application as well as the
        !           496: "front end".  In a distributed system it is folly to believe that applications
        !           497: will download large front ends into a server; communication round-trip times
        !           498: are a reality that cannot be escaped.
        !           499: 
        !           500: @subsection(Resources)
        !           501: 
        !           502: The basic resources provided by the server are windows, fonts, mouse cursors,
        !           503: and off-screen images; later sections describe each of these.  Clients request
        !           504: creation of a resource by supplying appropriate parameters (such as the name of
        !           505: the font); the server allocates the resource and returns a 31-bit unique
        !           506: identifier used to represent it.  The use and interpretation of a resource
        !           507: identifier is independent of any network connection.  Any client that knows (or
        !           508: guesses) the identifier for a resource can use and manipulate the resource
        !           509: freely, even if it was created by another client.  This capability is required
        !           510: to allow window managers to be written independently of applications, and to
        !           511: allow multi-process applications to manipulate shared resources.  However, to
        !           512: avoid problems associated with clients that fail to clean up their resources at
        !           513: termination (which is all too common in operating systems where users can
        !           514: unilaterally abort processes), the maximum lifetime of a resource is always
        !           515: tied to the connection over which it was created.  Thus, when a client
        !           516: terminates, all of the resources it created are destroyed automatically.
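
A minimal sketch of this lifetime rule follows.  The class and method names
are hypothetical and the identifiers are simple counters; the real server's
data structures are not described here.  Any connection may use an
identifier, but a resource disappears when its creating connection closes.

```python
class ResourceTable:
    """Toy model: resource ids are global, lifetimes are per connection."""

    def __init__(self):
        self._next_id = 1
        self._owner = {}  # resource id -> connection that created it

    def create(self, connection):
        rid = self._next_id
        self._next_id += 1
        self._owner[rid] = connection
        return rid

    def exists(self, rid):
        # No ownership check: any client that knows the id may use it.
        return rid in self._owner

    def close_connection(self, connection):
        """Destroy every resource created over the given connection."""
        doomed = [rid for rid, owner in self._owner.items() if owner == connection]
        for rid in doomed:
            del self._owner[rid]
        return doomed

table = ResourceTable()
window = table.create("client-a")
font = table.create("client-b")
table.close_connection("client-a")
assert not table.exists(window)  # destroyed with its creator
assert table.exists(font)        # other clients are unaffected
```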
        !           517: 
        !           518: Access control is performed only when a client attempts to establish a
        !           519: connection to the server; once the connection is established the client can
        !           520: freely manipulate any resource.  Since accidental manipulation of some other
        !           521: client's resource is extremely unlikely (both in theory and in practice), we
        !           522: believe introducing access control on a per-resource basis would only serve to
        !           523: decrease performance, not to significantly increase security or robustness.
        !           524: The current access control mechanism is based simply on host network addresses,
        !           525: as this information is provided by most network stream protocols, and there
        !           526: seems to be no widely used or even widely available user-level authentication
        !           527: mechanism.  Host-based access control has proven to be marginally acceptable in
        !           528: a workstation environment, but is rather unacceptable for time-shared
        !           529: machines.@footnote( It is interesting that @i(professors) at MIT have argued
        !           530: vociferously to disable all access control.)
        !           531: 
        !           532: Each client-generated protocol request is a simple data block consisting of an
        !           533: opcode, some number of fixed-length parameters, and possibly a variable-length
        !           534: parameter.  For example, to display text in a window, the fixed-length
        !           535: parameters include the drawing color and the identifiers for the window and the
        !           536: font, and the variable-length parameter is the string of characters.  All
        !           537: operations on a resource explicitly contain the identifier of the resource as a
        !           538: parameter.  In this way, an application can multiplex use of many windows over
        !           539: a single network connection.  This multiplexing makes it easy for the client to
        !           540: control the time-order of updates to multiple windows.  Similarly, each input
        !           541: event generated by the server contains the identifier of the window in which
        !           542: the event occurred.  Multiplexing over a single stream allows the client to act
        !           543: on events from multiple windows in correct time order; timestamps alone are
        !           544: inadequate without strong guarantees from the stream mechanism.
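
The shape of such a request can be illustrated with a small encoder and
decoder.  The field widths, field order, and byte order below are assumptions
chosen for the example, not the actual X encoding; the sketch only shows an
opcode, fixed-length parameters that name the resources explicitly, and one
variable-length parameter padded to a 32-bit boundary.

```python
import struct

# Hypothetical header: opcode, text length, window id, font id, pixel value.
HEADER = "<HHIII"
HEADER_SIZE = struct.calcsize(HEADER)  # 16 bytes, every field naturally aligned

def encode_text_request(opcode, window_id, font_id, pixel, text):
    """Build one request block: fixed header, then the text padded to 4 bytes."""
    header = struct.pack(HEADER, opcode, len(text), window_id, font_id, pixel)
    padding = b"\x00" * ((-len(text)) % 4)
    return header + text + padding

def decode_text_request(block):
    """Recover the parameters; the padding is dropped using the length field."""
    opcode, length, window_id, font_id, pixel = struct.unpack_from(HEADER, block, 0)
    text = block[HEADER_SIZE:HEADER_SIZE + length]
    return opcode, window_id, font_id, pixel, text

block = encode_text_request(7, 0x100, 0x200, 1, b"hello")
assert len(block) % 4 == 0
assert decode_text_request(block) == (7, 0x100, 0x200, 1, b"hello")
```

Because the window identifier travels inside every request, requests for
different windows can be interleaved on one stream in whatever order the
client chooses.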
        !           545: 
        !           546: Numerous Unix-based window
        !           547: systems@cite(masscomp,andrew,sapphire,pnx,sunwin,mg1,metheus) use file or
        !           548: channel descriptors to represent windows; window creation involves an
        !           549: interaction with the operating system, which results in the creation of such a
        !           550: descriptor.  Typically, this means the window cannot be named (and hence cannot
        !           551: be shared) by programs running on different machines, and perhaps not even by
        !           552: programs running on the same machine.  More seriously, there is often a severe
        !           553: restriction on the number of active descriptors a process may have:  20 on
        !           554: older systems and usually 64 on newer systems.  The use of 50 or more windows
        !           555: (albeit nested inside a single top-level window) is quite common in X
        !           556: applications.  The use of a single connection, over which an arbitrary number
        !           557: of windows can be multiplexed, is clearly a better approach.
        !           558: 
        !           559: @section(Window Hierarchy)
        !           560: 
        !           561: The server supports an arbitrarily branching hierarchy of rectangular windows.
        !           562: At the top is the @i(root) window, which covers the entire screen.  The
        !           563: @i(top-level) windows of applications are created as subwindows of the root
        !           564: window.  The window hierarchy models the now-familiar "stacks of papers"
        !           565: desktop.  For a given window, its subwindows can be stacked in any order, with
        !           566: arbitrary overlaps.  When window W1 partially or completely covers window W2,
        !           567: we say that W1 @i(obscures) W2.  This relationship is not restricted to
        !           568: siblings; if W1 obscures W2, then W1 may also obscure subwindows of W2.  A
        !           569: window also obscures its parent.  Window hierarchies never interleave; if
        !           570: window W1 obscures sibling window W2, then subwindows of W2 never obscure W1 or
        !           571: subwindows of W1.  A window is not restricted in size or placement by the
        !           572: boundaries of its parent, but a window is always visibly clipped by its parent:
        !           573: portions of the window that extend outside the boundaries of the parent are
        !           574: never displayed, and do not obscure other windows.  Finally, a window can be
        !           575: either @i(mapped) or @i(unmapped).  An unmapped window is never visible on the
        !           576: screen; a mapped window can be visible only if all of its ancestors are also
        !           577: mapped.
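
The mapping rule amounts to a walk up the hierarchy.  The sketch below uses a
hypothetical Window class and checks only the necessary condition stated
above; it does not model obscuring or clipping, which also affect what is
actually displayed.

```python
class Window:
    """Toy window node: a parent link and a mapped flag."""

    def __init__(self, parent=None, mapped=False):
        self.parent = parent
        self.mapped = mapped

def can_be_visible(window):
    """True only if the window and every ancestor up to the root are mapped."""
    while window is not None:
        if not window.mapped:
            return False
        window = window.parent
    return True

root = Window(mapped=True)
top_level = Window(parent=root, mapped=False)
button = Window(parent=top_level, mapped=True)

assert not can_be_visible(button)  # mapped, but its parent is unmapped
top_level.mapped = True
assert can_be_visible(button)      # now every ancestor is mapped
```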
        !           578: 
        !           579: Output to a leaf window (one with no subwindows) is always clipped to the
        !           580: visible portions of the window; drawing on such a window never draws into
        !           581: obscuring windows.  Output to a window that contains subwindows can be
        !           582: performed in two modes.  In @i(clipped) mode the output is clipped normally by
        !           583: all obscuring windows (including subwindows), but in @i(draw-through) mode the
        !           584: output is not clipped by subwindows.  For example, draw-through mode is used on
        !           585: the root window during window management, tracking the mouse with the outline
        !           586: of a window to indicate how the window is to be moved or resized.  If clipped
        !           587: mode were used instead, the entire outline would not be visible.
        !           588: 
        !           589: The coordinate system is defined with the X axis horizontal and the Y axis
        !           590: vertical.  Each window has its own coordinate system, with the origin at the
        !           591: upper left corner of the window.  Having per-window coordinate systems is
        !           592: crucial, particularly for top-level windows; applications are almost always
        !           593: designed to be insensitive to their position on the screen, and having to worry
        !           594: about race conditions when moving windows would be a disaster.  The coordinate
        !           595: system is discrete: each pixel in the window corresponds to a single unit in
        !           596: the coordinate system, with coordinates centered on the pixels, and all
        !           597: coordinates are expressed as integers in the protocol.  We believe fractional
        !           598: coordinates are not required at the protocol level for the raster graphics
        !           599: provided in X (see Section 6), although they may be required for high-end color
        !           600: graphics, such as anti-aliasing.  The aspect ratio of the screen is not masked
        !           601: by the protocol, since we believe that most displays have a one-to-one aspect
        !           602: ratio; in this regard X is arguably device dependent.
        !           603: 
        !           604: Although the coordinate system is discrete at the protocol level, continuous or
        !           605: alternate-origin coordinate systems certainly can be used at the application
        !           606: level, but client-side libraries must eventually translate to the discrete
        !           607: coordinates defined by the protocol.  In this way, we can ignore the many
        !           608: variations in floating-point (or even fixed-point) formats among architectures.
        !           609: Further, the coordinates can be expressed in the protocol as 16-bit quantities,
        !           610: which can be manipulated efficiently in virtually every machine/display
        !           611: architecture, and which minimizes the number of data bytes transmitted over the
        !           612: network.  The use of 16-bit quantities does have a drawback, in that some
        !           613: applications (particularly CAD tools) like to perform zoom operations simply by
        !           614: scaling coordinates and redrawing, relying on the window system to clip
        !           615: appropriately.  Since scaling quickly overflows 16 bits, additional clipping
        !           616: must be performed explicitly by such applications.
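
A sketch of the extra client-side step follows.  It assumes the 16-bit
coordinates are signed, a positive integer zoom factor, and an extent given
with its low end first; the function names are hypothetical.  Clamping is
shown for a one-dimensional extent, as used for the edges of an axis-aligned
filled rectangle.  Lines and other shapes need real geometric clipping, since
clamping an endpoint would change the slope.

```python
INT16_MIN, INT16_MAX = -32768, 32767  # assumed signed 16-bit coordinate range

def fits_in_16_bits(value):
    return INT16_MIN <= value <= INT16_MAX

def scale_and_clip_span(lo, hi, zoom):
    """Scale the extent [lo, hi] and clamp it to the 16-bit range.

    Returns None if the scaled extent lies entirely outside the range.
    """
    lo, hi = lo * zoom, hi * zoom
    if hi < INT16_MIN or lo > INT16_MAX:
        return None
    return max(lo, INT16_MIN), min(hi, INT16_MAX)

assert scale_and_clip_span(100, 200, 10) == (1000, 2000)     # fits unchanged
assert scale_and_clip_span(1000, 5000, 10) == (10000, 32767)  # high edge clamped
assert scale_and_clip_span(4000, 5000, 10) is None            # wholly out of range
```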
        !           617: 
        !           618: A window can optionally have a @i(border), a shaded outer frame maintained
        !           619: explicitly by the X server.  The origin of the window's coordinate system is
        !           620: inside the border, and output to the window is clipped automatically so as not
        !           621: to extend into the border.  The presence of borders slightly complicates the
        !           622: semantics of the window system; for simplicity we will ignore them in the
        !           623: remainder of this paper.
        !           624: 
        !           625: The basic operations on window structure are straightforward.  An unmapped
        !           626: window is created by specifying the parent window, the position within the
        !           627: parent of the upper left corner of the new window, and the width and height (in
        !           628: coordinate units) of the new window.  A window can be destroyed, in which case
        !           629: all windows below it in the hierarchy are also destroyed.  A window can be
        !           630: mapped and unmapped, without changing its position.  A window can be moved and
        !           631: resized, including being moved and resized simultaneously.  A window can also
        !           632: be "depthwise" raised to the top or lowered to the bottom of the stack with
        !           633: respect to its siblings, without changing its coordinate position.  Currently,
        !           634: mapping or configuring a window forces the window to be raised.  This
        !           635: restriction appeared to simplify the server implementation, but also happened
        !           636: to match the basic management interface we expected to build.  This restriction
        !           637: will be eliminated in the next version.
        !           638: 
        !           639: The windows described above are the usual @i(opaque) windows.  X also provides
        !           640: @i(transparent) windows.  A transparent window is always invisible on the
        !           641: screen, and does not obscure output to, or visibility of, other windows.
        !           642: Output to a transparent window is clipped to that window, but is actually drawn
        !           643: on the parent window.  Thus, for output, a transparent window is simply a
        !           644: clipping rectangle that can be applied to restrict output within a (parent)
        !           645: window.  Input processing for transparent and opaque windows is identical, as
        !           646: described in Section 8.  In Section 10 we will argue that most uses of
        !           647: transparent windows are better satisfied with other mechanisms.  Therefore, for
        !           648: simplicity, we will ignore transparent windows in the rest of this paper.
        !           649: 
        !           650: The X server is designed explicitly to make windows inexpensive.  Our goal was
        !           651: to make it reasonable to use windows for such things as individual menu items,
        !           652: buttons, even individual items in forms and spreadsheets.  As such, the server
        !           653: must deal efficiently with hundreds (though not necessarily thousands) of
        !           654: windows on the screen simultaneously.  Experience with X has shown that many
        !           655: implementors find this capability extremely useful.
        !           656: 
        !           657: @section(Color)
        !           658: 
        !           659: The screen is viewed as two dimensional, with an N-bit @i(pixel) value stored
        !           660: at each coordinate.  The number of bits in a pixel value, and how a value
        !           661: translates into a color, depends on the hardware.  X is designed to support two
        !           662: types of hardware:  monochrome and pseudo-color.  A monochrome display has one
        !           663: bit per pixel, and the two values translate into black and white.  Pseudo-color
        !           664: displays typically have between four and twelve bits per pixel; the pixel value
        !           665: is used as an index into a color map, yielding red, green, and blue
        !           666: intensities.  The color map can be changed dynamically, so that a given pixel
        !           667: value can represent different colors over time.  Gray-scale is viewed as a
        !           668: degenerate case of pseudo-color.
        !           669: 
        !           670: We desire a design matching most display hardware, while abstracting
        !           671: differences in such a way that programmers do not have to double- or triple-code
        !           672: their applications to cover the spectrum.  We also want multiple applications
        !           673: to coexist within a single color map, so that applications always show true
        !           674: color on the screen.  To allow this, and to keep applications device
        !           675: independent, pixel values should not be coded explicitly into applications.
        !           676: Instead, the server must be responsible for managing the color map, and color
        !           677: map allocation must be expressed in hardware-independent terms.
        !           678: 
        !           679: All graphics operations in X are expressed in terms of pixel values.  For
        !           680: example, to draw a line, one specifies not only the coordinates of the
        !           681: end-points but the pixel value with which to draw the line.  (Logic functions
        !           682: and plane-select masks are also specified, as described in Section 6.)  On a
        !           683: monochrome display, the only two pixel values are zero and one, which are
        !           684: (somewhat arbitrarily) defined to be black and white, respectively.  On a
        !           685: pseudo-color display, pixel values zero and one are pre-allocated by the
        !           686: server, for use as "black" and "white", so that monochrome applications display
        !           687: correctly on color displays.  Of course, the actual colors need not be black
        !           688: and white, but can be set by the user.
        !           689: 
        !           690: There are two ways for a client to obtain pixel values.  In the simplest
        !           691: request, the client specifies red, green, and blue color values, and the server
        !           692: allocates an arbitrary pixel value and sets the color map so the pixel value
        !           693: represents the closest color the hardware can provide.  The color map entry for
        !           694: this pixel value cannot be changed by the client, so if some other client
        !           695: requests an equivalent color, the server is free to respond with the same pixel
        !           696: value.  Such sharing is important in maximizing use of the color map.  To
        !           697: isolate applications from variations in color representation among displays
        !           698: (due, for example, to the standard of illumination used for calibration), the
        !           699: server provides a color database which clients can use to translate string
        !           700: names of colors into red, green, and blue values tailored for the particular
        !           701: display.
        !           702: 
        !           703: The second request allocates writable map entries.  This mechanism was designed
        !           704: explicitly for X; we are not aware of a comparable mechanism in any other
        !           705: window system.  The client specifies two numbers, @i(C) and @i(P), with @i(C)
        !           706: positive and @i(P) non-negative; the request can be expressed as "allocate
        !           707: @i(C) colors and @i(P) planes".  The total number of pixel values allocated by
        !           708: the server is @i(C*2@+(P)).  The values passed back to the client consist of
        !           709: @i(C) base pixel values, and a plane mask containing @i(P) bits.  None of the
        !           710: base pixel values have any one bits in common with the plane mask, and the
        !           711: complete set of allocated pixel values is obtained by combining all possible
        !           712: combinations of one bits from the plane mask with each of the base pixel
        !           713: values.  The client can optionally require the @i(P) planes to be contiguous,
        !           714: in which case all @i(P) bits in the plane mask will be contiguous.
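
The relationship between the returned values and the allocated set can be
made concrete.  The function below is a sketch with a hypothetical name, and
the base pixel values and plane mask in the example are made up; it
enumerates every pixel value implied by a reply and confirms the count of
@i(C*2@+(P)).

```python
from itertools import combinations

def allocated_pixels(base_pixels, plane_mask):
    """List every pixel value implied by the base pixels and the plane mask."""
    plane_bits = [1 << i for i in range(plane_mask.bit_length())
                  if plane_mask & (1 << i)]
    pixels = []
    for base in base_pixels:
        if base & plane_mask:
            raise ValueError("base pixel shares a one bit with the plane mask")
        # OR in every subset of the plane bits, including the empty subset.
        for count in range(len(plane_bits) + 1):
            for subset in combinations(plane_bits, count):
                pixels.append(base | sum(subset))
    return pixels

# Made-up reply for C = 2 colors and P = 2 planes.
pixels = allocated_pixels([0b0001, 0b0010], 0b1100)
assert len(pixels) == 2 * 2 ** 2
assert sorted(pixels) == [1, 2, 5, 6, 9, 10, 13, 14]
```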
        !           715: 
        !           716: There are three common uses of this second request.  One is simply to allocate
        !           717: a number of "unrelated" pixel values; in this case, @i(P) will be zero.  A
        !           718: second use is in imaging applications, where it is convenient to be able to
        !           719: perform simple arithmetic on pixel values.  In this case, a contiguous block of
        !           720: pixel values is allocated by setting @i(C) to one and @i(P) to the log (base 2)
        !           721: of the number of pixel values required, and requesting contiguous allocation.
        !           722: Arithmetic on the pixel values then requires at most some additional shift and
        !           723: mask operations.
        !           724: 
        !           725: A third form of allocation arises in applications that want some form of
        !           726: overlay graphics, such as highlighting or outlining regions.  Here the
        !           727: requirement is to be able to draw and then erase graphics without disturbing
        !           728: existing window contents.  For example, suppose an application typically uses
        !           729: four colors, but needs to be able to overlay a rectangle outline in a fifth
        !           730: color.  An allocation request with C set to four and P set to one results in
        !           731: two groups of four pixel values.  The four base pixel values are assigned the
        !           732: four normal colors, and the four alternate pixel values are all assigned the
        !           733: fifth color.  Overlay graphics can then be drawn by restricting output (see the
        !           734: next section) to the single bit plane specified in the mask returned by the
        !           735: color allocation.  Turning bits in this plane on (to ones) changes the image to
        !           736: the fifth color, and turning them off reverts the image to its original color.
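
The overlay example can be traced with a toy color map.  The pixel values,
the plane mask, and the color names below are made up for illustration; a
real allocation would return whatever values the server chose.

```python
# Hypothetical reply to "allocate C = 4 colors and P = 1 plane".
base_pixels = [8, 9, 10, 11]
plane_mask = 0b100  # shares no one bits with any base pixel

# The client fills in the color map: four normal colors, one overlay color.
color_map = {}
for pixel, color in zip(base_pixels, ["red", "green", "blue", "yellow"]):
    color_map[pixel] = color
    color_map[pixel | plane_mask] = "overlay"

def draw_overlay(pixel):
    """Turn the overlay plane on; the other planes are left untouched."""
    return pixel | plane_mask

def erase_overlay(pixel):
    """Turn the overlay plane off, restoring the original pixel value."""
    return pixel & ~plane_mask

for pixel in base_pixels:
    highlighted = draw_overlay(pixel)
    assert color_map[highlighted] == "overlay"
    assert erase_overlay(highlighted) == pixel
```

Because drawing touches only the single overlay plane, the original color
information survives in the remaining planes and erasing needs no saved copy
of the window contents.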
        !           737: 
        !           738: @section(Graphics and Text)
        !           739: 
        !           740: Graphics operations are often the most complex part of any window system,
        !           741: simply because so many different effects and variations are required to satisfy
        !           742: a wide range of applications.  In this section we sketch the operations
        !           743: provided in X, so that the basic level of graphics support can be understood.
        !           744: The operations are essentially a subset of the Digital Workstation Graphics
        !           745: Architecture; the VS100 display@cite(vs100) implements this architecture for
        !           746: 1-bit pixel values.  The set of operations purposely was kept simple, in order
        !           747: to maximize portability.
        !           748: 
        !           749: Graphics operations in X are expressed in terms of relatively high-level
        !           750: concepts, such as lines, rectangles, curves, and fonts.  This is in contrast to
        !           751: systems in which the basic primitives are to read and write individual pixels.
        !           752: Basing applications on pixel-level primitives works well when display memory
        !           753: can be mapped into the application's address space for direct manipulation.
        !           754: However, both display hardware and operating systems exist for which such
        !           755: direct access is not possible, and emulating pixel-level manipulations in such
        !           756: an environment results in extremely poor performance.  Expressing operations at
        !           757: a higher level avoids such device dependencies, and also avoids potential
        !           758: problems with network bandwidth.  With high-level operations, a protocol
        !           759: request transmitted as a small number of bits over the network typically
        !           760: affects ten to one hundred times as many pixels on the screen.
        !           761: 
        !           762: @subsection(Images)
        !           763: 
        !           764: Two forms of off-screen images are supported in X:  bitmaps and pixmaps.  A
        !           765: bitmap is a single-plane (bit) rectangle.  A pixmap is an @i(N)-plane (pixel)
        !           766: rectangle, where @i(N) is the number of bits per pixel used by the particular
        !           767: display.  A bitmap or pixmap can be created by transmitting all of the bits to
        !           768: the server; a pixmap can also be created by copying a rectangular region of a
        !           769: window.  Bitmaps and pixmaps of arbitrary size can be created.  Transmitting
        !           770: very large (or deep) images over a network connection can be quite slow;
        !           771: however, the ability to make use of shared memory in conjunction with the IPC
        !           772: mechanism would help enormously when the client and server are on the same
        !           773: machine.
        !           774: 
        !           775: The primary use of bitmaps is as masks (clipping regions).  Several graphics
        !           776: requests allow a bitmap to be used as a clipping region@cite(warnock).  Bitmaps
        !           777: are also used to construct cursors, as described in Section 8.  Pixmaps are
        !           778: used to store frequently drawn images, and as temporary backing-store for
        !           779: pop-up menus (as described in Section 8).  However, the principal use of
        !           780: pixmaps is as tiles, that is, as patterns which are replicated in two
        !           781: dimensions to cover a region.  Since there are often hardware restrictions as
        !           782: to what tile shapes can be replicated efficiently, guaranteed shapes are not
        !           783: defined by the X protocol.  An application can query the server to determine
        !           784: what shapes are supported, although to date most applications simply assume 16
        !           785: by 16 tiles are supported.  A better semantics would be to support arbitrary shapes,
        !           786: but allow applications to query as to which shapes are most efficient.
        !           787: 
        !           788: The tiling origin used in X is almost always the origin of the destination
        !           789: window.  That is, if enough tiles were laid out, one tile would have its upper
        !           790: left corner at the upper left corner of the window.  In this way, the contents
        !           791: of the window are independent of the window's position on the screen, and the
        !           792: window can be moved transparently to the application.
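A minimal sketch of what a window-relative tiling origin implies, written in Python. The function name tile_pixel and the list-of-rows tile representation are assumptions of the sketch, not part of X.

```python
def tile_pixel(tile, x, y):
    """Pixel value that tiling places at window-relative (x, y).

    `tile` is a list of rows.  The tiling origin is taken to be the
    window origin, so no screen coordinates enter the computation.
    """
    height = len(tile)
    width = len(tile[0])
    return tile[y % height][x % width]

checker = [[1, 0],
           [0, 1]]
corner = tile_pixel(checker, 0, 0)   # upper left corner of the window
```

Since the lookup depends only on coordinates relative to the window, moving the window on the screen leaves every pixel of its tiled contents unchanged.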
        !           793: 
        !           794: Servers vary widely in the amount of off-screen memory provided.  For example,
        !           795: some servers limit off-screen memory to that accessible directly to the
        !           796: graphics processor (typically one to three times the size of screen memory),
        !           797: and fonts and other resources are allocated from this same pool.  Other servers
        !           798: utilize their entire virtual address space for off-screen memory.  Since
        !           799: off-screen memory for images is finite, an explicit part of the X protocol is
        !           800: the possibility that bitmap or pixmap creation can fail.  Depending on the
        !           801: intended use of the image, the application may or may not be able to cope with
        !           802: the failure.  For example, if the image was being stored simply to speed up
        !           803: redisplay, the application can always transmit the image directly each time
        !           804: (see below).  If the image was to be a temporary backing-store for a window,
        !           805: the application can fall back on normal exposure processing (as described in
        !           806: Section 7).  Servers should be constructed in such a way as to virtually
        !           807: guarantee sufficient memory (e.g., by caching images) for creating at least
        !           808: small tiles and cursors, although this is not true in current implementations.
        !           809: 
        !           810: @subsection(Graphics)
        !           811: 
        !           812: All graphics and text requests include a logic function and a plane-select mask
        !           813: (an integer with the same number of bits as a pixel value) to modify the
        !           814: operation.  All sixteen logic functions are provided.  Given a source and
        !           815: destination pixel, the function is computed bitwise on corresponding bits of
        !           816: the pixels, but only on bits specified in the plane-select mask.  Thus the
        !           817: result pixel is computed as
        !           818: @begin(format, leftmargin +5)
        !           819: ((source FUNC destination) AND mask) OR (destination AND (NOT mask))
        !           820: @end(format)
        !           821: The most common operation is simply replacing the destination with the source in
        !           822: all planes.
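The formula can be exercised with a short Python sketch. The encoding of the sixteen logic functions as a 4-bit truth table, the names combine, COPY and XOR, and the default depth of 8 planes are assumptions of the sketch and are not taken from the X protocol.

```python
def combine(func, source, destination, mask, depth=8):
    """Apply a logic function to two pixels under a plane-select mask.

    `func` is a 4-bit truth table: bit (2*s + d) holds the result for
    source bit s and destination bit d.  This encoding is an assumption.
    """
    all_planes = (1 << depth) - 1
    raw = 0
    for plane in range(depth):
        s = (source >> plane) & 1
        d = (destination >> plane) & 1
        raw |= ((func >> (2 * s + d)) & 1) << plane
    # ((source FUNC destination) AND mask) OR (destination AND (NOT mask))
    return ((raw & mask) | (destination & ~mask)) & all_planes

COPY = 0b1100   # result bit equals the source bit
XOR = 0b0110    # result bit is source XOR destination

replaced = combine(COPY, 0xA5, 0x0F, 0xFF)   # replace in all planes
partial = combine(COPY, 0xA5, 0x0F, 0xF0)    # replace only the high planes
```

With a 4-bit table there are exactly sixteen possible values of func, one for each two-input logic function.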
        !           823: 
        !           824: The simplest graphics request takes a single source pixel value and combines it
        !           825: with every pixel in a rectangular region of a window.  Typically this is used
        !           826: to fill a region with a color, but by varying the logic function or masks,
        !           827: other effects can be achieved.  A second request takes a tile, effectively
        !           828: constructs a tiled rectangular source with it, and then combines the source
        !           829: with a rectangular region of a window.
        !           830: 
        !           831: An arbitrary image can be displayed directly, without first being stored
        !           832: off-screen.  For monochrome images, the full contents of a bitmap are
        !           833: transmitted, along with a pair of pixel values; the image is displayed in a
        !           834: region of a window with those two colors.  For color images, the full contents
        !           835: of a pixmap can be transmitted and displayed.  In order to avoid inordinate
        !           836: buffer space in the server, very large images must be broken into sections on
        !           837: the client side and displayed in separate requests.
        !           838: 
        !           839: The CopyArea request allows one region of a window to be moved to (or combined
        !           840: with) another region of the same window.  This is the usual @i(bitblt), or "bit
        !           841: block transfer" operation.  The source and destination are given as rectangular
        !           842: regions of the window; the two regions have the same dimensions.  The operation
        !           843: is such that overlap of the source and destination does not affect the result.
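One common way to make a copy insensitive to overlap is to choose the traversal order from the direction of the move. The Python sketch below illustrates that technique; it is not a description of how any particular X server implements CopyArea, and the function name and list-of-rows window representation are assumptions.

```python
def copy_area(window, src_x, src_y, width, height, dst_x, dst_y):
    """Copy a rectangle within `window` (a list of rows) in place."""
    # Walk rows bottom-up when moving down, and columns right-to-left when
    # moving right, so no source pixel is overwritten before it is read.
    rows = range(height - 1, -1, -1) if dst_y > src_y else range(height)
    cols = range(width - 1, -1, -1) if dst_x > src_x else range(width)
    for r in rows:
        for c in cols:
            window[dst_y + r][dst_x + c] = window[src_y + r][src_x + c]

line = [[1, 2, 3, 4, 5]]
copy_area(line, 0, 0, 4, 1, 1, 0)   # shift four pixels one place right
```

A naive left-to-right copy of the same request would smear the first pixel across the whole row.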
        !           844: 
        !           845: X provides a complex primitive for line drawing.  It provides for arbitrary
        !           846: combinations of straight and curved segments, defining both open and closed
        !           847: shapes.  Lines can be @i(solid), by drawing with a single source pixel value,
        !           848: @i(dashed), by alternately drawing with a single source pixel value and not
        !           849: drawing, and @i(patterned), by alternately drawing with two source pixel
        !           850: values.  Lines are drawn with a rectangular brush.  Clients can query the
        !           851: server to determine what brush shapes are supported; a better semantics would
        !           852: be to support arbitrary shapes, but allow applications to query as to which
        !           853: shapes are most efficient.
        !           854: 
        !           855: A final request allows an arbitrary closed shape (such as could be specified in
        !           856: the line drawing request) to be filled with either a single source pixel value
        !           857: or a tile.  For self-intersecting shapes, the even-odd rule is used: a point is
        !           858: inside the shape if an infinite ray with the point as origin crosses the path
        !           859: an odd number of times.
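The even-odd rule can be sketched in Python for shapes made of straight segments only (curved segments are omitted). The function name, the choice of a horizontal ray toward increasing x, and the example path are assumptions of the sketch.

```python
def inside_even_odd(point, vertices):
    """Even-odd test for a closed path of straight segments."""
    px, py = point
    crossings = 0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > py) != (y2 > py):
            # x coordinate at which the edge crosses that line
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if x_cross > px:
                crossings += 1
    return crossings % 2 == 1

# One closed path that traces an outer square and then an inner square.
path = [(0, 0), (6, 0), (6, 6), (0, 6), (0, 0),
        (2, 2), (4, 2), (4, 4), (2, 4), (2, 2)]
```

For this path a ray from a point in the ring between the two squares crosses the path an odd number of times, while a ray from the center crosses it an even number of times, so the inner square is left unfilled.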
        !           860: 
        !           861: @subsection(Text)
        !           862: 
        !           863: For high-performance text, X provides direct support for bitmap fonts.  A font
        !           864: consists of up to 256 bitmaps; each bitmap in a font has the same height but
        !           865: can vary in width.  To allow server-specific font representations, clients
        !           866: "create" fonts by specifying a name rather than by downloading bitmap images
        !           867: into the server.  An application can use an arbitrary number of fonts, but (as
        !           868: with all resources) font allocation can fail for lack of memory.  A reasonably
        !           869: implemented server should support an essentially unbounded number of fonts
        !           870: (e.g., by caching), but some existing server implementations are deficient in
        !           871: this respect.  Unlike Andrew@cite(wm), no heuristics are applied by the server
        !           872: when resolving a name to a font; specific communities or applications may
        !           873: demand a variety of heuristics, and as such they belong outside the base window
        !           874: system.  Also unlike Andrew, the X server is not free to dynamically substitute
        !           875: one font for another; we do not believe such behavior is necessary or
        !           876: appropriate.
        !           877: 
        !           878: A string of text can be displayed using a font either as a mask or as a source.
        !           879: Using a font as a mask, the foreground (the one bits in the bitmap) of each
        !           880: character is drawn with a single source pixel value.  Using a font as a source,
        !           881: the entire image of each character is drawn, using a pair of pixel values.
        !           882: Source font output is provided specifically for applications using fixed-width
        !           883: fonts in emulating traditional terminals.
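The difference between the two modes can be shown on a single row of a single character. In this Python sketch the function name draw_glyph_row and the use of None to select mask mode are assumptions made for illustration.

```python
def draw_glyph_row(pixels, glyph_bits, foreground, background=None):
    """Draw one row of a character bitmap onto one row of pixels.

    With background=None the font acts as a mask: only the one bits are
    drawn.  With a background value the font acts as a source: every
    pixel under the character is drawn with one of the two values.
    """
    out = list(pixels)
    for i, bit in enumerate(glyph_bits):
        if bit:
            out[i] = foreground
        elif background is not None:
            out[i] = background
    return out

existing = [9, 9, 9, 9]
as_mask = draw_glyph_row(existing, [1, 0, 1, 0], foreground=5)
as_source = draw_glyph_row(existing, [1, 0, 1, 0], foreground=5, background=0)
```

Source output overwrites whatever was previously in the character cell, which is what a terminal emulator wants when it rewrites a character position.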
        !           884: 
        !           885: To support "cut and paste" operations between applications, the server provides
        !           886: a number of buffers into which a client can read and write an arbitrary string
        !           887: of bytes.  (This mechanism was adopted from Andrew.)  Although these buffers
        !           888: are used principally for text strings, the server imposes no interpretation on
        !           889: the data, so cooperating applications can use the buffers to exchange such
        !           890: things as resource identifiers and images.
        !           891: 
        !           892: @section(Exposures)
        !           893: 
        !           894: Given that output to obscured windows is possible, the issue of @i(exposure)
        !           895: must be addressed.  When all (or a piece) of an obscured window again becomes
        !           896: visible (for example, as the result of the window being raised), is the client
        !           897: or the server responsible for restoring the contents of the window?  In X, it
        !           898: is the responsibility of the client.  When a region of a window becomes
        !           899: exposed, the server sends an asynchronous event to the client, specifying the
        !           900: window and the region that has been exposed; the rest is up to the application.
        !           901: A trivial application might simply redraw the entire window; a more
        !           902: sophisticated application would only redraw the exposed region.
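As an illustration of the more sophisticated case, the Python sketch below computes which text lines a hypothetical editor would redraw for an exposed region. The function name, the parameters, and the assumption of a uniform line height are ours; the exposure event is reduced to a window-relative y coordinate and a height in pixels.

```python
def lines_to_redraw(exposed_y, exposed_height, line_height, line_count):
    """Indices of the text lines that intersect the exposed region."""
    first = exposed_y // line_height
    last = (exposed_y + exposed_height - 1) // line_height
    first = max(first, 0)
    last = min(last, line_count - 1)
    return list(range(first, last + 1))

# A 30-pixel-high strip starting at y = 25, with 10-pixel lines.
damaged = lines_to_redraw(25, 30, 10, 100)
```

A trivial client would ignore the region and redraw all line_count lines instead.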
        !           903: 
        !           904: Why is the client responsible?  Because X imposes no structure on, or
        !           905: relationships between, graphics operations from a client, there are only two
        !           906: basic mechanisms by which the server might restore window contents:  by
        !           907: maintaining display lists, and by maintaining off-screen images.  In the first
        !           908: approach, the server essentially retains a list of all output requests
        !           909: performed on the window.  When a region of the window becomes exposed, the
        !           910: server either re-executes all requests to the entire window, or only
        !           911: re-executes requests that affect the region while clipping the output to that
        !           912: region.  In the alternative approach, when a window becomes obscured the server
        !           913: saves the obscured region (or perhaps the entire window) in off-screen memory.
        !           914: All subsequent output requests are executed not only to the visible regions of
        !           915: the window, but to the off-screen image as well.  When an obscured region
        !           916: becomes visible again, the off-screen copy is simply restored.
        !           917: 
        !           918: We believe neither server-based approach is acceptable.  With display lists,
        !           919: the server is unlikely to have any reasonable notion of when later output
        !           920: requests nullify earlier ones.  Either the display list becomes unmanageably
        !           921: long, and a refresh that should appear nearly instantaneous instead appears as
        !           922: a slow-motion replay, or the server spends a significant length of time pruning
        !           923: the display list, and normal-case performance is considerably reduced.  One
        !           924: problem with the off-screen image approach is (virtual) memory consumption:  on
        !           925: a 1024 by 1024 8-plane display, just one full-screen image requires one
        !           926: megabyte of storage, and multiple overlapping windows could easily require many
        !           927: times that amount.  Another problem is that the cost of the implementation can
        !           928: be prohibitive.  Consider, for example, the QDSS display@cite(qdss), which has
        !           929: a graphics co-processor.  In the QDSS, display memory is inaccessible to the
        !           930: host processor.  In addition, the co-processor cannot perform operations in
        !           931: host memory, and has relatively little off-screen memory of its own.  The only
        !           932: viable way to maintain off-screen images for displays like the QDSS may be to
        !           933: emulate the co-processor in software.  It can easily take tens of thousands of
        !           934: lines of code to emulate a co-processor, and such emulation may execute orders
        !           935: of magnitude slower than the co-processor.
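The storage figure quoted above for a full-screen image follows from simple arithmetic, sketched here in Python. The function name image_bytes is ours, and the sketch assumes planes are packed with no padding.

```python
def image_bytes(width, height, planes):
    """Bytes needed to hold a width x height image with `planes` bits per pixel."""
    return width * height * planes // 8

full_screen = image_bytes(1024, 1024, 8)
megabytes = full_screen / (1024 * 1024)
```

Each additional saved window adds its own area times the depth, so a handful of large overlapping windows multiplies this cost accordingly.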
        !           936: 
        !           937: Our belief is that many applications can take advantage of their own
        !           938: information structures to facilitate rapid redisplay, without the expense of
        !           939: maintaining a distinct display structure or backing-store in the client or the
        !           940: server, and often with even better performance.  (Sapphire@cite(sapphire)
        !           941: permits client refresh for this reason.)  For example, a text editor can
        !           942: redisplay directly from the source, and a VLSI editor can redisplay directly
        !           943: from the layout and component definitions.  Many applications will be built on
        !           944: top of high-level graphics libraries that automatically maintain the data
        !           945: structures necessary to implement rapid redisplay.  For example, the structured
        !           946: display file mechanism in VGTS could be supported in a client library.  Of
        !           947: course, pushing the responsibility back on the application may not simplify
        !           948: matters, particularly when retrofitting old systems to a new environment.  For
        !           949: example, the current GKS design does not provide adequate hooks for automatic,
        !           950: system-generated refresh of application windows, nor does it provide an
        !           951: adequate mechanism for forcing refresh back on the application.
        !           952: 
        !           953: Relying on client-controlled refresh also derives from window management
        !           954: philosophy.  Our belief is that applications cannot be written with fixed
        !           955: top-level window sizes built in.  Rather, they must function correctly with
        !           956: almost any size, and continue to function correctly as windows are dynamically
        !           957: resized.  This is necessary if applications are to be usable on a variety of
        !           958: displays under a variety of window management policies.  (Of course, an
        !           959: application may need a minimum size to function reasonably, and may prefer the
        !           960: width or height to be a multiple of some number; X allows the client to attach
        !           961: a resize hint to each window to inform window managers of this.)  Our belief is
        !           962: that most applications, for one reason or another, will already have code for
        !           963: performing a complete redisplay of the window, and that it is usually
        !           964: straightforward to modify this code to deal with partial exposures.  Similar
        !           965: arguments were used in the design of both Andrew and Mex, and experience has
        !           966: confirmed their decision@cite(wm,mex).
        !           967: 
        !           968: This is not to argue that the server should never maintain window contents,
        !           969: only that it should not be @i(required) to maintain contents.  For complex
        !           970: imaging and graphics applications, efficient maintenance by the server may be
        !           971: critical for acceptable performance of window management functions.  There is
        !           972: nothing inherent in the X protocol that precludes the server from maintaining
        !           973: window contents and not generating exposure events.  In the next version of X,
        !           974: windows will have several attributes to advise the server as to when and how
        !           975: contents should be maintained.
        !           976: 
        !           977: In X, clients are never informed of what regions are obscured, only of what
        !           978: regions have become visible.  Thus, clients have insufficient information to
        !           979: try to optimize output by only drawing to visible regions.  However, we feel
        !           980: this is justified on two grounds.  First, realistically, users seldom stack
        !           981: windows such that the active ones are obscured, so there is little point in
        !           982: complicating applications to optimize this case.  More importantly, allowing
        !           983: applications to restrict output to only visible regions would conflict with the
        !           984: desire to have the server maintain obscured regions automatically when
        !           985: possible.
        !           986: 
        !           987: An interesting complication with the CopyArea request (described in Section 6)
        !           988: arises, having decided on client refresh.  If part of the source region of the
        !           989: CopyArea is obscured, then not all of the destination region can be updated
        !           990: properly, and the client must be notified (with an exposure event) so that it
        !           991: can correct the problem.  Since output requests are asynchronous, care must be
        !           992: taken by the application to handle exposure events when using CopyArea.  In
        !           993: particular, if a region is exposed and an event sent by the server, a
        !           994: subsequent CopyArea may move all or part of the region before the event is
        !           995: actually received by the application.  Several simple algorithms have been
        !           996: designed to deal with this situation, but we will not present them here.
        !           997: 
        !           998: Client refresh raises a visual problem in a network environment.  When a region
        !           999: of a window becomes exposed, what contents should the server initially place in
        !          1000: that window?  In a local, tightly-coupled environment, it might be perfectly
        !          1001: reasonable to leave the contents unaltered, because the client can almost
        !          1002: instantaneously begin to refresh the region.  In a network environment, however
        !          1003: (and even in a local system where processes can get "swapped out" and take
        !          1004: considerable time to swap back in), inevitable delays can lead to visually
        !          1005: confusing results.  For example, the user may move a window, and see two images
        !          1006: of the window on the screen for a significant length of time, or resize a
        !          1007: window and see no immediate change in the appearance of the screen.
        !          1008: 
        !          1009: To avoid such anomalies in X, clients must define a @i(background) for every
        !          1010: window.  The background can be a single color, or it can be a tiling pattern.
        !          1011: Whenever a region of a window is exposed, the server immediately paints the
        !          1012: region with the background.  Users therefore see window shapes immediately,
        !          1013: even if the "contents" are slow to arrive.  Of course, many application windows
        !          1014: have some notion of a background anyway, so having the server initialize with a
        !          1015: background seldom results in extraneous redisplay.  In fact, many non-leaf
        !          1016: windows typically contain nothing but a background, and having the server paint
        !          1017: that background frees the applications from performing any redisplay at all to
        !          1018: those windows.
        !          1019: 
        !          1020: Although we believe client-generated refresh is acceptable most of the time, it
        !          1021: does not always perform well with momentary pop-up menus, where speed is at a
        !          1022: premium.  To avoid potentially expensive refresh when a menu is removed from
        !          1023: the screen, a client can explicitly copy the region to be covered by the menu
        !          1024: into off-screen memory (within the server) before mapping the menu window.  A
        !          1025: special unmap request is used to remove the menu:  it unmaps the window without
        !          1026: affecting the contents of the screen or generating exposure events.  The
        !          1027: original contents are then copied back onto the screen.  In addition, the
        !          1028: client usually @i(grabs) the server for the entire sequence, using a request
        !          1029: which freezes all other clients until a corresponding ungrab request is issued
        !          1030: (or the grabbing client terminates).  Without this, concurrent output from
        !          1031: other clients to regions obscured by the menu would be lost.  Although freezing
        !          1032: other clients is in general a poor idea, it seems acceptable for momentary
        !          1033: menus.
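The order of operations for a momentary menu can be sketched as follows. FakeServer is a hypothetical in-memory stand-in written for this example; it is not a real X binding, and its method names do not correspond to actual protocol requests. Painting a solid rectangle stands in for mapping the menu window.

```python
class FakeServer:
    """Hypothetical stand-in for a server connection (illustration only)."""

    def __init__(self, screen):
        self.screen = screen      # list of rows of pixel values
        self.grabbed = False

    def grab(self):
        self.grabbed = True       # other clients would be frozen here

    def ungrab(self):
        self.grabbed = False

    def save(self, x, y, w, h):
        return [row[x:x + w] for row in self.screen[y:y + h]]

    def restore(self, x, y, saved):
        for dy, row in enumerate(saved):
            self.screen[y + dy][x:x + len(row)] = row

    def paint(self, x, y, w, h, pixel):
        for row in self.screen[y:y + h]:
            row[x:x + w] = [pixel] * w


def pop_up_menu(server, x, y, w, h, choose):
    server.grab()                          # freeze other clients
    try:
        saved = server.save(x, y, w, h)    # copy covered region off-screen
        server.paint(x, y, w, h, 7)        # stands in for mapping the menu
        choice = choose()                  # user interaction
        server.restore(x, y, saved)        # copy original contents back
    finally:
        server.ungrab()                    # always release the grab
    return choice


server = FakeServer([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
picked = pop_up_menu(server, 1, 0, 2, 2, choose=lambda: "open")
```

Holding the grab across the whole sequence is what guarantees that the saved copy is still an accurate image of the screen when it is restored.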
        !          1034: 
        !          1035: @section(Input)
        !          1036: 
        !          1037: We now turn to a discussion of input events, but first we briefly describe the
        !          1038: support for mouse cursors.  Clients can define arbitrary shapes for use as
        !          1039: mouse cursors.  A cursor is defined by a source bitmap, a pair of pixel values
        !          1040: with which to display the bitmap, a mask bitmap which defines the precise shape
        !          1041: of the image, and a coordinate within the source bitmap which defines the
        !          1042: "center" or "hot spot" of the cursor.  Cursors of arbitrary size can be
        !          1043: constructed, although only a portion of the cursor may be displayed on some
        !          1044: hardware.  Clients can query the server to determine what cursor sizes are
        !          1045: supported, but existing applications typically just assume a 16 by 16 image can
        !          1046: always be displayed.  Cursors can also be constructed from character images in
        !          1047: fonts; this provides a simple form of named indirection, allowing custom
        !          1048: tailoring to each display without having to modify the applications.
        !          1049: 
        !          1050: A window is said to @i(contain) the mouse if the hot spot of the cursor is
        !          1051: within a visible portion of the window or one of its subwindows.  The mouse is
        !          1052: said to be @i(in) a window if the window contains the mouse but no subwindow
        !          1053: contains the mouse.  Every window can have a mouse cursor defined for it.  The
        !          1054: server automatically displays the cursor of whatever window the mouse is
        !          1055: currently in; if the window has no cursor defined, the server displays the
        !          1056: cursor of the closest ancestor with a cursor defined.
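The cursor lookup amounts to a walk up the window hierarchy, sketched here in Python. The Window class and the function name displayed_cursor are assumptions made for illustration.

```python
class Window:
    """Minimal window record for the sketch: a parent link and a cursor."""

    def __init__(self, parent=None, cursor=None):
        self.parent = parent
        self.cursor = cursor


def displayed_cursor(window):
    """Cursor shown while the mouse is in `window`."""
    while window is not None:
        if window.cursor is not None:
            return window.cursor
        window = window.parent    # fall back to the closest ancestor
    return None


root = Window(cursor="arrow")
frame = Window(parent=root)
text = Window(parent=frame, cursor="i-beam")
button = Window(parent=frame)
```

Here the mouse shows the i-beam while in the text window, and the root's arrow while in the button or the frame, neither of which defines a cursor.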
        !          1057: 
        !          1058: Input is associated with windows.  Input to a given window is controlled by a
        !          1059: single client, which need not be the client that created the window.  Events
        !          1060: are classified into various types, and the controlling client selects which
        !          1061: types are of interest to it.  Only events matching in type with this selection
        !          1062: are sent to the client.  When an input event is generated for a window and the
        !          1063: controlling client has not selected that type, the server @i(propagates) the
        !          1064: event to the closest ancestor window for which some client has selected the
        !          1065: type, and sends the event to that client instead.  Every event includes the
        !          1066: window that had the event type selected; this window is called the @i(event
        !          1067: window).  If the event has been propagated, the event also includes the next
        !          1068: window down in the hierarchy between the event window and the original window
        !          1069: on which the event was generated.
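Propagation can be sketched as another walk toward the root. In this Python sketch the Window class, its fields, and the function name deliver are hypothetical; each window records its single controlling client and the set of event types that client has selected.

```python
class Window:
    def __init__(self, name, parent=None, client=None, types=()):
        self.name = name
        self.parent = parent
        self.client = client          # the single controlling client
        self.types = set(types)       # event types that client selected


def deliver(window, event_type):
    """Return (client, event_window, child) or None if nobody selected it.

    `child` is the next window down from the event window toward the
    window on which the event was generated (None if not propagated).
    """
    child = None
    w = window
    while w is not None:
        if w.client is not None and event_type in w.types:
            return w.client, w, child
        child = w
        w = w.parent
    return None


root = Window("root", client="manager", types={"button"})
pane = Window("pane", parent=root, client="editor", types={"key"})
leaf = Window("leaf", parent=pane)
```

A key event generated on the leaf goes to the editor with the pane as event window, while a button event generated there propagates two levels to the manager on the root.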
        !          1070: 
        !          1071: @subsection(The Keyboard)
        !          1072: 
        !          1073: For the keyboard, a client can selectively receive events on the press or
        !          1074: release of a key.  Keyboard events are not reported in terms of ASCII character
        !          1075: codes; instead, each key is assigned a unique code, and client software must
        !          1076: translate these codes into the appropriate characters.  The mapping from
        !          1077: keycaps to keycodes is intended to be "universal" and predefined; a given
        !          1078: keycap has the same keycode on all keyboards.  Applications generally have been
        !          1079: written to read a "keymap file" from the user's home directory, so that users
        !          1080: can remap the keyboard as they see fit.
        !          1081: 
        !          1082: The use of coded keys is secondary to the ability to detect both up and down
        !          1083: transitions on the keyboard.  For example, a common trick in window systems is
        !          1084: for mouse button operations to be affected by keyboard @i(modifiers) such as
        !          1085: the Shift, Control, and Meta keys.  A useful feature of the Genera@cite(genera)
        !          1086: system is the use of a "mouse documentation line", which changes dynamically as
        !          1087: modifiers are pressed and released, indicating the function of the mouse
        !          1088: buttons.  A base window system must provide this capability.  Transitions are
        !          1089: useful not only on modifiers; various applications for systems other than X
        !          1090: have been designed to use "chords" (groups of keys pressed simultaneously), and
        !          1091: again the window system should support them.
        !          1092: 
        !          1093: The keyboard is always @i(attached) to some window (typically the root window
        !          1094: or a top-level window); we call this window the @i(focus) window.  A request
        !          1095: can be used (usually by the input manager) to attach the keyboard to any
        !          1096: window.  The window that receives keyboard input depends on both the mouse
        !          1097: position and the focus window.  If the mouse is in some descendant of the focus
        !          1098: window, that descendant receives the input.  If the mouse is not in a
        !          1099: descendant of the focus window, then the focus window receives the input, even
        !          1100: if the mouse is outside the focus window.  For applications that wish to have
        !          1101: the mouse state modify the effect of keyboard input, a keyboard event contains
        !          1102: the mouse coordinates, both relative to the event window and global to the
        !          1103: screen, as well as the state of the mouse buttons.
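The rule for choosing the window that receives keyboard input can be written as a short Python sketch. The Window class and the function name keyboard_target are assumptions; mouse_window denotes the window the mouse is currently in.

```python
class Window:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent


def keyboard_target(mouse_window, focus_window):
    """Window that receives keyboard input under the focus rule."""
    w = mouse_window
    while w is not None:
        if w is focus_window:
            # The mouse is in the focus window or one of its descendants.
            return mouse_window
        w = w.parent
    # The mouse is elsewhere: the focus window still gets the input.
    return focus_window


root = Window("root")
editor = Window("editor", parent=root)
pane = Window("pane", parent=editor)
clock = Window("clock", parent=root)
```

With the keyboard attached to the editor, input goes to the pane while the mouse is in the pane, and to the editor itself while the mouse is in the clock.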
        !          1104: 
        !          1105: To provide a reasonable user interface, keyboard events also contain the state
        !          1106: of the most common modifier keys:  Shift, ShiftLock, Control, and Meta.
        !          1107: Without this information, anomalous behavior can result.  If the user switches
        !          1108: windows while modifier keys are down, the new client must somehow determine
        !          1109: which modifiers are down.  Placing the modifier state in the keyboard events
        !          1110: solves such problems, and also has another benefit:  most clients do not have
        !          1111: to maintain their own shadow of the modifier state, and so often can completely
        !          1112: ignore key release events.  However, there is a conflict between this
        !          1113: server-maintained state and client-maintained keyboard mappings.  In
        !          1114: particular, clients cannot use non-standard keys as modifiers, or use chords
        !          1115: without the possibility of anomalies such as those described above.  We believe the
        !          1116: correct solution (not yet supported in X) is for the server to maintain a bit
        !          1117: mask reflecting the full state of the keyboard, and to allow clients to read
        !          1118: this mask.  An application using chords or non-standard modifiers would request
        !          1119: the server to send this mask automatically whenever the mouse entered the
        !          1120: application's window.
        !          1121: 
        !          1122: @subsection(The Mouse)
        !          1123: 
        !          1124: The X protocol is (somewhat arbitrarily) designed for mice with up to three
        !          1125: buttons.  An application can selectively receive events on the press or release
        !          1126: of each button.  Each event contains the current mouse coordinates (both local
        !          1127: to the window and global to the screen), the current state of all buttons and
        !          1128: modifier keys, and a timestamp which can be used, for example, to decide when a
        !          1129: succession of clicks constitutes a double or triple click.  An application can
        !          1130: also choose to receive mouse motion events, either whenever the mouse is in the
        !          1131: window, or only when particular buttons have also been pressed.  The
        !          1132: application cannot control the granularity of the reporting, nor is any minimum
        !          1133: granularity guaranteed.  In fact, typical server implementations make an effort
        !          1134: to compact motion events, to minimize system overhead and wired memory in
        !          1135: device drivers.  As such, X may not serve adequately for fine-grained tracking,
        !          1136: such as in fast-moving free-hand drawing applications.
        !          1137: 
        !          1138: Even with motion compaction, servers can generate considerable numbers of
        !          1139: motion events.  If an application attempts to respond in real time to every
        !          1140: event, it can easily get far behind relative to the actual position of the
        !          1141: mouse.  Instead, many applications simply treat motion events as hints.  When a
        !          1142: motion event is received, the event is simply discarded, and the client then
        !          1143: explicitly queries the server for the current mouse position.  In waiting for
        !          1144: the reply, more motion events may be received; these are also discarded.  The
        !          1145: client then reacts based on the queried mouse position.  The advantage of this
        !          1146: scheme over continuously polling the mouse position is that no CPU time is
        !          1147: consumed while the mouse is stationary.
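
The hint technique described above might be sketched as follows. FakeConnection is a hypothetical stand-in for a client's connection to the server, and all of its method names are assumptions made for this illustration; a real client would use the request provided by its programming interface to query the mouse position.

```python
from collections import deque


class FakeConnection:
    """Hypothetical stand-in for a client's connection to the server."""

    def __init__(self, motion_positions):
        self.events = deque(("motion", p) for p in motion_positions)
        self.pointer = motion_positions[-1]  # where the mouse really is now
        self.queries = 0

    def pending(self):
        return len(self.events) > 0

    def next_event(self):
        return self.events.popleft()

    def peek_kind(self):
        return self.events[0][0]

    def query_pointer(self):
        self.queries += 1
        return self.pointer


def track_pointer(conn, react):
    """Treat each motion event as a hint and react to the queried position."""
    while conn.pending():
        kind, _stale_position = conn.next_event()
        if kind != "motion":
            continue  # other event kinds are outside this sketch
        # Discard the reported position and ask for the current one.
        position = conn.query_pointer()
        # Motion events that arrived while waiting are discarded as well.
        while conn.pending() and conn.peek_kind() == "motion":
            conn.next_event()
        react(position)
```

In this sketch a burst of queued motion events results in a single query and a single reaction, and nothing runs at all while the event queue is empty.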
        !          1148: 
        !          1149: Clients can also receive an event each time the mouse enters or leaves a
        !          1150: window.  This can be particularly useful in implementing menus.  For example,
        !          1151: each menu item can be placed in a separate subwindow of the overall menu
        !          1152: window.  When the mouse enters a subwindow, the item is highlighted in some
        !          1153: fashion (e.g., by inverting the video sense), and when the mouse leaves the
        !          1154: window the item is restored to normal.  Implementing a menu in this manner
        !          1155: requires considerably less CPU overhead than continuous polling of the mouse,
        !          1156: and also less overhead than using motion events, since most motion events would
        !          1157: be within windows and thus uninteresting.
        !          1158: 
        !          1159: Due to the nature of overlapping windows, and because continuous tracking by
        !          1160: the server is not guaranteed, the mouse may appear to move instantaneously
        !          1161: between any pair of windows on the screen.  Certainly the window the mouse was
        !          1162: in should be notified of the mouse leaving, and the window the mouse is now in
        !          1163: should be notified of the mouse entering.  However, all of the windows "in
        !          1164: between" in the hierarchy may also be interested in the transition.  This is
        !          1165: useful in simplifying the structure of some applications, and is necessary in
        !          1166: implementing certain kinds of window managers and input managers.  Thus, when
        !          1167: the mouse moves from window A to window B, with window W as their closest
        !          1168: (least) common ancestor, all ancestors of A below W also receive leave events,
        !          1169: and all ancestors of B below W receive enter events.
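
The set of windows notified in such a transition can be computed as in the following sketch. The parent table and the function names are hypothetical. The sketch assumes that neither A nor B is an ancestor of the other, and the ordering of the generated events (leave events from A upward, enter events from just below W downward to B) is an assumption of the sketch rather than something stated above.

```python
def chain_to_root(parent, window):
    """List `window` and its ancestors, from the window up to the root."""
    chain = []
    while window is not None:
        chain.append(window)
        window = parent[window]
    return chain


def crossing_events(parent, a, b):
    """Return (W, windows receiving leave, windows receiving enter)."""
    up = chain_to_root(parent, a)
    down = chain_to_root(parent, b)
    down_set = set(down)
    # The first ancestor of A that is also an ancestor of B is W.
    common = next(w for w in up if w in down_set)
    leaves = up[:up.index(common)]      # A, then its ancestors below W
    enters = down[:down.index(common)]  # B, then its ancestors below W
    enters.reverse()                    # report from just below W down to B
    return common, leaves, enters


# Hypothetical hierarchy: root > W > (P > A) and (Q > B).
parent = {"root": None, "W": "root", "P": "W", "A": "P", "Q": "W", "B": "Q"}
```

For this hierarchy, crossing_events(parent, "A", "B") identifies W as the common ancestor, with leave events for A and P and enter events for Q and B.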
        !          1170: 
        !          1171: Except for mouse motion events, it might be argued that events are infrequent
        !          1172: enough that the server should always send all events to the client, and
        !          1173: eliminate the complexity of selecting events.  However, some applications are
        !          1174: written with interrupt-driven input; events are received asynchronously, and
        !          1175: cause the current computation to be suspended so that the input can be
        !          1176: processed.  For example, a text editor might use interrupt-driven input, with
        !          1177: the normal computation being redisplay of the window.  The receipt of
        !          1178: extraneous input events (for example, key release events) can cause noticeable
        !          1179: "hiccups" in such redisplay.
        !          1180: 
        !          1181: @section(Input and Window Management)
        !          1182: 
        !          1183: There are two basic modes of keyboard management:  @i(real-estate) and
        !          1184: @i(listener).  In real-estate mode, the keyboard "follows" the mouse; keyboard
        !          1185: input is directed to whatever window the mouse is in.  In listener mode,
        !          1186: keyboard input is directed to a specific window, independent of the mouse
        !          1187: position.  Some systems provide only real-estate mode@cite(apollo,sunwin), some
        !          1188: only listener mode@cite(lucasfilm,sapphire,pnx,mex,mg1,genera), and
        !          1189: Andrew@cite(wm) provides both, although the mode cannot be changed during a
        !          1190: session.  Both modes are supported in X, and the mode can be changed
        !          1191: dynamically.  Real-estate mode is the default behavior, with the root window as
        !          1192: the focus window, as described in the previous section.  An input manager can
        !          1193: also make some other (typically top-level) window the focus window, yielding
        !          1194: listener mode.  Note, however, that in listener mode in X, the client
        !          1195: controlling the focus window can still get real-estate behavior for subwindows,
        !          1196: if desired; this capability has proven useful in several applications.
        !          1197: 
        !          1198: The primary function of a window manager is reconfiguration:  restacking,
        !          1199: resizing, and repositioning top-level windows.  The configuration of nested
        !          1200: windows is assumed to be application-specific, and under control of the
        !          1201: applications.  There are two broad categories of window managers:  manual and
        !          1202: automatic.  A manual window manager is "passive", and simply provides an
        !          1203: interface to allow the user to manipulate the desktop; windows can be resized
        !          1204: and reorganized at will.  The initial size and position of a window typically
        !          1205: (but not always) is under user or application control.  Automatic window
        !          1206: managers are "active", and operate for the most part without human interaction;
        !          1207: size and position at window creation, and reconfiguration at window
        !          1208: destruction, are chosen by the system.  Automatic managers typically tile the
        !          1209: screen with windows, such that no two windows overlap, automatically adjusting
        !          1210: the layout as windows are created and destroyed.  Andrew@cite(wm),
        !          1211: Star@cite(star2), and Cedar@cite(cedar) provide automatic management, plus
        !          1212: limited manual reconfiguration capability.
        !          1213: 
        !          1214: Existing window managers for X are manual.  Automatic management that is
        !          1215: transparent to applications cannot be accomplished reasonably in X; future
        !          1216: support for automatic management is discussed in Section 10.  In the current X
        !          1217: design, clients are responsible for initially sizing and placing their
        !          1218: top-level windows, not window managers.  In this way, applications continue to
        !          1219: work when no window manager is present.  Typically, the user either specifies
        !          1220: geometry information in the application command line, or uses the mouse to
        !          1221: sweep out a rectangle on the screen.  (For the latter, the application grabs
        !          1222: the mouse, as described below.)
        !          1223: 
        !          1224: @subsection(Mouse-Driven Management)
        !          1225: 
        !          1226: Existing managers are primarily mouse-driven, and are based on the ability to
        !          1227: "steal" events.  Specifically, a manager (or any other client) can @i(grab) a
        !          1228: mouse button in combination with a set of modifier keys, with the following
        !          1229: effect.  Whenever the modifier keys are down and the button is pressed, the
        !          1230: event is reported to the grabbing client, regardless of what window the mouse
        !          1231: is in.  All mouse-related events continue to be sent to that client until the
        !          1232: button is released.  As part of the grab, the client also specifies a mouse
        !          1233: cursor to be used for the duration of the grab, and a window to be used as the
        !          1234: event window.  A manager specifies the root window as the event window when
        !          1235: grabbing buttons; with the event propagation semantics described in Section 8,
        !          1236: the grabbed events contain not only the global mouse coordinates, but also the
        !          1237: top-level application window (if any) containing the mouse.  This is sufficient
        !          1238: information to manipulate top-level windows.
        !          1239: 
        !          1240: Using this button-grab mechanism, several different management interfaces have
        !          1241: been built, including a "programmable" interface@cite(uwm) allowing the user to
        !          1242: assign individual commands or user-defined menus of commands to any number of
        !          1243: button/modifier combinations.  For example, a button click (press and release
        !          1244: without intervening motion) might be interpreted as a command to raise or lower
        !          1245: a window, or to attach the keyboard; a press/motion/release sequence might be
        !          1246: interpreted as a command to move a window to a new position; or a button press
        !          1247: might cause a menu to pop up, with the selection indicated by the mouse
        !          1248: position at the release of the button.  By allowing both specific commands and
        !          1249: menus to be bound to buttons, a range of interfaces can be constructed to
        !          1250: satisfy both "expert" and "novice" users.
        !          1251: 
        !          1252: Another form of manager simply displays a static menu bar along the top of the
        !          1253: screen, with items for such operations as moving a window and attaching the
        !          1254: keyboard.  The menu is used in combination with a mouse-grab primitive, with
        !          1255: which a client can unilaterally grab the mouse and then later explicitly
        !          1256: release it; during such a mouse-grab, events are redirected to the grabbing
        !          1257: client, just as for button-grabs.  When the user clicks on a menu bar item with
        !          1258: any button, the manager unilaterally grabs the mouse.  The user then uses the
        !          1259: mouse to execute the specific command.  For example, having clicked on the
        !          1260: "move" item, the user indicates the window to move by placing the mouse in the
        !          1261: window and pressing a button, then indicates the new position by moving the
        !          1262: mouse and releasing the button.  The manager then releases the mouse.
        !          1263: 
        !          1264: @subsection(Icons)
        !          1265: 
        !          1266: One important "resizing" operation performed by a window manager is
        !          1267: transforming a window into a small icon and back again.  In X, icons are merely
        !          1268: windows.  Transforming a window into an icon simply involves unmapping the
        !          1269: window and mapping its associated icon.  The association between a window and
        !          1270: its icon is maintained in the server, rather than the window manager, and
        !          1271: either the application or the manager can provide the icon.  In this way, the
        !          1272: manager can provide a default icon form for most clients, but clients can
        !          1273: provide their own if desired, possibly with dynamic rather than static
        !          1274: contents.  The client is still insulated from management policy, even if it
        !          1275: provides the icon:  the manager is responsible for positioning, mapping, and
        !          1276: unmapping the icon, and the client is responsible only for displaying the
        !          1277: contents.
        !          1278: 
        !          1279: The icon state is maintained in the server not only to allow clients to provide
        !          1280: icons, but to avoid the loss of state if the window manager should terminate
        !          1281: abnormally.  When a window manager terminates, any windows it has created are
        !          1282: destroyed, including icon windows.  With knowledge of icons, the server can
        !          1283: detect when an icon is destroyed, and automatically remap the associated client
        !          1284: window.  Without this, abnormal termination of the window manager would result
        !          1285: in "lost" windows.
        !          1286: 
        !          1287: @subsection(Race Conditions)
        !          1288: 
        !          1289: There are many race conditions that must be dealt with in input and window
        !          1290: management, due to the asynchronous nature of event handling.  For example, if
        !          1291: a manager attempts to grab the mouse in response to a press of a button, the
        !          1292: mouse-grab request might not reach the server until after the button is
        !          1293: released, and intervening mouse events would be missed.  Or, if the user clicks
        !          1294: on a window to attach the keyboard there, and then immediately begins typing,
        !          1295: the first few keystrokes might occur before the manager actually responds to
        !          1296: the click and the server actually moves the keyboard focus.  A final example is
        !          1297: a simple interface in which clicking on a window lowers it.  Given a stack of
        !          1298: three windows, the user might rapidly click twice in the same spot, expecting
        !          1299: the top two windows to be lowered.  Unless the first click is sent to the
        !          1300: manager and the resulting request to lower is processed by the server before
        !          1301: the second click takes place, the event window for the second click will be the
        !          1302: same as for the first click, and the manager will lower the first window twice.
        !          1303: 
        !          1304: A work-around for the last example, used by existing managers, is to ignore the
        !          1305: event window reported in most events.  Instead, the global mouse coordinates
        !          1306: reported in the event are used in a follow-up query request to determine which
        !          1307: top-level window now contains that coordinate.  However, not all race
        !          1308: conditions have acceptable solutions within the current X design.  For a
        !          1309: general solution, it must be possible for the manager to synchronize operations
        !          1310: explicitly with event processing in the server.  For example, a manager might
        !          1311: specify that, at the press of a button, event processing in the server should
        !          1312: cease until an explicit acknowledgment is received from the manager.
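
The double-click example and its work-around can be made concrete with a toy model. Everything in this sketch is hypothetical: the stack is a plain list with the top window first, all three windows are assumed to cover the click position, and top_window_at stands in for the follow-up query request.

```python
def lower(stack, window):
    """Move `window` to the bottom; index 0 is the top of the stack."""
    stack.remove(window)
    stack.append(window)


def top_window_at(stack, position):
    # Assumption: every window in this toy stack covers `position`,
    # so the topmost window is simply the first entry.
    return stack[0]


def lower_by_event_window(stack, clicks):
    """Naive manager: trust the event window reported in each click."""
    for event_window, _position in clicks:
        lower(stack, event_window)
    return stack


def lower_by_query(stack, clicks):
    """Work-around: ignore the event window and query by coordinates."""
    for _event_window, position in clicks:
        lower(stack, top_window_at(stack, position))
    return stack


# Two rapid clicks; both were generated while A was still on top.
clicks = [("A", (10, 10)), ("A", (10, 10))]
```

Starting from the stack A, B, C, the naive manager lowers A twice and leaves B on top, while the querying manager lowers A and then B, leaving C on top as the user expected.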
        !          1313: 
        !          1314: @section(Future)
        !          1315: 
        !          1316: Based on critiques from numerous universities and commercial firms, a fairly
        !          1317: extensive evaluation and redesign of the X protocol has been underway since May
        !          1318: 1986.  Our desire is to define a "core" protocol that can serve as a standard
        !          1319: for window system construction over the next several years.  We expect to
        !          1320: present the rationale for this new design in the very near future, once it has
        !          1321: been validated by at least a preliminary implementation.  In this section, we
        !          1322: highlight the major protocol changes.
        !          1323: 
        !          1324: @subsection(Resource Allocation)
        !          1325: 
        !          1326: Since the server is responsible for assigning identifiers to resources, each
        !          1327: resource allocation currently requires a round-trip time to perform.  For
        !          1328: applications that allocate many resources, this causes a considerable start-up
        !          1329: delay.  For example, a multi-pane menu might consist of dozens of windows,
        !          1330: numerous fonts, and several different mouse cursors, leading to a delay of one
        !          1331: second or longer.
        !          1332: 
        !          1333: In retrospect, this is the most significant defect in the design of X.  To get
        !          1334: around these delays, programming interfaces have been augmented to provide
        !          1335: "batch mode" operations.  If several resources must be created, but there are
        !          1336: no inter-dependencies among the allocation requests, all of the requests are
        !          1337: sent in a batch, and then all of the replies are received.  This effectively
        !          1338: reduces the delay to a single round-trip time.
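
The effect of batching can be estimated with a rough model that counts only round-trip latency and ignores transmission and server processing time. The function names and the figures used below are illustrative assumptions, not measurements.

```python
def sequential_delay_ms(n_requests, round_trip_ms):
    """Each allocation waits for its reply before the next is sent."""
    return n_requests * round_trip_ms


def batched_delay_ms(n_requests, round_trip_ms):
    """All independent requests are sent first, then all replies are read."""
    return round_trip_ms if n_requests > 0 else 0
```

Under this model, 50 independent allocations at an assumed 20 ms per round trip cost about 1000 ms when issued one at a time, and about 20 ms when batched.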
        !          1339: 
        !          1340: A better solution to this problem is to make clients generate the identifiers.
        !          1341: When the client establishes a connection to the server, it is given a specific
        !          1342: subrange from which it can allocate.  This change will significantly improve
        !          1343: start-up times without affecting applications, as identifiers can be generated
        !          1344: inside low-level libraries without changing programming interfaces.
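
A low-level library might generate identifiers along the following lines. This is a minimal sketch: the IdAllocator class, the representation of the subrange as a base and a count, and the example values are all assumptions made for illustration, not the format used by the new protocol.

```python
class IdAllocator:
    """Hands out identifiers from a subrange assigned at connection time."""

    def __init__(self, base, count):
        self.base = base        # first identifier in the client's subrange
        self.count = count      # number of identifiers in the subrange
        self.next_offset = 0

    def allocate(self):
        # No round trip is needed: the identifier is chosen locally.
        if self.next_offset >= self.count:
            raise RuntimeError("identifier subrange exhausted")
        resource_id = self.base + self.next_offset
        self.next_offset += 1
        return resource_id
```

Because allocation happens entirely inside the library, the client can send creation requests naming the new identifiers without waiting for a reply to each one.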
        !          1345: 
        !          1346: @subsection(Transparent Windows)
        !          1347: 
        !          1348: One use of transparent windows is as clipping regions.  However, they are
        !          1349: unsatisfactory for this purpose because every coordinate in a graphics request
        !          1350: must be translated by the client from the "real" window's origin to the
        !          1351: transparent window's origin.  A better approach to clipping regions is to allow
        !          1352: clients to create clipping regions and attach them to all graphics requests.
        !          1353: As noted in Section 6, X currently allows a clipping region in the form of a
        !          1354: bitmap to be attached to a few graphics requests.  Allowing a clipping region,
        !          1355: specified either as a bitmap or a list of rectangles, to be attached to all
        !          1356: graphics requests provides a more uniform mechanism.
        !          1357: 
        !          1358: The major use of transparent windows to date is actually as inexpensive opaque
        !          1359: windows.  In the current server implementation, transparent windows can be
        !          1360: created and transformed significantly faster than opaque windows.  Because of
        !          1361: this, transparent windows are often used when opaque windows would otherwise be
        !          1362: adequate.  We believe a new implementation of the server will improve the
        !          1363: performance of opaque windows to the point that this will no longer be
        !          1364: necessary.
        !          1365: 
        !          1366: With explicit clipping regions added for graphics, and the performance
        !          1367: advantages of transparent windows reduced, the only remaining use of
        !          1368: transparent windows is for input (and cursor) control.  Various applications
        !          1369: want relatively fine-grained input control, and such control must not affect
        !          1370: graphics output.  Close control of cursor images and mouse motion events seems
        !          1371: particularly important.  However, the vast majority of the time, control
        !          1372: naturally is associated with normal window boundaries, so it would be unwise to
        !          1373: divorce input control completely from windows.  As such, the new protocol
        !          1374: provides "input-only" windows, which act like normal windows for the purposes
        !          1375: of input and cursor control, but which cannot be used as a source or
        !          1376: destination in graphics requests, and which are completely invisible as far as
        !          1377: output is concerned.
        !          1378: 
        !          1379: @subsection(Color)
        !          1380: 
        !          1381: X originally was not designed to deal with direct-color displays.  Direct-color
        !          1382: displays typically have between 12 and 36 bits per pixel; the pixel value
        !          1383: consists of three subfields, which are used as indexes into three independent
        !          1384: color maps: one for red intensities, one for green, and one for blue.  Some
        !          1385: direct-color displays also have a fourth subfield, sometimes referred to as
        !          1386: "z-channel" information, used to control attributes such as blending or chroma
        !          1387: keying.  We now understand how to incorporate direct-color displays without
        !          1388: z-channel information into X, in such a way that the differences between
        !          1389: direct-color and pseudo-color color maps need not be apparent to the
        !          1390: application, yet still allowing all of the usual color map tricks to be played.
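
The interpretation of a direct-color pixel value can be sketched as follows. The layout is an assumption made for illustration: three equal-width subfields with red in the high-order bits and blue in the low-order bits. Actual displays may order and size the subfields differently, and the function names are hypothetical.

```python
def split_pixel(pixel, bits_per_field):
    """Split a pixel value into (red, green, blue) color map indexes."""
    mask = (1 << bits_per_field) - 1
    blue = pixel & mask
    green = (pixel >> bits_per_field) & mask
    red = (pixel >> (2 * bits_per_field)) & mask
    return red, green, blue


def lookup_color(pixel, bits_per_field, red_map, green_map, blue_map):
    """Index the three independent color maps with the three subfields."""
    red, green, blue = split_pixel(pixel, bits_per_field)
    return red_map[red], green_map[green], blue_map[blue]
```

For a 12-bit display with 4 bits per subfield, the pixel value 0xABC selects entry 0xA of the red map, entry 0xB of the green map, and entry 0xC of the blue map.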
        !          1391: 
        !          1392: At present there is only one color map for all applications, and color
        !          1393: applications fail when this map gets full.  Although dozens of applications
        !          1394: typically can be run under X within a single 8-bit pseudo-color map, a single
        !          1395: map is clearly unacceptable when dealing with small color maps, or with
        !          1396: multiple applications (e.g., CAD tools) that need large portions of the color
        !          1397: map.  The solution is to support multiple virtual color maps, still permitting
        !          1398: applications to coexist within any map, but allowing the possibility that not
        !          1399: all applications show true color simultaneously.  This also matches
        !          1400: next-generation displays, which actually support multiple color maps in
        !          1401: hardware@cite(rainbow).
        !          1402: 
        !          1403: @subsection(Graphics)
        !          1404: 
        !          1405: Perhaps the biggest mistake in the graphics area was failing to support fonts
        !          1406: with kerning (side bearings).  For example, a relatively complete emulation of
        !          1407: the Andrew programming interface was built for X, but Andrew applications
        !          1408: depend heavily on kerned fonts.  There are other deficiencies that will be
        !          1409: corrected.  For example, large glyph-sets (e.g., Japanese) will be supported,
        !          1410: as well as stippling (using a clip mask constructed by tiling a region with a
        !          1411: bitmap).  The notions of line width, join style, and end style found in
        !          1412: PostScript@cite(postscript) are usually preferred to brush shapes for line
        !          1413: drawing, and will be supported.
        !          1414: 
        !          1415: In an attempt to support a wide range of devices, the exact path followed for
        !          1416: lines and filled shapes was originally left undefined in X (the class of curve
        !          1417: was not even specified).  Different devices use slightly different algorithms
        !          1418: to draw straight lines, and it seemed better to have high performance with
        !          1419: minor variation than to have uniformity with poor performance.  Relatively few
        !          1420: devices support curve drawing in hardware, but some support it in firmware, and
        !          1421: again performance seemed more important than accuracy.  In retrospect, however,
        !          1422: allowing such device-dependent behavior was a poor decision.  The vast majority
        !          1423: of applications draw lines aligned on an axis, and speed and precision are not
        !          1424: an issue.  The applications that do require complex shapes also require
        !          1425: predictable results, so precise specifications are important.
        !          1426: 
        !          1427: A notable feature missing in X is the ability to perform graphics operations
        !          1428: off screen.  The reasons for this are essentially the same as those presented
        !          1429: when discussing exposures in Section 7.  In particular, not all graphics
        !          1430: co-processors can operate on host memory, and emulating such processors can be
        !          1431: expensive.  However, application builders have demanded this capability, and
        !          1432: the demand appears to be sufficient leverage to convince server implementors to
        !          1433: provide the capability.  Off-screen graphics will be possible in the new
        !          1434: protocol, although the amount of off-screen memory and its performance
        !          1435: characteristics may vary widely.  In addition, the protocol is being extended
        !          1436: to allow the manipulation of both images and windows of varying depths.  For
        !          1437: example, a server might support depths of 1, 4, 8, 12, and 24 bits.  This
        !          1438: allows imaging applications to transmit data more compactly, allows for more
        !          1439: efficient memory utilization in the server, and provides a match with
        !          1440: next-generation display hardware.
        !          1441: 
        !          1442: A common debate in graphics systems is whether and where to have state.  Should
        !          1443: parameters such as logic function, plane mask, source pixel value or tile,
        !          1444: tiling origin, font, line width and style, and clipping region be explicit in
        !          1445: every request or collected into a state object?  The current X protocol is
        !          1446: stateless, for the following reasons:  both state-based and stateless programming
        !          1447: interfaces can be built easily on top of the protocol; the currently supported
        !          1448: graphics requests have just few enough parameters that they can be represented
        !          1449: compactly; and the initial set of displays we were interested in (and the
        !          1450: implementations we had in mind for them) would not benefit from the addition of
        !          1451: state.  However, we now believe that a state-based protocol is generally
        !          1452: superior, as it handles complex graphics gracefully and allows significantly
        !          1453: faster implementations on some displays.
        !          1454: 
        !          1455: @subsection(Management)
        !          1456: 
        !          1457: An obvious interface style presently not supported in X is the ability to use
        !          1458: the keyboard for management commands.  To allow this, a key-grab mechanism,
        !          1459: akin to the button-grab mechanism described in Section 9, will be provided.  To
        !          1460: allow such styles as using the first button click in a window to attach the
        !          1461: keyboard, both button-grabs and key-grabs have been extended to apply to
        !          1462: specific sub-hierarchies, rather than always to the entire screen.  To handle
        !          1463: the kinds of race conditions described in Section 9, a general event
        !          1464: synchronization mechanism has been incorporated into the grab mechanisms.
        !          1465: 
        !          1466: To support automatic window management, a manager must be able to intercept
        !          1467: certain management requests from clients (such as mapping or moving a window)
        !          1468: before they are executed by the server, and to be notified about others (such
        !          1469: as unmapping a window) after they are executed.  In addition, some managers
        !          1470: want to provide uniform title bars and border decorations automatically.  To
        !          1471: allow this, it is useful to be able to "splice" hierarchies:  to move a window
        !          1472: from one parent to another.  To allow input managers and window managers to be
        !          1473: implemented as separate applications, the ability for multiple clients to
        !          1474: select events on the same window is being added.  For example, both a window
        !          1475: manager and an input manager might be interested in the unmapping or
        !          1476: destruction of a window.
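
The two kinds of manager involvement described above can be sketched as follows. This is a toy model under stated assumptions, not the protocol itself: `ToyServer`, `set_manager`, `select`, and the event tuples are all hypothetical. A map request from an ordinary client is handed to the manager before execution, while an unmap is executed first and then reported to every client that selected events on the window.

```python
class ToyServer:
    """Toy model of request interception and multi-client event selection."""

    def __init__(self):
        self.mapped = set()
        self._manager = None    # callback that intercepts map requests
        self._selections = {}   # window -> list of event callbacks

    def set_manager(self, callback):
        self._manager = callback

    def select(self, window, callback):
        # Several clients may select events on the same window.
        self._selections.setdefault(window, []).append(callback)

    def map_window(self, window, from_manager=False):
        if self._manager is not None and not from_manager:
            # Intercepted BEFORE execution: the manager decides what to do.
            self._manager(("map-request", window))
            return
        self.mapped.add(window)

    def unmap_window(self, window):
        self.mapped.discard(window)
        # Reported AFTER execution to every interested client.
        for callback in self._selections.get(window, []):
            callback(("unmapped", window))
```

In this model a window manager would typically respond to a `map-request` by adjusting the window and then calling `map_window(window, from_manager=True)`.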
        !          1477: 
        !          1478: @subsection(Extensibility)
        !          1479: 
        !          1480: The information that input and window managers might desire from applications
        !          1481: is quite varied, and it would be a mistake to try to define a fixed set.
        !          1482: Similarly, the information paths between applications (e.g., in support of "cut
        !          1483: and paste") need to be flexible.  To this end, we are adding a Lisp-ish
        !          1484: property list@cite(CLtL) mechanism to windows, and the event mechanism is being
        !          1485: augmented to provide a simple form of inter-client communication.
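
A per-window property list might look roughly like the sketch below. The class, its method names, and the property names used in the usage note (`CUT_TEXT`, `ICON_NAME`) are hypothetical, chosen only to show the idea of an open-ended set of named values attached to a window. The inter-client event mechanism is not modeled.

```python
class Window:
    """Window carrying an open-ended list of named properties."""

    def __init__(self, name):
        self.name = name
        self._properties = {}  # property name -> arbitrary value

    def put_property(self, key, value):
        # Any client may attach or overwrite a named value.
        self._properties[key] = value

    def get_property(self, key, default=None):
        return self._properties.get(key, default)

    def delete_property(self, key):
        # Deleting a missing property is a no-op in this sketch.
        self._properties.pop(key, None)

    def list_properties(self):
        return sorted(self._properties)
```

For example, an editor could call `root.put_property("CUT_TEXT", "hello")` and a terminal emulator could later read the value with `root.get_property("CUT_TEXT")`, while an application could advertise `ICON_NAME` on its own window for a window manager to read. No fixed set of names is built in.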
        !          1486: 
        !          1487: The new X protocol explicitly continues to avoid certain areas, such as 3-D
        !          1488: graphics and anti-aliasing.  However, a general mechanism has been designed to
        !          1489: allow extension libraries to be included in a server.  The intention is that
        !          1490: all servers implement the "core" protocol, but each server can provide
        !          1491: arbitrary extensions.  If an extension becomes widely accepted by the X
        !          1492: community, it can be adopted as part of the core.  Each extension library is
        !          1493: assigned a global name, and an application can query the server at run-time to
        !          1494: determine if a particular extension is present.  Request opcodes and event
        !          1495: types are allocated dynamically, so that applications need not be modified to
        !          1496: execute in each new environment.
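
The run-time query and dynamic allocation described above can be sketched as follows. The numeric bases, the extension names, and the `query_extension` method are assumptions made for illustration and are not the protocol's actual values or requests. The point is that an application asks for an extension by its global name and uses whatever opcode and event base that particular server reports.

```python
FIRST_EXTENSION_OPCODE = 128  # hypothetical start of the extension range
FIRST_EXTENSION_EVENT = 64    # hypothetical start of extension event types


class ToyServer:
    """Toy server that assigns opcodes and event types at start-up."""

    def __init__(self, extensions=()):
        # extensions: iterable of (global name, number of event types)
        self._extensions = {}
        next_opcode = FIRST_EXTENSION_OPCODE
        next_event = FIRST_EXTENSION_EVENT
        for name, event_count in extensions:
            self._extensions[name] = (next_opcode, next_event)
            next_opcode += 1
            next_event += event_count

    def query_extension(self, name):
        """Return (request opcode, first event type), or None if absent."""
        return self._extensions.get(name)
```

Because the numbers depend on which extensions a given server loads, the same extension can receive different opcodes on different servers. An application that looks them up by name at run-time therefore runs unmodified in each environment.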
        !          1497: 
        !          1498: @section(Summary)
        !          1499: 
        !          1500: The X Window System provides high-performance, high-level, device-independent
        !          1501: graphics.  A hierarchy of resizable, overlapping windows allows a wide variety
        !          1502: of application and user interfaces to be built easily.  Network-transparent
        !          1503: access to the display provides an important degree of functional separation,
        !          1504: without significantly affecting performance, that is crucial to building
        !          1505: applications for a distributed environment.  To a reasonable extent, desktop
        !          1506: management can be custom tailored to individual environments, without modifying
        !          1507: the base system and typically without affecting applications.
        !          1508: 
        !          1509: To date, the X design and implementation effort has focused on the base window
        !          1510: system, as described in this paper, and on essential applications and
        !          1511: programming interfaces.  The design of the network protocol, the design and
        !          1512: implementation of the device-independent layer of the server, and the implementation of
        !          1513: several applications and a prototype window manager, were carried out by the
        !          1514: first author.  The design and implementation of the C programming interface,
        !          1515: the implementation of major portions of several applications, and the
        !          1516: coordination of efforts within Project Athena and Digital, were carried out by
        !          1517: the second author.  In addition, many other persons from Project Athena, the
        !          1518: Laboratory for Computer Science, and institutions outside MIT have contributed
        !          1519: software.
        !          1520: 
        !          1521: Necessary applications such as window managers and VT100 and Tektronix 4014
        !          1522: terminal emulators have been created, and numerous existing applications, such
        !          1523: as text editors and VLSI layout systems, have been ported to the X environment.
        !          1524: Although several different menu packages have been implemented, we are only now
        !          1525: beginning to see a rich library of tools (scroll bars, frames, panels, more
        !          1526: menus, etc.) to facilitate the rapid construction of high-quality user
        !          1527: interfaces.  Tool building is taking place at many sites, and several
        !          1528: universities are now attempting to unify window systems work with X as a base,
        !          1529: so that such tools can be shared.
        !          1530: 
        !          1531: The use of X has grown far beyond anything we had imagined.  Digital has
        !          1532: incorporated X into a commercial product, and other manufacturers are following
        !          1533: suit.  With the appearance of such products, and the release of complete X
        !          1534: sources on the Berkeley 4.3 Unix distribution tapes, it is no longer feasible
        !          1535: to track all X use and development.  Existing applications written in C are
        !          1536: known to have been ported to seven machine architectures of more than twelve
        !          1537: manufacturers, and the C server to six machine architectures and more than
        !          1538: sixteen display architectures.  In most cases the code is running under Unix,
        !          1539: but other operating systems are also involved.  In addition, relatively
        !          1540: complete server implementations exist in two Lisp dialects.  Apart from
        !          1541: designing the system to be portable, a large part of this success is due to
        !          1542: MIT's decision to distribute X sources without any licensing restrictions, and
        !          1543: the willingness of people in both educational and commercial institutions to
        !          1544: contribute code without restrictions.
        !          1545: 
        !          1546: @b(Acknowledgments)
        !          1547: 
        !          1548: Our thanks go to the many people who have contributed to the success of X.
        !          1549: Particular thanks go to those who have made significant contributions to the
        !          1550: non-proprietary implementation:  Paul Asente (Stanford University), Scott Bates
        !          1551: (Brown University), Mike Braca (Brown), Dave Bundy (Brown), Dave Carver
        !          1552: (Digital), Tony Della Fera (Digital), Mike Gancarz (Digital), James Gosling
        !          1553: (Sun Microsystems), Doug Mink (Smithsonian Astrophysical Observatory), Bob
        !          1554: McNamara (Digital), Ron Newman (MIT), Ram Rao (Digital), Dave Rosenthal (Sun),
        !          1555: Dan Stone (Brown), Stephen Sutphen (University of Alberta), and Mark
        !          1556: Vandevoorde (MIT).
        !          1557: 
        !          1558: Special thanks go to Digital Equipment Corporation.  A redesign of the protocol
        !          1559: and a reimplementation of the server to deal with color and to increase
        !          1560: performance were made possible with funding (in the form of hardware) from
        !          1561: Digital.  To their credit, all of the resulting device-independent code
        !          1562: remained the property of MIT.
