Annotation of 43BSD/contrib/X/doc/Usenix/memory.t, revision 1.1.1.1

1.1       root        1: .SH
                      2: Shared Memory
                      3: .PP
                      4: On a fast display and processor, X may be performing more than
                      5: one thousand operations (X requests) per second.
                      6: If every access to the device requires a system call, the overhead
                      7: rapidly dominates all other costs.
                      8: X uses a shared memory structure with the device driver for two purposes:
                      9: 1) to get mouse and keyboard input
                     10: and
                     11: 2) to access the device or write into a memory bitmap.
                     12: .PP
                     13: As pointed out before, X is a single threaded server.
                     14: Since client programs should be able to overlap with
                     15: the window system as much as possible (remember that you may be
                     16: running applications on other machines), it is particularly
                     17: important to send input events to the correct client as soon
                     18: as possible.
                     19: It is therefore desirable to test if there is input after each
                     20: graphic output operation.
                     21: This test can be performed in only a couple of instructions given shared
                     22: memory, and would otherwise require either one system call per output
                     23: operation (to check for new input) or a compromise in how quickly
                     24: input would be handled.
                     25: .PP
                     26: All input events are put into a shared memory circular buffer; since
                     27: the driver only inserts into the buffer, and X only removes from the
                     28: buffer, synchronization is easy to provide with separate head and tail
                     29: indices (presuming a write to shared memory is atomic).
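The single-producer, single-consumer arrangement described above can be sketched roughly as follows. This is an illustration only, not the actual X or driver code: the names event_queue, input_event, EVQ_SIZE, evq_pending, evq_put and evq_get are hypothetical, and the event layout is invented for the example. It assumes, as the text does, that a store to an index is atomic; on a modern multiprocessor the indices would also need proper atomic operations with memory ordering, which "volatile" alone does not provide.

```c
#include <stdbool.h>

#define EVQ_SIZE 64   /* hypothetical capacity; one slot is always left empty */

/* Hypothetical event record; the real event layout is not given in the text. */
struct input_event {
    int type;
    int x, y;
};

/* Structure assumed to live in memory shared by the driver and the server. */
struct event_queue {
    volatile unsigned head;            /* next slot to read; written only by the server  */
    volatile unsigned tail;            /* next slot to write; written only by the driver */
    struct input_event ev[EVQ_SIZE];
};

/* The cheap "is there input?" test: two loads and a compare, no system call. */
static bool evq_pending(struct event_queue *q)
{
    return q->head != q->tail;
}

/* Producer side (the driver). Returns false if the buffer is full. */
static bool evq_put(struct event_queue *q, const struct input_event *e)
{
    unsigned next = (q->tail + 1) % EVQ_SIZE;

    if (next == q->head)
        return false;                  /* full: never overwrite unread events */
    q->ev[q->tail] = *e;               /* fill the slot first ...             */
    q->tail = next;                    /* ... then publish it with one store  */
    return true;
}

/* Consumer side (the server). Returns false if the buffer is empty. */
static bool evq_get(struct event_queue *q, struct input_event *out)
{
    if (q->head == q->tail)
        return false;                  /* empty */
    *out = q->ev[q->head];             /* copy the event out first ...        */
    q->head = (q->head + 1) % EVQ_SIZE;/* ... then release the slot           */
    return true;
}
```

Because each index has exactly one writer, neither side needs a lock. In a server loop, a call such as evq_pending() could follow each output operation, with evq_get() invoked only when it reports that input is waiting.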
                     30: .PP
                     31: Output on the QVSS is directly to a mapped bitmap.
                     32: In the case of the Vs100, a piece of the UNIBUS\(dg and a shared DMA buffer
                     33: are statically mapped where both the driver and the X server can access
                     34: them.
                     35: .FS \(dg
                     36: UNIBUS is a trademark of Digital Equipment Corporation.
                     37: .sp
                     38: .FE
                     39: Output requests to the Vs100 are directly formatted into this buffer,
                     40: minimizing copying of data.\(dd
                     41: .FS \(dd
                     42: Our thanks go to Phil Karlton, of Digital's Western Research Lab, for
                     43: the first implementation of this mechanism.
                     44: .FE
                     45: This permits the device dependent routines to start I/O transfers without
                     46: system call overhead (by directly accessing device CSR registers),
                     47: and avoids UNIBUS map setup overhead that DMA from user space requires.
                     48: .PP
                     49: These changes dramatically increased performance and improved
                     50: interactive feel when implemented, while greatly reducing CPU overhead.
                     51: Since proper memory sharing primitives are lacking in 4.2BSD,
                     52: the sharing was implemented by making pages readable and writable in system space,
                     53: where they are accessible to any process.
                     54: In theory, any program on the machine could cause a Vs100 implementation to
                     55: machine check (odd byte access in the UNIBUS space), though in practice it
                     56: has never happened.
                     57: Nonetheless, it is the ugliest piece of the current X implementation.
                     58: We would rather let a server process access hardware
                     59: directly than put that access in kernel code,
                     60: as it is much easier to debug user processes than kernel code.
                     61: .PP
                     62: The current X implementation uses a TCP stream both locally and
                     63: remotely, though one could easily use 
                     64: .UX
                     65: domain sockets for the local
                     66: case at the cost of a file descriptor.
                     67: For current applications, the bandwidth limitation (approximately
                     68: 1 million bits/second on a 780 class processor) is not a major problem,
                     69: though faster devices (and image processing applications) would probably
                     70: benefit from implementation of a shared memory path between the X server
                     71: and client applications.
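The choice between a TCP stream and a UNIX domain socket for the local case might look roughly like the sketch below. It is not taken from the X sources: the function name connect_display and its parameters are hypothetical, and the caller is assumed to supply either a dotted-quad IPv4 address and port, or a socket path for the local case. On the server side, supporting both transports means keeping one more listening descriptor open, which is the file descriptor cost mentioned above.

```c
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/*
 * Hypothetical helper: open a stream connection to a display server.
 * If ipv4_addr is NULL, connect over a UNIX domain socket at unix_path;
 * otherwise connect over TCP to ipv4_addr:port.
 * Returns a connected descriptor, or -1 on any failure.
 */
static int connect_display(const char *ipv4_addr, unsigned short port,
                           const char *unix_path)
{
    int fd;

    if (ipv4_addr == NULL) {
        struct sockaddr_un addr;

        if (unix_path == NULL || strlen(unix_path) >= sizeof addr.sun_path)
            return -1;                       /* path missing or too long */
        memset(&addr, 0, sizeof addr);
        addr.sun_family = AF_UNIX;
        strcpy(addr.sun_path, unix_path);    /* length checked above */

        fd = socket(AF_UNIX, SOCK_STREAM, 0);
        if (fd < 0)
            return -1;
        if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
            close(fd);
            return -1;
        }
        return fd;
    }

    struct sockaddr_in addr;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ipv4_addr, &addr.sin_addr) != 1)
        return -1;                           /* not a valid IPv4 address */

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

Both branches yield an ordinary byte stream descriptor, so the code that reads and writes requests would not need to know which transport was chosen.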
                     72: .PP
                     73: Current shared memory implementations in variants of 
                     74: .UX
                     75: are not sufficient.
                     76: Memory sharing primitives should allow appropriately
                     77: privileged programs to both share memory with other processes and map to
                     78: both kernel space and I/O space.
                     79: Shared libraries (available in some versions of 
                     80: .UX )
                     81: would also increase the options available to window system
                     82: designers (see below).
