.SH
Shared Memory
.PP
On a fast display and processor, X may be performing more than
one thousand operations (X requests) per second.
If every access to the device requires a system call, the overhead
rapidly dominates all other costs.
X uses a shared memory structure with the device driver for two purposes:
1) to get mouse and keyboard input
and
2) to access the device or write into a memory bitmap.
.PP
As pointed out before, X is a single threaded server.
Since client programs should be able to overlap with
the window system as much as possible (remember that you may be
running applications on other machines), it is particularly
important to send input events to the correct client as soon
as possible.
It is therefore desirable to test if there is input after each
graphic output operation.
This test can be performed in only a couple of instructions given shared
memory, and would otherwise require either one system call per output
operation (to check for new input) or a compromise in how quickly
input would be handled.
.PP
All input events are put into a shared memory circular buffer; since
the driver only inserts into the buffer, and X only removes from the
buffer, synchronization is easy to provide with separate head and tail
indices (presuming a write to shared memory is atomic).
.PP
Output on the QVSS goes directly to a mapped bitmap.
In the case of the Vs100, a piece of the UNIBUS\(dg and a shared DMA buffer
are statically mapped where both the driver and the X server can access
them.
.FS \(dg
UNIBUS is a trademark of Digital Equipment Corporation.
.sp
.FE
Output requests to the Vs100 are directly formatted into this buffer,
minimizing copying of data.\(dd
.FS \(dd
Our thanks go to Phil Karlton, of Digital's Western Research Lab, for
the first implementation of this mechanism.
.FE
This permits the device dependent routines to start I/O transfers without
system call overhead (by directly accessing device CSR registers),
and avoids the UNIBUS map setup overhead that DMA from user space requires.
.PP
These changes dramatically increased performance and improved
interactive feel when implemented, while greatly reducing CPU overhead.
Since proper memory sharing primitives are lacking in 4.2BSD,
the sharing was implemented by making pages readable and writable in system space,
where they are accessible to any process.
In theory, any program on the machine could cause a Vs100 implementation to
machine check (odd byte access in the UNIBUS space), though in practice it
has never happened.
Nonetheless, it is the ugliest piece of the current X implementation.
We would rather let a server process access the hardware
directly than put that code in the kernel,
as it is much easier to debug user processes than kernel code.
.PP
The current X implementation uses a TCP stream both locally and
remotely, though one could easily use
.UX
domain sockets for the local
case at the cost of a file descriptor.
For current applications, the bandwidth limitation (approximately
1 million bits/second on a 780 class processor) is not a major problem,
though faster devices (and image processing applications) would probably
benefit from implementation of a shared memory path between the X server
and client applications.
.PP
Current shared memory implementations in variants of
.UX
are not sufficient.
Memory sharing primitives should allow appropriately
privileged programs to both share memory with other processes and map to
both kernel space and I/O space.
Shared libraries (available in some versions of
.UX )
would also increase the options available to window system
designers (see below).
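.PP
The input path described in this section (a circular buffer in shared
memory, with the driver as the only writer and the server as the only
reader) can be sketched roughly as follows.
This is a minimal illustration in modern C, not the actual driver interface:
the names
.I event_queue ,
.I input_event ,
.I QUEUE_SIZE
and all the functions are hypothetical, the event layout is invented,
and C11 atomics stand in for the assumption that a write to shared memory
is atomic.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define QUEUE_SIZE 64u  /* hypothetical capacity; must be a power of two */

/* Hypothetical event record; the real driver's layout is not given here. */
struct input_event {
    unsigned short x, y;   /* pointer position */
    unsigned char  key;    /* key or button code */
    unsigned char  kind;   /* e.g. press, release, motion */
};

/* Region assumed to be mapped into both the driver and the server. */
struct event_queue {
    atomic_uint head;      /* next slot to read; written only by the server */
    atomic_uint tail;      /* next slot to write; written only by the driver */
    struct input_event slots[QUEUE_SIZE];
};

static void queue_init(struct event_queue *q)
{
    atomic_init(&q->head, 0u);
    atomic_init(&q->tail, 0u);
}

/* Server side: the cheap test made after each graphic output operation.
   It costs two loads and a compare, with no system call. */
static bool input_pending(struct event_queue *q)
{
    return atomic_load_explicit(&q->head, memory_order_relaxed) !=
           atomic_load_explicit(&q->tail, memory_order_acquire);
}

/* Driver side: insert one event; returns false if the buffer is full. */
static bool enqueue_event(struct event_queue *q, const struct input_event *ev)
{
    unsigned tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&q->head, memory_order_acquire);

    if (tail - head == QUEUE_SIZE)          /* indices run free and wrap */
        return false;
    q->slots[tail % QUEUE_SIZE] = *ev;      /* fill the slot first ... */
    atomic_store_explicit(&q->tail, tail + 1u, memory_order_release);
    return true;                            /* ... then publish it */
}

/* Server side: remove one event; returns false if the buffer is empty. */
static bool dequeue_event(struct event_queue *q, struct input_event *out)
{
    unsigned head = atomic_load_explicit(&q->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&q->tail, memory_order_acquire);

    if (head == tail)
        return false;
    *out = q->slots[head % QUEUE_SIZE];     /* copy the slot out ... */
    atomic_store_explicit(&q->head, head + 1u, memory_order_release);
    return true;                            /* ... then release it */
}
```

Because each index has exactly one writer, no lock is needed: the driver
only advances the tail and the server only advances the head.
A server loop under these assumptions would call the pending test after
each output request and drain the queue with the dequeue routine whenever
it reports input.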