Annotation of qemu/docs/memory.txt, revision 1.1.1.1

1.1       root        1: The memory API
                      2: ==============
                      3: 
                      4: The memory API models the memory and I/O buses and controllers of a QEMU
                      5: machine.  It attempts to allow modelling of:
                      6: 
                      7:  - ordinary RAM
                      8:  - memory-mapped I/O (MMIO)
                      9:  - memory controllers that can dynamically reroute physical memory regions
                     10:   to different destinations
                     11: 
                     12: The memory model provides support for
                     13: 
                     14:  - tracking RAM changes by the guest
                     15:  - setting up coalesced memory for kvm
                     16:  - setting up ioeventfd regions for kvm
                     17: 
                     18: Memory is modelled as a tree (really an acyclic graph) of MemoryRegion objects.
                     19: The root of the tree is memory as seen from the CPU's viewpoint (the system
                     20: bus).  Nodes in the tree represent other buses, memory controllers, and
                     21: memory regions that have been rerouted.  Leaves are RAM and MMIO regions.
                     22: 
                     23: Types of regions
                     24: ----------------
                     25: 
                     26: There are four types of memory regions (all represented by a single C type
                     27: MemoryRegion):
                     28: 
                     29: - RAM: a RAM region is simply a range of host memory that can be made available
                     30:   to the guest.
                     31: 
                     32: - MMIO: a range of guest memory that is implemented by host callbacks;
                     33:   each read or write causes a callback to be called on the host.
                     34: 
                     35: - container: a container simply includes other memory regions, each at
                     36:   a different offset.  Containers are useful for grouping several regions
                     37:   into one unit.  For example, a PCI BAR may be composed of a RAM region
                     38:   and an MMIO region.
                     39: 
                     40:   A container's subregions are usually non-overlapping.  In some cases it is
                     41:   useful to have overlapping regions; for example a memory controller that
                     42:   can overlay a subregion of RAM with MMIO or ROM, or a PCI controller
                     43:   that does not prevent cards from claiming overlapping BARs.
                     44: 
                     45: - alias: a subsection of another region.  Aliases allow a region to be
                     46:   split apart into discontiguous regions.  Examples of uses are memory banks
                     47:   used when the guest address space is smaller than the amount of RAM
                     48:   addressed, or a memory controller that splits main memory to expose a "PCI
                     49:   hole".  Aliases may point to any type of region, including other aliases,
                     50:   but an alias may not point back to itself, directly or indirectly.
                     51: 
                     52: 
                     53: Region names
                     54: ------------
                     55: 
                     56: Regions are assigned names by the constructor.  For most regions these are
                     57: only used for debugging purposes, but RAM regions also use the name to identify
                     58: live migration sections.  This means that RAM region names need to have ABI
                     59: stability.
                     60: 
                     61: Region lifecycle
                     62: ----------------
                     63: 
                     64: A region is created by one of the constructor functions (memory_region_init*())
                     65: and destroyed by the destructor (memory_region_destroy()).  In between,
                     66: a region can be added to an address space by using memory_region_add_subregion()
                     67: and removed using memory_region_del_subregion().  Region attributes may be
                     68: changed at any point; they take effect once the region becomes exposed to the
                     69: guest.
                     70: 
                     71: Overlapping regions and priority
                     72: --------------------------------
                     73: Usually, regions may not overlap each other; a memory address decodes into
                     74: exactly one target.  In some cases it is useful to allow regions to overlap,
                     75: and sometimes to control which of the overlapping regions is visible to the
                     76: guest.  This is done with memory_region_add_subregion_overlap(), which
                     77: allows the region to overlap any other region in the same container, and
                     78: specifies a priority that allows the core to decide which of two regions at
                     79: the same address is visible (highest priority wins).
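
The priority rule can be illustrated with a small, self-contained sketch.
It is not QEMU code: PrioRegion, PrioContainer, prio_add_overlap() and
prio_lookup() are hypothetical names, and keeping the subregions sorted by
descending priority is an assumption made for the illustration.  Placing a
new region after existing ones of equal priority is likewise an arbitrary
choice of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PRIO_MAX_SUBREGIONS 8

/* Hypothetical stand-in for a subregion; not QEMU's MemoryRegion. */
typedef struct {
    const char *name;
    uint64_t offset;    /* start address within the container */
    uint64_t size;      /* length in bytes */
    int priority;       /* higher value wins where regions overlap */
} PrioRegion;

typedef struct {
    PrioRegion subs[PRIO_MAX_SUBREGIONS];   /* kept in descending priority */
    size_t count;
} PrioContainer;

/* Insert r so that the array stays sorted by descending priority. */
bool prio_add_overlap(PrioContainer *c, PrioRegion r)
{
    if (c->count == PRIO_MAX_SUBREGIONS) {
        return false;                       /* toy container is full */
    }
    size_t pos = 0;
    while (pos < c->count && c->subs[pos].priority >= r.priority) {
        pos++;
    }
    for (size_t i = c->count; i > pos; i--) {
        c->subs[i] = c->subs[i - 1];        /* shift lower priorities down */
    }
    c->subs[pos] = r;
    c->count++;
    return true;
}

/* The first region covering addr is the visible one (highest priority). */
const PrioRegion *prio_lookup(const PrioContainer *c, uint64_t addr)
{
    for (size_t i = 0; i < c->count; i++) {
        const PrioRegion *r = &c->subs[i];
        if (addr >= r->offset && addr - r->offset < r->size) {
            return r;
        }
    }
    return NULL;                            /* nothing decodes this address */
}
```

Adding a 1MB "ram" region at priority 0 and a 64K "rom-overlay" region at
0xf0000 with priority 1 makes a lookup of 0xf1234 return the overlay, while
0x1000 still returns the RAM.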
                     80: 
                     81: Visibility
                     82: ----------
                     83: The memory core uses the following rules to select a memory region when the
                     84: guest accesses an address:
                     85: 
                     86: - all direct subregions of the root region are matched against the address, in
                     87:   descending priority order
                     88:   - if the address lies outside the region offset/size, the subregion is
                     89:     discarded
                     90:   - if the subregion is a leaf (RAM or MMIO), the search terminates
                     91:   - if the subregion is a container, the same algorithm is used within the
                     92:     subregion (after the address is adjusted by the subregion offset)
                     93:   - if the subregion is an alias, the search continues at the alias target
                     94:     (after the address is adjusted by the subregion offset and alias offset)
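
The rules above can be sketched as a recursive walk over a toy region tree.
This is an illustration only: SketchRegion, sketch_resolve() and the sk_*
objects are hypothetical, subregions are assumed to be listed highest
priority first, and falling through to the next subregion when a container
or alias has nothing mapped at the address is an assumption of this sketch
rather than something stated above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model of the region tree; not QEMU's MemoryRegion. */
typedef enum { SK_RAM, SK_MMIO, SK_CONTAINER, SK_ALIAS } SketchKind;

typedef struct SketchRegion SketchRegion;
struct SketchRegion {
    const char *name;
    SketchKind kind;
    uint64_t offset;                  /* position inside the parent container */
    uint64_t size;
    const SketchRegion **subregions;  /* containers: highest priority first */
    size_t nr_subregions;
    const SketchRegion *alias_target; /* aliases only */
    uint64_t alias_offset;            /* aliases only: start within target */
};

/* addr is relative to r; on success *leaf_addr is relative to the leaf. */
const SketchRegion *sketch_resolve(const SketchRegion *r, uint64_t addr,
                                   uint64_t *leaf_addr)
{
    if (addr >= r->size) {
        return NULL;                  /* outside the region: discard it */
    }
    switch (r->kind) {
    case SK_RAM:
    case SK_MMIO:
        *leaf_addr = addr;            /* leaf: the search terminates */
        return r;
    case SK_ALIAS:                    /* continue at the alias target */
        return sketch_resolve(r->alias_target, addr + r->alias_offset,
                              leaf_addr);
    case SK_CONTAINER:
        for (size_t i = 0; i < r->nr_subregions; i++) {
            const SketchRegion *sub = r->subregions[i];
            if (addr < sub->offset) {
                continue;
            }
            const SketchRegion *hit =
                sketch_resolve(sub, addr - sub->offset, leaf_addr);
            if (hit != NULL) {
                return hit;
            }
        }
        return NULL;
    }
    return NULL;
}

/* Tiny map: a window into "pci" overlays part of a 1MB RAM alias. */
const SketchRegion sk_ram = { .name = "ram", .kind = SK_RAM,
                              .size = 0x100000 };
const SketchRegion sk_vga_mmio = { .name = "vga-mmio", .kind = SK_MMIO,
                                   .offset = 0xa0000, .size = 0x10000 };
const SketchRegion *sk_pci_subs[] = { &sk_vga_mmio };
const SketchRegion sk_pci = { .name = "pci", .kind = SK_CONTAINER,
                              .size = 0x100000,
                              .subregions = sk_pci_subs, .nr_subregions = 1 };
const SketchRegion sk_window = { .name = "window", .kind = SK_ALIAS,
                                 .offset = 0xa0000, .size = 0x20000,
                                 .alias_target = &sk_pci,
                                 .alias_offset = 0xa0000 };
const SketchRegion sk_lomem = { .name = "lomem", .kind = SK_ALIAS,
                                .offset = 0, .size = 0x100000,
                                .alias_target = &sk_ram };
const SketchRegion *sk_root_subs[] = { &sk_window, &sk_lomem };
const SketchRegion sk_root = { .name = "root", .kind = SK_CONTAINER,
                               .size = 0x200000,
                               .subregions = sk_root_subs,
                               .nr_subregions = 2 };
```

In this toy map, resolving 0xa0010 from sk_root goes through the window
alias into the pci container and ends at vga-mmio with a leaf address of
0x10, while 0x1000 ends in ram through the lomem alias.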
                     95: 
                     96: Example memory map
                     97: ------------------
                     98: 
                     99: system_memory: container@0-2^48-1
                    100:  |
                    101:  +---- lomem: alias@0-0xdfffffff ---> #ram (0-0xdfffffff)
                    102:  |
                    103:  +---- himem: alias@0x100000000-0x11fffffff ---> #ram (0xe0000000-0xffffffff)
                    104:  |
                    105:  +---- vga-window: alias@0xa0000-0xbffff ---> #pci (0xa0000-0xbffff)
                    106:  |      (prio 1)
                    107:  |
                    108:  +---- pci-hole: alias@0xe0000000-0xffffffff ---> #pci (0xe0000000-0xffffffff)
                    109: 
                    110: pci (0-2^32-1)
                    111:  |
                    112:  +--- vga-area: container@0xa0000-0xbffff
                    113:  |      |
                    114:  |      +--- alias@0x00000-0x7fff  ---> #vram (0x010000-0x017fff)
                    115:  |      |
                    116:  |      +--- alias@0x08000-0xffff  ---> #vram (0x020000-0x027fff)
                    117:  |
                    118:  +---- vram: ram@0xe1000000-0xe1ffffff
                    119:  |
                    120:  +---- vga-mmio: mmio@0xe2000000-0xe200ffff
                    121: 
                    122: ram: ram@0x00000000-0xffffffff
                    123: 
                    124: This is a (simplified) PC memory map. The 4GB RAM block is mapped into the
                    125: system address space via two aliases: "lomem" is a 1:1 mapping of the first
                    126: 3.5GB; "himem" maps the last 0.5GB at address 4GB.  This leaves 0.5GB for the
                    127: so-called PCI hole, which allows a 32-bit PCI bus to exist in a system with
                    128: 4GB of memory.
                    129: 
                    130: The memory controller diverts addresses in the range 640K-768K to the PCI
                    131: address space.  This is modelled using the "vga-window" alias, mapped at a
                    132: higher priority so it obscures the RAM at the same addresses.  The vga window
                    133: can be removed by programming the memory controller; this is modelled by
                    134: removing the alias and exposing the RAM underneath.
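
The address arithmetic behind the aliases in this map can be worked through
with a short sketch.  The table reuses the ranges from the example map (with
the vga window taken to end at 0xbffff, matching its 640K-768K target range);
the PcAlias type, pc_aliases table and pc_translate() are hypothetical names
introduced for the illustration, and the vga_window_mapped flag stands in for
the memory controller adding or removing the alias.

```c
#include <stddef.h>
#include <stdint.h>

typedef enum { PC_NONE, PC_RAM, PC_PCI } PcTarget;

enum { PC_LOMEM, PC_HIMEM, PC_VGA_WINDOW, PC_PCI_HOLE, PC_NR_ALIASES };

typedef struct {
    const char *name;
    uint64_t first;         /* first system address covered (inclusive) */
    uint64_t last;          /* last system address covered (inclusive) */
    PcTarget target;        /* region the alias points at */
    uint64_t target_first;  /* address in the target that 'first' maps to */
    int priority;           /* higher wins where aliases overlap */
} PcAlias;

const PcAlias pc_aliases[PC_NR_ALIASES] = {
    [PC_LOMEM]      = { "lomem",      0x00000000ULL,  0xdfffffffULL,
                        PC_RAM, 0x00000000ULL, 0 },
    [PC_HIMEM]      = { "himem",      0x100000000ULL, 0x11fffffffULL,
                        PC_RAM, 0xe0000000ULL, 0 },
    [PC_VGA_WINDOW] = { "vga-window", 0xa0000ULL,     0xbffffULL,
                        PC_PCI, 0xa0000ULL,    1 },
    [PC_PCI_HOLE]   = { "pci-hole",   0xe0000000ULL,  0xffffffffULL,
                        PC_PCI, 0xe0000000ULL, 0 },
};

/*
 * Translate a system address into (target, address within target).
 * Returns PC_NONE if no alias covers the address.
 */
PcTarget pc_translate(uint64_t addr, int vga_window_mapped,
                      uint64_t *target_addr)
{
    const PcAlias *best = NULL;

    for (size_t i = 0; i < PC_NR_ALIASES; i++) {
        const PcAlias *a = &pc_aliases[i];
        if (i == PC_VGA_WINDOW && !vga_window_mapped) {
            continue;       /* alias removed: the RAM underneath shows */
        }
        if (addr < a->first || addr > a->last) {
            continue;
        }
        if (best == NULL || a->priority > best->priority) {
            best = a;
        }
    }
    if (best == NULL) {
        return PC_NONE;
    }
    *target_addr = best->target_first + (addr - best->first);
    return best->target;
}
```

With the window mapped, system address 0xa8000 lands in the pci address
space at 0xa8000; with it removed, the same address lands in ram.  System
address 0x100000000 (4GB) always lands in ram at offset 0xe0000000.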
                    135: 
                    136: The pci address space is not a direct child of the system address space, since
                    137: we only want parts of it to be visible (we accomplish this using aliases).
                    138: It has two subregions: vga-area models the legacy vga window and is occupied
                    139: by two 32K memory banks pointing at two sections of the framebuffer.
                    140: In addition the vram is mapped as a BAR at address 0xe1000000, and an
                    141: additional BAR containing MMIO registers is mapped after it.
                    142: 
                    143: Note that if the guest maps a BAR outside the PCI hole, it would not be
                    144: visible as the pci-hole alias clips it to a 0.5GB range.
                    145: 
                    146: Attributes
                    147: ----------
                    148: 
                    149: Various region attributes (read-only, dirty logging, coalesced mmio, ioeventfd)
                    150: can be changed during the region lifecycle.  They take effect once the region
                    151: is made visible (which can be immediately, later, or never).
                    152: 
                    153: MMIO Operations
                    154: ---------------
                    155: 
                    156: MMIO regions are provided with ->read() and ->write() callbacks; in addition
                    157: various constraints can be supplied to control how these callbacks are called:
                    158: 
                    159:  - .valid.min_access_size, .valid.max_access_size define the access sizes
                    160:    (in bytes) which the device accepts; accesses outside this range will
                    161:    have device and bus specific behaviour (ignored, or machine check)
                    162:  - .valid.aligned specifies that the device only accepts naturally aligned
                    163:    accesses.  Unaligned accesses invoke device and bus specific behaviour.
                    164:  - .impl.min_access_size, .impl.max_access_size define the access sizes
                    165:    (in bytes) supported by the *implementation*; other access sizes will be
                    166:    emulated using the ones available.  For example a 4-byte write will be
                    167:    emulated using four 1-byte writes, if .impl.max_access_size = 1.
                    168:  - .impl.unaligned specifies that the *implementation* supports unaligned
                    169:    accesses; if false, unaligned accesses are emulated by two aligned accesses.
                    170:  - .old_portio and .old_mmio can be used to ease porting from code using
                    171:    cpu_register_io_memory() and register_ioport().  They should not be used
                    172:    in new code.
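
As a rough illustration of how the .valid and .impl size constraints might
interact, the sketch below rejects accesses that violate the valid range or
alignment and splits the remaining ones into pieces no larger than the
implementation's maximum.  It is not the memory core's code: DemoOps,
demo_access_ok(), demo_write() and the DemoDev device are hypothetical, the
callback signature is only modelled on a ->write() callback, and the
little-endian ordering of the pieces is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

typedef void (*DemoWriteFn)(void *opaque, uint64_t addr, uint64_t data,
                            unsigned size);

typedef struct {
    unsigned valid_min;       /* stands in for .valid.min_access_size */
    unsigned valid_max;       /* stands in for .valid.max_access_size */
    bool valid_aligned_only;  /* stands in for .valid.aligned */
    unsigned impl_max;        /* stands in for .impl.max_access_size */
    DemoWriteFn write;
} DemoOps;

/* Does the device accept an access of this size at this address? */
bool demo_access_ok(const DemoOps *ops, uint64_t addr, unsigned size)
{
    if (size == 0 || size < ops->valid_min || size > ops->valid_max) {
        return false;
    }
    if (ops->valid_aligned_only && (addr % size) != 0) {
        return false;         /* not naturally aligned */
    }
    return true;
}

/* Returns the number of callback invocations, or 0 if rejected. */
unsigned demo_write(const DemoOps *ops, void *opaque, uint64_t addr,
                    uint64_t data, unsigned size)
{
    if (!demo_access_ok(ops, addr, size)) {
        return 0;
    }
    unsigned chunk = size < ops->impl_max ? size : ops->impl_max;
    uint64_t mask = chunk >= 8 ? UINT64_MAX
                               : (UINT64_C(1) << (8 * chunk)) - 1;
    unsigned calls = 0;

    for (unsigned done = 0; done < size; done += chunk) {
        /* Assumption: the lowest-addressed piece carries the low bits. */
        ops->write(opaque, addr + done, (data >> (8 * done)) & mask, chunk);
        calls++;
    }
    return calls;
}

/* Hypothetical device: eight byte-wide registers and a write counter. */
typedef struct {
    uint8_t bytes[8];
    unsigned writes;
} DemoDev;

void demo_dev_write(void *opaque, uint64_t addr, uint64_t data, unsigned size)
{
    DemoDev *dev = opaque;
    for (unsigned i = 0; i < size && addr + i < sizeof(dev->bytes); i++) {
        dev->bytes[addr + i] = (uint8_t)(data >> (8 * i));
    }
    dev->writes++;
}
```

With valid sizes 1 to 4, aligned-only accesses and impl_max set to 1, a
4-byte write of 0x11223344 at address 0 reaches demo_dev_write() as four
1-byte writes, whereas an 8-byte write or a 4-byte write at address 2 is
rejected before any callback runs.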
