The memory API
==============

The memory API models the memory and I/O buses and controllers of a QEMU
machine.  It attempts to allow modelling of:

 - ordinary RAM
 - memory-mapped I/O (MMIO)
 - memory controllers that can dynamically reroute physical memory regions
   to different destinations

The memory model provides support for:

 - tracking RAM changes by the guest
 - setting up coalesced memory for kvm
 - setting up ioeventfd regions for kvm

Memory is modelled as a tree (really an acyclic graph) of MemoryRegion
objects.  The root of the tree is memory as seen from the CPU's viewpoint
(the system bus).  Nodes in the tree represent other buses, memory
controllers, and memory regions that have been rerouted.  Leaves are RAM and
MMIO regions.

Types of regions
----------------

There are four types of memory regions (all represented by a single C type
MemoryRegion):

- RAM: a RAM region is simply a range of host memory that can be made available
  to the guest.

- MMIO: a range of guest memory that is implemented by host callbacks;
  each read or write causes a callback to be called on the host.

- container: a container simply includes other memory regions, each at
  a different offset.  Containers are useful for grouping several regions
  into one unit.  For example, a PCI BAR may be composed of a RAM region
  and an MMIO region.

  A container's subregions are usually non-overlapping.  In some cases it is
  useful to have overlapping regions; for example a memory controller that
  can overlay a subregion of RAM with MMIO or ROM, or a PCI controller
  that does not prevent cards from claiming overlapping BARs.

- alias: a subsection of another region.  Aliases allow a region to be
  split apart into discontiguous regions.
  Examples of uses are memory banks
  used when the guest address space is smaller than the amount of RAM
  addressed, or a memory controller that splits main memory to expose a "PCI
  hole".  Aliases may point to any type of region, including other aliases,
  but an alias may not point back to itself, directly or indirectly.


Region names
------------

Regions are assigned names by the constructor.  For most regions these are
only used for debugging purposes, but RAM regions also use the name to identify
live migration sections.  This means that RAM region names need to have ABI
stability.

Region lifecycle
----------------

A region is created by one of the constructor functions (memory_region_init*())
and destroyed by the destructor (memory_region_destroy()).  In between,
a region can be added to an address space by using memory_region_add_subregion()
and removed using memory_region_del_subregion().  Region attributes may be
changed at any point; they take effect once the region becomes exposed to the
guest.

Overlapping regions and priority
--------------------------------
Usually, regions may not overlap each other; a memory address decodes into
exactly one target.  In some cases it is useful to allow regions to overlap,
and sometimes to control which of the overlapping regions is visible to the
guest.  This is done with memory_region_add_subregion_overlap(), which
allows the region to overlap any other region in the same container, and
specifies a priority that allows the core to decide which of two regions at
the same address is visible (highest priority wins).

Visibility
----------
The memory core uses the following rules to select a memory region when the
guest accesses an address:

- all direct subregions of the root region are matched against the address, in
  descending priority order
  - if the address lies outside the region offset/size, the subregion is
    discarded
  - if the subregion is a leaf (RAM or MMIO), the search terminates
  - if the subregion is a container, the same algorithm is used within the
    subregion (after the address is adjusted by the subregion offset)
  - if the subregion is an alias, the search continues at the alias target
    (after the address is adjusted by the subregion offset and alias offset)

Example memory map
------------------

system_memory: container@0-2^48-1
 |
 +---- lomem: alias@0-0xdfffffff ---> #ram (0-0xdfffffff)
 |
 +---- himem: alias@0x100000000-0x11fffffff ---> #ram (0xe0000000-0xffffffff)
 |
 +---- vga-window: alias@0xa0000-0xbffff ---> #pci (0xa0000-0xbffff)
 |      (prio 1)
 |
 +---- pci-hole: alias@0xe0000000-0xffffffff ---> #pci (0xe0000000-0xffffffff)

pci (0-2^32-1)
 |
 +--- vga-area: container@0xa0000-0xbffff
 |      |
 |      +--- alias@0x00000-0x7fff  ---> #vram (0x010000-0x017fff)
 |      |
 |      +--- alias@0x08000-0xffff  ---> #vram (0x020000-0x027fff)
 |
 +---- vram: ram@0xe1000000-0xe1ffffff
 |
 +---- vga-mmio: mmio@0xe2000000-0xe200ffff

ram: ram@0x00000000-0xffffffff

This is a (simplified) PC memory map.  The 4GB RAM block is mapped into the
system address space via two aliases: "lomem" is a 1:1 mapping of the first
3.5GB; "himem" maps the last 0.5GB at address 4GB.  This leaves 0.5GB for the
so-called PCI hole, which allows a 32-bit PCI bus to exist in a system with
4GB of memory.

The memory controller diverts addresses in the range 640K-768K to the PCI
address space.  This is modelled using the "vga-window" alias, mapped at a
higher priority so it obscures the RAM at the same addresses.  The vga window
can be removed by programming the memory controller; this is modelled by
removing the alias and exposing the RAM underneath.

The pci address space is not a direct child of the system address space, since
we only want parts of it to be visible (we accomplish this using aliases).
It has two subregions: vga-area models the legacy vga window and is occupied
by two 32K memory banks pointing at two sections of the framebuffer.
In addition the vram is mapped as a BAR at address 0xe1000000, and an additional
BAR containing MMIO registers is mapped after it.

Note that if the guest maps a BAR outside the PCI hole, it would not be
visible as the pci-hole alias clips it to a 0.5GB range.

Attributes
----------

Various region attributes (read-only, dirty logging, coalesced mmio, ioeventfd)
can be changed during the region lifecycle.  They take effect once the region
is made visible (which can be immediately, later, or never).

MMIO Operations
---------------

MMIO regions are provided with ->read() and ->write() callbacks; in addition
various constraints can be supplied to control how these callbacks are called:

 - .valid.min_access_size, .valid.max_access_size define the access sizes
   (in bytes) which the device accepts; accesses outside this range will
   have device and bus specific behaviour (ignored, or machine check)
 - .valid.aligned specifies that the device only accepts naturally aligned
   accesses.  Unaligned accesses invoke device and bus specific behaviour.
 - .impl.min_access_size, .impl.max_access_size define the access sizes
   (in bytes) supported by the *implementation*; other access sizes will be
   emulated using the ones available.
   For example a 4-byte write will be
   emulated using four 1-byte writes, if .impl.max_access_size = 1.
 - .impl.unaligned specifies that the *implementation* supports unaligned
   accesses; if false, unaligned accesses will be emulated by two aligned
   accesses.
 - .old_portio and .old_mmio can be used to ease porting from code using
   cpu_register_io_memory() and register_ioport().  They should not be used
   in new code.
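
The access-size emulation described for .impl.max_access_size can be
illustrated with a small standalone model.  This is only a sketch, not QEMU
code: the toy_* names and types are hypothetical stand-ins invented for this
example, and it assumes little-endian byte order and an access size that is a
multiple of the implementation's maximum.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration; these are NOT QEMU's types. */
typedef void (*toy_write_fn)(void *opaque, uint64_t addr, uint64_t data,
                             unsigned size);

struct toy_ops {
    toy_write_fn write;
    unsigned impl_max_access_size;   /* models .impl.max_access_size */
};

/* A toy device with eight byte-wide registers. */
struct toy_dev {
    uint8_t regs[8];
    unsigned calls;                  /* number of callback invocations */
};

/* Device callback that only understands 1-byte writes. */
static void toy_dev_write(void *opaque, uint64_t addr, uint64_t data,
                          unsigned size)
{
    struct toy_dev *d = opaque;

    assert(size == 1);
    if (addr < sizeof(d->regs)) {
        d->regs[addr] = (uint8_t)data;
    }
    d->calls++;
}

/*
 * Split a guest write of `size` bytes into pieces no larger than the
 * implementation supports.  Assumption: little-endian, so the least
 * significant piece goes to the lowest address.
 */
static void toy_dispatch_write(const struct toy_ops *ops, void *opaque,
                               uint64_t addr, uint64_t data, unsigned size)
{
    unsigned chunk;
    unsigned off;

    assert(ops->impl_max_access_size > 0);
    chunk = size < ops->impl_max_access_size ? size
                                             : ops->impl_max_access_size;

    for (off = 0; off < size; off += chunk) {
        uint64_t mask = chunk >= 8 ? UINT64_MAX
                                   : (UINT64_C(1) << (chunk * 8)) - 1;

        ops->write(opaque, addr + off, (data >> (off * 8)) & mask, chunk);
    }
}
```

With impl_max_access_size set to 1, dispatching a 4-byte write of 0x11223344
at offset 0 invokes toy_dev_write() four times and stores 0x44, 0x33, 0x22
and 0x11 in regs[0] through regs[3].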