Annotation of qemu/docs/specs/qcow2.txt, revision 1.1.1.1

1.1       root        1: == General ==
                      2: 
                      3: A qcow2 image file is organized in units of constant size, which are called
                      4: (host) clusters. A cluster is the unit in which all allocations are done,
                      5: both for actual guest data and for image metadata.
                      6: 
                      7: Likewise, the virtual disk as seen by the guest is divided into (guest)
                      8: clusters of the same size.
                      9: 
                     10: All numbers in qcow2 are stored in Big Endian byte order.
                     11: 
                     12: 
                     13: == Header ==
                     14: 
                     15: The first cluster of a qcow2 image contains the file header:
                     16: 
                     17:     Byte  0 -  3:   magic
                     18:                     QCOW magic string ("QFI\xfb")
                     19: 
                     20:           4 -  7:   version
                     21:                     Version number (only valid value is 2)
                     22: 
                     23:           8 - 15:   backing_file_offset
                     24:                     Offset into the image file at which the backing file name
                     25:                     is stored (NB: The string is not null terminated). 0 if the
                     26:                     image doesn't have a backing file.
                     27: 
                     28:          16 - 19:   backing_file_size
                     29:                     Length of the backing file name in bytes. Must not be
                     30:                     longer than 1023 bytes. Undefined if the image doesn't have
                     31:                     a backing file.
                     32: 
                     33:          20 - 23:   cluster_bits
                     34:                     Number of bits that are used for addressing an offset
                     35:                     within a cluster (1 << cluster_bits is the cluster size).
                     36:                     Must not be less than 9 (i.e. 512 byte clusters).
                     37: 
                     38:                     Note: qemu as of today has an implementation limit of 2 MB
                     39:                     as the maximum cluster size and won't be able to open images
                     40:                     with larger cluster sizes.
                     41: 
                     42:          24 - 31:   size
                     43:                     Virtual disk size in bytes
                     44: 
                     45:          32 - 35:   crypt_method
                     46:                     0 for no encryption
                     47:                     1 for AES encryption
                     48: 
                     49:          36 - 39:   l1_size
                     50:                     Number of entries in the active L1 table
                     51: 
                     52:          40 - 47:   l1_table_offset
                     53:                     Offset into the image file at which the active L1 table
                     54:                     starts. Must be aligned to a cluster boundary.
                     55: 
                     56:          48 - 55:   refcount_table_offset
                     57:                     Offset into the image file at which the refcount table
                     58:                     starts. Must be aligned to a cluster boundary.
                     59: 
                     60:          56 - 59:   refcount_table_clusters
                     61:                     Number of clusters that the refcount table occupies
                     62: 
                     63:          60 - 63:   nb_snapshots
                     64:                     Number of snapshots contained in the image
                     65: 
                     66:          64 - 71:   snapshots_offset
                     67:                     Offset into the image file at which the snapshot table
                     68:                     starts. Must be aligned to a cluster boundary.
                     69: 
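The header layout above maps directly onto a fixed-size big-endian record. The following is a minimal sketch of a parser, assuming the whole header is available as a bytes object; the names HEADER_FORMAT, HEADER_FIELDS and parse_header are illustrative and not part of the format.

```python
import struct

# One format character per header field, in table order; ">" selects
# big-endian with no padding, so the record is exactly 72 bytes.
HEADER_FORMAT = ">4sIQIIQIIQQIIQ"
HEADER_FIELDS = (
    "magic", "version", "backing_file_offset", "backing_file_size",
    "cluster_bits", "size", "crypt_method", "l1_size", "l1_table_offset",
    "refcount_table_offset", "refcount_table_clusters", "nb_snapshots",
    "snapshots_offset",
)


def parse_header(data):
    """Decode the 72-byte header and apply the checks stated in the table."""
    if len(data) < struct.calcsize(HEADER_FORMAT):
        raise ValueError("header truncated")
    header = dict(zip(HEADER_FIELDS, struct.unpack_from(HEADER_FORMAT, data)))
    if header["magic"] != b"QFI\xfb":
        raise ValueError("not a qcow2 image")
    if header["version"] != 2:
        raise ValueError("unsupported version")
    if header["cluster_bits"] < 9:
        raise ValueError("cluster_bits must be at least 9")
    # Derived value, not stored in the file.
    header["cluster_size"] = 1 << header["cluster_bits"]
    return header
```

Calling parse_header on the first 72 bytes of an image file returns a dictionary keyed by the field names used in the table.
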
                     70: Directly after the image header, optional sections called header extensions can
                     71: be stored. Each extension has a structure like the following:
                     72: 
                     73:     Byte  0 -  3:   Header extension type:
                     74:                         0x00000000 - End of the header extension area
                     75:                         0xE2792ACA - Backing file format name
                     76:                         other      - Unknown header extension, can be safely
                     77:                                      ignored
                     78: 
                     79:           4 -  7:   Length of the header extension data
                     80: 
                     81:           8 -  n:   Header extension data
                     82: 
                     83:           n -  m:   Padding to round up the header extension size to the next
                     84:                     multiple of 8.
                     85: 
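Walking the header extension area can be sketched as follows. This assumes the extensions start at byte 72, directly after the header fields listed above, and that the first cluster has been read into a bytes object; parse_header_extensions is a hypothetical helper name.

```python
import struct


def parse_header_extensions(data, start=72):
    """Return a list of (type, payload) pairs up to the end marker."""
    extensions = []
    offset = start
    while offset + 8 <= len(data):
        ext_type, length = struct.unpack_from(">II", data, offset)
        if ext_type == 0:
            break  # 0x00000000 ends the header extension area
        payload = data[offset + 8:offset + 8 + length]
        if len(payload) != length:
            raise ValueError("header extension truncated")
        extensions.append((ext_type, payload))
        # Skip the data plus the padding up to the next multiple of 8.
        offset += 8 + ((length + 7) & ~7)
    return extensions
```

A reader would then pick out the types it knows (for example 0xE2792ACA for the backing file format name) and ignore the rest.
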
                     86: The remaining space between the end of the header extension area and the end of
                     87: the first cluster can be used for other data. Usually, the backing file name is
                     88: stored there.
                     89: 
                     90: 
                     91: == Host cluster management ==
                     92: 
                     93: qcow2 manages the allocation of host clusters by maintaining a reference count
                     94: for each host cluster. A refcount of 0 means that the cluster is free, 1 means
                     95: that it is used, and >= 2 means that it is used and any write access must
                     96: perform a COW (copy on write) operation.
                     97: 
                     98: The refcounts are managed in a two-level table. The first level is called
                     99: refcount table and has a variable size (which is stored in the header). The
                    100: refcount table can cover multiple clusters, however it needs to be contiguous
                    101: in the image file.
                    102: 
                    103: It contains pointers to the second level structures which are called refcount
                    104: blocks and are exactly one cluster in size.
                    105: 
                    106: Given an offset into the image file, the refcount of its cluster can be obtained
                    107: as follows:
                    108: 
                    109:     refcount_block_entries = (cluster_size / sizeof(uint16_t))
                    110: 
                    111:     refcount_block_index = (offset / cluster_size) % refcount_block_entries
                    112:     refcount_table_index = (offset / cluster_size) / refcount_block_entries
                    113: 
                    114:     refcount_block = load_cluster(refcount_table[refcount_table_index]);
                    115:     return refcount_block[refcount_block_index];
                    116: 
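The lookup can be sketched in runnable form as below. It assumes the whole image is held in memory as a bytes-like object, splits the cluster index by the number of 16-bit entries per refcount block, and masks off the reserved low bits of the refcount table entry. The function name and parameters are illustrative, and no bounds check against the refcount table size is made.

```python
import struct


def get_refcount(image, offset, cluster_bits, refcount_table_offset):
    """Return the refcount of the host cluster containing `offset`."""
    cluster_size = 1 << cluster_bits
    refcount_block_entries = cluster_size // 2  # 16-bit entries per block

    cluster_index = offset // cluster_size
    refcount_block_index = cluster_index % refcount_block_entries
    refcount_table_index = cluster_index // refcount_block_entries

    # Refcount table entries are 64-bit big-endian values.
    (table_entry,) = struct.unpack_from(
        ">Q", image, refcount_table_offset + 8 * refcount_table_index)
    block_offset = table_entry & ~0x1FF  # bits 0-8 are reserved
    if block_offset == 0:
        return 0  # refcount block not allocated: all its refcounts are 0

    (refcount,) = struct.unpack_from(
        ">H", image, block_offset + 2 * refcount_block_index)
    return refcount
```
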
                    117: Refcount table entry:
                    118: 
                    119:     Bit  0 -  8:    Reserved (set to 0)
                    120: 
                    121:          9 - 63:    Bits 9-63 of the offset into the image file at which the
                    122:                     refcount block starts. Must be aligned to a cluster
                    123:                     boundary.
                    124: 
                    125:                     If this is 0, the corresponding refcount block has not yet
                    126:                     been allocated. All refcounts managed by this refcount block
                    127:                     are 0.
                    128: 
                    129: Refcount block entry:
                    130: 
                    131:     Bit  0 - 15:    Reference count of the cluster
                    132: 
                    133: 
                    134: == Cluster mapping ==
                    135: 
                    136: Just as for refcounts, qcow2 uses a two-level structure for the mapping of
                    137: guest clusters to host clusters. They are called L1 and L2 tables.
                    138: 
                    139: The L1 table has a variable size (stored in the header) and may use multiple
                    140: clusters, however it must be contiguous in the image file. L2 tables are
                    141: exactly one cluster in size.
                    142: 
                    143: Given an offset into the virtual disk, the offset into the image file can be
                    144: obtained as follows:
                    145: 
                    146:     l2_entries = (cluster_size / sizeof(uint64_t))
                    147: 
                    148:     l2_index = (offset / cluster_size) % l2_entries
                    149:     l1_index = (offset / cluster_size) / l2_entries
                    150: 
                    151:     l2_table = load_cluster(l1_table[l1_index]);
                    152:     cluster_offset = l2_table[l2_index];
                    153: 
                    154:     return cluster_offset + (offset % cluster_size)
                    155: 
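A runnable sketch of the same translation follows, assuming the image is held in memory as a bytes-like object. Unlike the pseudocode, it masks the flag and reserved bits out of the L1 and L2 entries, returns None for unallocated clusters, and does not handle compressed clusters. The names guest_to_host and OFFSET_MASK are illustrative.

```python
import struct

OFFSET_MASK = 0x00FFFFFFFFFFFE00  # bits 9-55 of an L1 or L2 entry


def guest_to_host(image, guest_offset, cluster_bits, l1_table_offset, l1_size):
    """Map a guest offset to an image file offset, or None if unallocated."""
    cluster_size = 1 << cluster_bits
    l2_entries = cluster_size // 8  # 64-bit entries per L2 table

    cluster_index = guest_offset // cluster_size
    l2_index = cluster_index % l2_entries
    l1_index = cluster_index // l2_entries
    if l1_index >= l1_size:
        return None  # beyond the range covered by the L1 table

    (l1_entry,) = struct.unpack_from(">Q", image, l1_table_offset + 8 * l1_index)
    l2_offset = l1_entry & OFFSET_MASK
    if l2_offset == 0:
        return None  # no L2 table: every cluster it would describe is unallocated

    (l2_entry,) = struct.unpack_from(">Q", image, l2_offset + 8 * l2_index)
    if l2_entry & (1 << 62):
        raise NotImplementedError("compressed clusters are not handled here")
    cluster_offset = l2_entry & OFFSET_MASK
    if cluster_offset == 0:
        return None
    return cluster_offset + guest_offset % cluster_size
```
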
                    156: L1 table entry:
                    157: 
                    158:     Bit  0 -  8:    Reserved (set to 0)
                    159: 
                    160:          9 - 55:    Bits 9-55 of the offset into the image file at which the L2
                    161:                     table starts. Must be aligned to a cluster boundary. If the
                    162:                     offset is 0, the L2 table and all clusters described by this
                    163:                     L2 table are unallocated.
                    164: 
                    165:         56 - 62:    Reserved (set to 0)
                    166: 
                    167:              63:    0 for an L2 table that is unused or requires COW, 1 if its
                    168:                     refcount is exactly one. This information is only accurate
                    169:                     in the active L1 table.
                    170: 
                    171: L2 table entry (for normal clusters):
                    172: 
                    173:     Bit  0 -  8:    Reserved (set to 0)
                    174: 
                    175:          9 - 55:    Bits 9-55 of host cluster offset. Must be aligned to a
                    176:                     cluster boundary. If the offset is 0, the cluster is
                    177:                     unallocated.
                    178: 
                    179:         56 - 61:    Reserved (set to 0)
                    180: 
                    181:              62:    0 (this cluster is not compressed)
                    182: 
                    183:              63:    0 for a cluster that is unused or requires COW, 1 if its
                    184:                     refcount is exactly one. This information is only accurate
                    185:                     in L2 tables that are reachable from the active L1
                    186:                     table.
                    187: 
                    188: L2 table entry (for compressed clusters; x = 62 - (cluster_bits - 8)):
                    189: 
                    190:     Bit  0 - x-1:   Host cluster offset. This is usually _not_ aligned to a
                    191:                     cluster boundary!
                    192: 
                    193:          x - 61:    Compressed size of the cluster in sectors of 512 bytes
                    194: 
                    195:              62:    1 (this cluster is compressed using zlib)
                    196: 
                    197:              63:    0 for a cluster that is unused or requires COW, 1 if its
                    198:                     refcount is exactly one. This information is only accurate
                    199:                     in L2 tables that are reachable from the active L1
                    200:                     table.
                    201: 
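Splitting an L2 entry into its fields can be sketched as follows. The sketch assumes x = 62 - (cluster_bits - 8), and that for compressed clusters the host offset occupies the low x bits while the size field occupies the remaining bits up to bit 61, so that the two fields tile bits 0-61 without overlap. The function name and dictionary keys are illustrative, and the size field is returned as the raw stored value.

```python
def decode_l2_entry(entry, cluster_bits):
    """Split a 64-bit L2 table entry into its fields."""
    copied = bool((entry >> 63) & 1)      # bit 63: refcount is exactly one
    compressed = bool((entry >> 62) & 1)  # bit 62: compressed cluster

    if not compressed:
        # Normal cluster: bits 9-55 hold the cluster-aligned host offset.
        host_offset = entry & 0x00FFFFFFFFFFFE00
        return {
            "compressed": False,
            "copied": copied,
            "host_offset": host_offset,
            "allocated": host_offset != 0,
        }

    # Compressed cluster: assumed split at bit x (see the lead-in).
    x = 62 - (cluster_bits - 8)
    return {
        "compressed": True,
        "copied": copied,
        "host_offset": entry & ((1 << x) - 1),
        "size_field": (entry >> x) & ((1 << (62 - x)) - 1),
    }
```

With 64 KB clusters (cluster_bits = 16), x is 54, leaving 8 bits for the size field.
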
                    202: If a cluster is unallocated, read requests shall read the data from the backing
                    203: file. If there is no backing file or the backing file is smaller than the image,
                    204: they shall read zeros for all parts that are not covered by the backing file.
                    205: 
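The read rule for unallocated clusters can be sketched as below, assuming the backing file contents are available as a bytes object (or None when there is no backing file). read_unallocated is a hypothetical helper, not an existing API.

```python
def read_unallocated(backing, offset, length):
    """Return `length` bytes for an unallocated guest range at `offset`."""
    if backing is None:
        return bytes(length)  # no backing file: the range reads as zeros
    # Slicing past the end of the backing data yields fewer bytes,
    # and the remainder is filled with zeros.
    data = backing[offset:offset + length]
    return data + bytes(length - len(data))
```
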
                    206: 
                    207: == Snapshots ==
                    208: 
                    209: qcow2 supports internal snapshots. Their basic principle of operation is to
                    210: switch the active L1 table, so that a different set of host clusters is
                    211: exposed to the guest.
                    212: 
                    213: When creating a snapshot, the L1 table should be copied and the refcount of all
                    214: L2 tables and clusters reachable from this L1 table must be increased, so that
                    215: a write causes a COW and isn't visible in other snapshots.
                    216: 
                    217: When loading a snapshot, bit 63 of all entries in the new active L1 table and
                    218: all L2 tables referenced by it must be reconstructed from the refcount table,
                    219: because that bit is not required to be accurate in inactive L1 tables.
                    220: 
                    221: A directory of all snapshots is stored in the snapshot table, a contiguous area
                    222: in the image file, whose starting offset and length are given by the header
                    223: fields snapshots_offset and nb_snapshots. The entries of the snapshot table
                    224: have variable length, depending on the length of ID, name and extra data.
                    225: 
                    226: Snapshot table entry:
                    227: 
                    228:     Byte 0 -  7:    Offset into the image file at which the L1 table for the
                    229:                     snapshot starts. Must be aligned to a cluster boundary.
                    230: 
                    231:          8 - 11:    Number of entries in the L1 table of the snapshot
                    232: 
                    233:         12 - 13:    Length of the unique ID string describing the snapshot
                    234: 
                    235:         14 - 15:    Length of the name of the snapshot
                    236: 
                    237:         16 - 19:    Time at which the snapshot was taken in seconds since the
                    238:                     Epoch
                    239: 
                    240:         20 - 23:    Subsecond part of the time at which the snapshot was taken
                    241:                     in nanoseconds
                    242: 
                    243:         24 - 31:    Time that the guest was running until the snapshot was
                    244:                     taken in nanoseconds
                    245: 
                    246:         32 - 35:    Size of the VM state in bytes. 0 if no VM state is saved.
                    247:                     If there is VM state, it starts at the first cluster
                    248:                     described by the first L1 table entry that doesn't describe a
                    249:                     regular guest cluster (i.e. VM state is stored like guest
                    250:                     disk content, except that it is stored at offsets that are
                    251:                     larger than the virtual disk presented to the guest)
                    252: 
                    253:         36 - 39:    Size of extra data in the table entry (used for future
                    254:                     extensions of the format)
                    255: 
                    256:         variable:   Extra data for future extensions. Must be ignored.
                    257: 
                    258:         variable:   Unique ID string for the snapshot (not null terminated)
                    259: 
                    260:         variable:   Name of the snapshot (not null terminated)
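
Decoding a single snapshot table entry can be sketched as follows. The fixed part is 40 bytes, followed by the extra data, the ID string and the name in that order. The sketch assumes the snapshot table has been read into a bytes object and makes no assumption about padding between consecutive entries, which this text does not describe; SNAPSHOT_FIXED and parse_snapshot_entry are illustrative names.

```python
import struct

# Fixed-size part of an entry (bytes 0-39), big-endian.
SNAPSHOT_FIXED = ">QIHHIIQII"


def parse_snapshot_entry(data, offset=0):
    """Return (entry, end), where `end` is the offset just past the name."""
    (l1_table_offset, l1_size, id_len, name_len, date_sec, date_nsec,
     vm_clock_nsec, vm_state_size, extra_size) = struct.unpack_from(
        SNAPSHOT_FIXED, data, offset)

    # Extra data comes first and must be ignored.
    pos = offset + struct.calcsize(SNAPSHOT_FIXED) + extra_size
    snapshot_id = data[pos:pos + id_len]  # not null terminated
    pos += id_len
    name = data[pos:pos + name_len]       # not null terminated
    pos += name_len

    entry = {
        "l1_table_offset": l1_table_offset,
        "l1_size": l1_size,
        "id": snapshot_id,
        "name": name,
        "date_sec": date_sec,
        "date_nsec": date_nsec,
        "vm_clock_nsec": vm_clock_nsec,
        "vm_state_size": vm_state_size,
    }
    return entry, pos
```
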
