Annotation of mstools/mfc/doc/tn002.txt, revision 1.1.1.1

1.1       root        1: Microsoft Foundation Classes                            Microsoft Corporation
                      2: Technical Notes            
                      3: 
                      4: #2 : Persistent Data Format
                      5: 
                      6: This note describes the MFC routines that support persistent
                      7: C++ objects and the format of those objects in a persistent store.
                      8: 
                      9: =============================================================================
                     10: 
                     11: The Problem
                     12: ===========
                     13: 
                     14: The MFC implementation for persistent data relies on a 
                     15: compact binary format for saving data to disk.  This format
                     16: is distinct from the format used for diagnostic output
                     17: of class objects for two reasons: (1) diagnostic output is
                     18: human readable, and (2) maximum space efficiency is desired
                     19: when saving to a persistent store (usually a disk).  For
                     20: these reasons, MFC does not provide a polymorphic
                     21: interface for storing objects, as is common in other "pure"
                     22: object-oriented languages, such as Smalltalk-80.
                     23: 
                     24: MFC solves this problem by using the class CArchive.  A
                     25: CArchive class object provides a context for persistence that
                     26: lasts from the time the archive is created until the CArchive::Close
                     27: member function is called, either explicitly by the programmer, or
                     28: implicitly by the destructor when the scope containing the CArchive
                     29: is exited.
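
The lifetime rule above (explicit Close, or implicit close when the
scope is exited) can be modeled in a few lines of portable C++.  This is
only a sketch: ScopedArchive, its members, and the vector used as a sink
are hypothetical names chosen for illustration, not the MFC CArchive
class or its interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an archive: it buffers 16-bit words and
// hands them to a sink when the context is closed.  Not MFC's CArchive.
class ScopedArchive {
public:
    explicit ScopedArchive(std::vector<std::uint16_t>& sink) : m_sink(sink) {}

    // Implicit close: runs when the enclosing scope is exited.
    ~ScopedArchive() { Close(); }

    ScopedArchive(const ScopedArchive&) = delete;
    ScopedArchive& operator=(const ScopedArchive&) = delete;

    void Write(std::uint16_t w) {
        assert(!m_closed);      // the context has ended once Close has run
        m_buffer.push_back(w);
    }

    // Explicit close: move buffered words to the sink and end the context.
    void Close() {
        if (m_closed) return;   // a second Close is a no-op in this sketch
        m_sink.insert(m_sink.end(), m_buffer.begin(), m_buffer.end());
        m_buffer.clear();
        m_closed = true;
    }

    bool IsClosed() const { return m_closed; }

private:
    std::vector<std::uint16_t>& m_sink;
    std::vector<std::uint16_t> m_buffer;
    bool m_closed = false;
};

// Writes two words and relies on the destructor to close the archive.
inline std::size_t writeTwoWordsImplicitClose(std::vector<std::uint16_t>& sink) {
    {
        ScopedArchive ar(sink);
        ar.Write(1);
        ar.Write(2);
        // Nothing has reached the sink yet; that happens at scope exit.
    }
    return sink.size();
}
```

In this model the buffered words reach the sink only when the context
ends, which is why the choice between an explicit Close and scope exit
matters to any code that inspects the sink in between.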
                     30: 
                     31: This note describes the implementation of the CArchive protected 
                     32: members WriteObject and ReadObject.  ReadObject and WriteObject
                     33: are never called directly by end users; these member functions
                     34: are used to implement persistent objects.  Remember that end users
                     35: should use the class-specific type safe insertion and extraction
                     36: operators, enabled by including the DECLARE_SERIAL and IMPLEMENT_SERIAL
                     37: macros in a CObject-derived class.  Similarly, the end user rarely
                     38: calls the virtual member function CObject::Serialize directly, unless the
                     39: object being stored is embedded in another class object, in which case
                     40: the exact type of the object is known.
                     41: 
                     42: 
                     43: NOTE: This note describes code located in the MFC
                     44: source file ARCHIVE.CPP.
                     45: 
                     46: =============================================================================
                     47: Saving objects to the store (CArchive::WriteObject)
                     48: ===================================================
                     49: 
                     50: The protected member function void CArchive::WriteObject(const CObject*)
                     51: is responsible for writing out enough data so that the object
                     52: can be correctly reconstructed.  This data consists of two parts:
                     53: the type of the object and the state of the object.  This member
                     54: function is also responsible for maintaining the identity of the
                     55: object being written out, so that only a single copy is 
                     56: saved, regardless of the number of pointers to that object 
                     57: (including circular pointers).
                     58: 
                     59: The saving (inserting) and restoring (extracting) of objects 
                     60: relies on several manifest constants.  These are values that 
                     61: are stored in binary and provide important information to the 
                     62: archive (note that the "w" prefix indicates 16-bit quantities).
                     63: 
                     64:     wNullTag        // used for NULL object pointers (0)
                     65:     wNewClassTag    // indicates class description that follows is new
                     66:                     // to this archive context (-1)
                     67:     wOldClassTag    // indicates class of the object being read
                     68:                     // has been seen in this context (0x8000)
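
As an illustration of how these three values partition the 16-bit tag
space, the sketch below classifies a word read from the stream.  The
constant values are the ones listed above; TagKind, classifyTag, and
classIndexOf are hypothetical helpers introduced here and are not part
of MFC.

```cpp
#include <cassert>
#include <cstdint>

// Values as listed in the note.
const std::uint16_t wNullTag     = 0x0000;  // NULL object pointer
const std::uint16_t wNewClassTag = 0xFFFF;  // -1 as a 16-bit quantity
const std::uint16_t wOldClassTag = 0x8000;  // high bit marks a class reference

// Hypothetical classification of a 16-bit word read from the stream.
enum class TagKind { Null, NewClass, OldClass, ObjectPid };

inline TagKind classifyTag(std::uint16_t tag) {
    if (tag == wNullTag)     return TagKind::Null;
    if (tag == wNewClassTag) return TagKind::NewClass;  // test before the high-bit check
    if (tag & wOldClassTag)  return TagKind::OldClass;
    return TagKind::ObjectPid;                          // plain PID of a stored object
}

// Class index carried by an old-class tag (the low 15 bits).
inline std::uint16_t classIndexOf(std::uint16_t tag) {
    assert(classifyTag(tag) == TagKind::OldClass);
    return static_cast<std::uint16_t>(tag & 0x7FFF);
}
```

Note that an index of 0x7FFF combined with wOldClassTag would yield
0xFFFF, which is indistinguishable from wNewClassTag.  This is one way
to see why the largest usable index is 0x7FFE (32766).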
                     69: 
                     70: When storing objects, the archive maintains a CMapPtrToWord
                     71: (the m_pStoreMap) which is a mapping from a stored object to a
                     72: 16-bit persistent identifier (PID).  A PID is assigned to every
                     73: unique object and every unique class name that is saved in
                     74: the context of the archive.  These PIDs are handed out sequentially
                     75: starting at 1.  Note that these PIDs have
                     76: no significance outside the scope of the archive and, in
                     77: particular, are not to be confused with record numbers or
                     78: other identity concepts.
                     79: 
                     80: When a request is made to save an object to an archive
                     81: (usually through the global insertion operator), a check is made
                     82: for a NULL CObject pointer; if the pointer is NULL the wNullTag is
                     83: inserted into the archive stream.
                     84: 
                     85: If we have a real object pointer that is capable of being
                     86: serialized (the class is a DECLARE_SERIAL class), we then check
                     87: the m_pStoreMap to see if the object has been saved already, and if
                     88: that is the case we insert the 16-bit PID associated with that
                     89: object.  
                     90: 
                     91: If the object has not been saved before, there are two possibilities
                     92: we must take into account: either both the object and the
                     93: exact type (i.e. class) of the object are new to this archive context,
                     94: or the object is of an exact type already seen.  To determine
                     95: if the type has been seen we query the m_pStoreMap for a CRuntimeClass
                     96: object (formally, CRuntimeClass is a structure to avoid problems
                     97: associated with meta-classes) that matches the CRuntimeClass 
                     98: object associated with the object we are saving.  If we have seen this
                     99: class before, then WriteObject inserts a 16-bit tag that is the
                    100: bitwise OR of wOldClassTag and the index (PID) of that class.  You will note
                    101: that this operation imposes a hard limit of 32766 indices per
                    102: archive context.  This number represents the maximum number of
                    103: unique objects and classes that can be saved in a single archive,
                    104: but note that a single disk file can have an unlimited number
                    105: of archive contexts.  If the CRuntimeClass is new to this archive
                    106: context, then WriteObject will assign a new PID to that class
                    107: and insert it into the archive, preceded by the wNewClassTag value.
                    108: The descriptor for this class is then inserted into the archive
                    109: using the CRuntimeClass member function Store.  CRuntimeClass::Store
                    110: inserts the schema number of the class (see below) and the
                    111: ASCII text name of the class.  Note that the use of the ASCII
                    112: text name does not guarantee uniqueness of the archive across
                    113: applications, thus it is advisable to tag your data files to
                    114: prevent corruption (imagine distinct applications that both
                    115: define the class CWordStack, for example).  Following the
                    116: insertion of the class information, the archive places the
                    117: object into the m_pStoreMap and then calls the Serialize member
                    118: function to insert class-specific data into the archive.  Placing
                    119: the object into the m_pStoreMap before calling Serialize prevents
                    120: multiple copies of the object from being saved to the store.
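
The decision sequence described above can be sketched as a small
stand-alone model.  It is not the code in ARCHIVE.CPP: StoreContext,
ClassInfo, Node, and kMaxPid are hypothetical names, the output is a
vector of 16-bit words rather than a file, and the class descriptor
layout (schema, name length, one word per character) is an assumption
made to keep the example short.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

const std::uint16_t wNullTag = 0x0000;
const std::uint16_t wNewClassTag = 0xFFFF;
const std::uint16_t wOldClassTag = 0x8000;
const std::uint16_t kMaxPid = 0x7FFE;      // 32766, the limit given in the note

struct ClassInfo { std::string name; std::uint16_t schema; };  // stand-in for CRuntimeClass

struct Node {                  // stand-in for a serializable object
    const ClassInfo* cls;
    std::uint16_t value;
    const Node* next;          // may point back to an earlier node
};

class StoreContext {
public:
    std::vector<std::uint16_t> out;    // the "archive stream" of this sketch

    void WriteObject(const Node* obj) {
        if (obj == nullptr) { out.push_back(wNullTag); return; }

        auto seen = m_storeMap.find(obj);
        if (seen != m_storeMap.end()) {        // already saved: emit its PID only
            out.push_back(seen->second);
            return;
        }

        auto cls = m_storeMap.find(obj->cls);
        if (cls != m_storeMap.end()) {         // class seen before in this context
            out.push_back(static_cast<std::uint16_t>(wOldClassTag | cls->second));
        } else {                               // new class: tag, then descriptor
            out.push_back(wNewClassTag);
            out.push_back(obj->cls->schema);
            out.push_back(static_cast<std::uint16_t>(obj->cls->name.size()));
            for (unsigned char c : obj->cls->name) out.push_back(c);
            m_storeMap[obj->cls] = NextPid();
        }

        m_storeMap[obj] = NextPid();   // register BEFORE writing members
        out.push_back(obj->value);     // plays the role of Serialize
        WriteObject(obj->next);        // a cycle ends at the map lookup above
    }

private:
    std::uint16_t NextPid() {
        if (m_next > kMaxPid) throw std::length_error("archive context is full");
        return m_next++;
    }

    std::map<const void*, std::uint16_t> m_storeMap;  // objects and classes share PIDs
    std::uint16_t m_next = 1;                         // 0 is reserved for NULL
};
```

For two nodes of the same class that point at each other, this model
emits one class descriptor, one old-class tag, and a final back
reference (the PID of the first node) instead of recursing forever.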
                    121: 
                    122: When returning to the initial caller (usually the root of the
                    123: network of objects), it is important to Close the archive.
                    124: If other CFile operations are going to be done, the CArchive
                    125: member function Flush MUST be called.  Failure to do so will
                    126: result in a corrupt archive.
                    127: 
                    128: 
                    129: =============================================================================
                    130: Loading objects from the store (CArchive::ReadObject)
                    131: =====================================================
                    132: 
                    133: Loading (extracting) objects uses the protected
                    134: CArchive::ReadObject function, and is the converse of WriteObject.
                    135: As with WriteObject, ReadObject is not called directly by user code;
                    136: user code should call the type-safe extraction operator (enabled by
                    137: DECLARE_SERIAL/IMPLEMENT_SERIAL), which then calls ReadObject.
                    138: This extraction operator will ensure the type integrity of the extract
                    139: operation.
                    140: 
                    141: Since the WriteObject implementation discussed above assigned
                    142: increasing PIDs, starting with 1 (0 is predefined as the NULL object),
                    143: the ReadObject implementation can use an array to maintain
                    144: the state of the archive context.  When a PID is read from
                    145: the store, if the PID is greater than the current upper
                    146: bound of the m_pLoadArray, then ReadObject knows that a
                    147: "new" object (or class description) follows.
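
A stand-alone model of the loading side is sketched below.  It is not
the code in ARCHIVE.CPP: LoadContext, Node, and Slot are hypothetical
names, and the stream layout (schema, name length, one word per
character, then a value and a nested object) is an assumption of the
example.  The sketch also detects new objects by their class tag and
treats an out-of-range PID as a corrupt stream; that choice is an
assumption of the sketch rather than something this note specifies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

const std::uint16_t wNullTag = 0x0000;
const std::uint16_t wNewClassTag = 0xFFFF;
const std::uint16_t wOldClassTag = 0x8000;

struct Node {                          // stand-in for a loaded object
    std::string className;
    std::uint16_t value = 0;
    Node* next = nullptr;
};

class LoadContext {
    // One entry per PID: either a class name or a loaded object.
    struct Slot { std::string className; Node* object; };

public:
    explicit LoadContext(std::vector<std::uint16_t> words) : m_in(std::move(words)) {
        m_loadArray.push_back(Slot{std::string(), nullptr});  // index 0: NULL object
    }

    Node* ReadObject() {
        const std::uint16_t tag = Next();
        if (tag == wNullTag) return nullptr;

        std::string cls;
        if (tag == wNewClassTag) {             // class descriptor follows
            Next();                            // schema number, unused in this sketch
            const std::uint16_t len = Next();
            for (std::uint16_t i = 0; i < len; ++i)
                cls.push_back(static_cast<char>(Next()));
            m_loadArray.push_back(Slot{cls, nullptr});   // class takes the next PID
        } else if (tag & wOldClassTag) {       // class already in the array
            cls = At(tag & 0x7FFF).className;
        } else {                               // object already loaded
            Node* known = At(tag).object;
            if (known == nullptr) throw std::runtime_error("PID names a class");
            return known;
        }

        m_objects.push_back(std::unique_ptr<Node>(new Node));
        Node* obj = m_objects.back().get();
        obj->className = cls;
        m_loadArray.push_back(Slot{std::string(), obj});  // register before members
        obj->value = Next();
        obj->next = ReadObject();
        return obj;
    }

private:
    std::uint16_t Next() {
        if (m_pos >= m_in.size()) throw std::runtime_error("unexpected end of stream");
        return m_in[m_pos++];
    }
    const Slot& At(std::size_t index) const {
        if (index == 0 || index >= m_loadArray.size())
            throw std::runtime_error("PID out of range");
        return m_loadArray[index];
    }

    std::vector<std::uint16_t> m_in;
    std::size_t m_pos = 0;
    std::vector<Slot> m_loadArray;                 // indexed by PID
    std::vector<std::unique_ptr<Node>> m_objects;  // owns the loaded nodes
};
```

Because PIDs are handed out sequentially on both sides, appending to the
array in the same order as the writer assigned PIDs is enough to resolve
back references, including circular ones.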
                    148: 
                    149: 
                    150: =============================================================================
                    151: Schema numbers
                    152: ==============
                    153: 
                    154: The schema number, which is assigned to the class when the class'
                    155: IMPLEMENT_SERIAL is encountered, is the "version" of the
                    156: class implementation.  The schema refers to the implementation
                    157: of the class, not to the number of times a given object has been
                    158: made persistent.  Properly, the latter is usually referred to as the
                    159: object version.  If you intend to maintain several different
                    160: implementations of the same class over time, incrementing the schema
                    161: as you revise your object's Serialize member function implementation
                    162: will enable you to write code that can load objects stored using older
                    163: iterations of the implementation.
                    164: 
                    165: The CArchive::ReadObject member function will throw a CArchiveException
                    166: when it encounters a schema number in the persistent store that differs
                    167: from the schema number of the class description in memory.  If your
                    168: implementation of Serialize for a class with multiple schemas catches this
                    169: exception, you will be able to continue the extraction operation taking
                    170: into account the differences in the implementation of the Serialize
                    171: member function.
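
The catch-and-continue approach can be illustrated with a stand-alone
model.  Everything named here is hypothetical: SchemaMismatch stands in
for CArchiveException, Point is an invented class whose third field was
added in schema 2, and the stream is assumed to hold the schema number
followed by the fields.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical exception standing in for CArchiveException.
struct SchemaMismatch : std::runtime_error {
    std::uint16_t stored;
    explicit SchemaMismatch(std::uint16_t s)
        : std::runtime_error("schema mismatch"), stored(s) {}
};

const std::uint16_t kCurrentSchema = 2;  // schema of the in-memory class (assumed)

struct Point {
    std::uint16_t x = 0;
    std::uint16_t y = 0;
    std::uint16_t z = 0;                 // added in schema 2
};

// Throws when the stored schema differs from the in-memory one.
inline void checkSchema(std::uint16_t stored) {
    if (stored != kCurrentSchema) throw SchemaMismatch(stored);
}

// Assumed layout: [schema, x, y] for schema 1, [schema, x, y, z] for schema 2.
inline Point loadPoint(const std::vector<std::uint16_t>& in) {
    Point p;
    std::size_t fields = 3;
    try {
        checkSchema(in.at(0));
    } catch (const SchemaMismatch& e) {
        if (e.stored != 1) throw;        // unknown layout: let the caller decide
        fields = 2;                      // older layout has no z
    }
    p.x = in.at(1);
    p.y = in.at(2);
    if (fields == 3) p.z = in.at(3);
    return p;
}
```

Rethrowing for schema numbers the loader does not recognize keeps a
newer or corrupt store from being read with the wrong layout.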
                    172: 
                    173: 
                    174: =============================================================================
                    175: CRuntimeClass
                    176: =============
                    177: 
                    178: The persistence mechanism uses the CRuntimeClass data
                    179: structure to uniquely identify classes.  MFC associates one
                    180: structure of this type with each dynamic and/or serializable class in
                    181: the application.  These structures are initialized at application
                    182: startup time using a special static object of type CClassInit.  You
                    183: need not concern yourself with the implementation of this information,
                    184: as it is likely to change between revisions of MFC.
                    185: 
                    186: The current implementation of CRuntimeClass does not support
                    187: multiple inheritance (MI).  This does not mean you cannot use MI
                    188: in your MFC application, but it does imply that you will have
                    189: certain responsibilities when working with objects that have more than
                    190: one base class.  The CObject::IsKindOf member function
                    191: will not correctly determine the type of an object if it
                    192: has multiple base classes.  Therefore, you cannot use CObject
                    193: as a virtual base class, and all calls to CObject member functions
                    194: such as Serialize and operator new will need to have scope qualifiers
                    195: so that C++ can disambiguate the function call.  If you do find
                    196: the need to use MI within MFC, then you should be sure to make the
                    197: class containing the CObject base class the leftmost class in the
                    198: list of base classes.
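
The need for scope qualifiers can be seen in a stand-alone example that
does not use MFC.  Object, Shape, Named, and NamedShape are hypothetical
classes; Object plays the role of CObject, and the example assumes that
both bases derive from it non-virtually.

```cpp
#include <cassert>
#include <string>

struct Object {                      // hypothetical stand-in for CObject
    virtual ~Object() {}
    virtual std::string Describe() const { return "Object"; }
};

struct Shape : Object {
    std::string Describe() const override { return "Shape"; }
};

struct Named : Object {              // a second, non-virtual path to Object
    std::string Describe() const override { return "Named"; }
};

// Shape, the primary Object-derived base, is listed leftmost.
struct NamedShape : Shape, Named {};

inline std::string describeAsShape(const NamedShape& ns) {
    // ns.Describe() would not compile: the name is found in both bases.
    return ns.Shape::Describe();     // the scope qualifier selects one path
}

inline const Object* asObject(const NamedShape& ns) {
    // A direct conversion to Object is ambiguous as well, so the
    // conversion goes through the leftmost base first.
    return static_cast<const Shape*>(&ns);
}
```

Each qualified call names the base whose Object subobject is meant,
which is the disambiguation the compiler cannot perform on its own.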
                    199: 
                    200: For advice on the uses and abuses of MI, a good reference is
                    201: "Advanced C++ Programming Styles and Idioms" by James O. Coplien
                    202: (Addison-Wesley, 1992).
