Annotation of GNUtools/emacs/etc/AIX.DUMP, revision 1.1.1.1

1.1       root        1: The following text was written by someone at IBM to describe an older
                      2: version of the code for dumping on AIX.
                      3: 
                      4: I (rms) couldn't understand the code, and I can't fully understand
                      5: this text either.  I rewrote the code to use the same basic
                      6: principles, as far as I understood them, but more cleanly.  This
                      7: rewritten code does not always work.  In fact, the basic method 
                      8: seems to be intrinsically flawed.
                      9: 
                     10: Since then, someone else implemented a different way of dumping on
                     11: the RS/6000, which does seem to work.  None of the following
                     12: applies to the way Emacs now dumps on the 6000.  However, the
                     13: current method fails to use shared libraries.  Anyone who might be
                     14: interested in trying to resurrect the previous method might still
                     15: find the following information useful.
                     16: 
                     17: 
                     18: It seems that the IBM dumping code was simply set up to detect when
                     19: the dumped data cannot be used, and in that case to act approximately
                     20: as if CANNOT_DUMP had been defined all along.  (This is buried in
                     21: paragraph 1.)  It seems simpler just to define CANNOT_DUMP, since
                     22: Emacs is not set up to decide at run time whether there is dumping or
                     23: not, and doing so correctly would be a lot of work.
                     24: 
                     25: Note that much of the other information, such as the name and format 
                     26: of the dumped data file, has been changed.
                     27: 
                     28: 
                     29:                --rms
                     30: 
                     31: 
                     32: 
                     33:         A different approach has been taken to implement the
                     34: "dump/load" feature of GNU Emacs for AIX 3.1.  Traditionally the
                     35: unexec function creates a new a.out executable file which contains
                     36: preloaded Lisp code.  Executing the new a.out file (normally called
                     37: xemacs) provides rapid startup since the standard suite of Lisp code
                     38: is preloaded as part of the executable file.
                     39: 
                     40:         AIX 3.1 architecture precludes the use of this technique
                     41: because the dynamic loader cannot guarantee a fixed starting location
                     42: for the process data section.  The loader loads all shared library
                     43: data BEFORE process data.  When a shared library changes its data
                     44: space, the process initial data section address (_data) will change
                     45: and all global process variables are automatically relocated to new
                     46: addresses.  This invalidates the "dumped" Emacs executable which has
                     47: data addresses which are not relocatable and now corrupt.  Emacs would
                     48: fail to execute until rebuilt with the new libraries.
                     49: 
                     50:         To circumvent the dynamic loader feature of AIX 3.1, the dump process
                     51: has been modified as follows:
                     52: 
                     53:         1) A new executable file is NOT created.  Instead, both pure and
                     54:            impure data are saved by the dump function and automatically
                     55:            reloaded during process initialization.  If any of the saved data
                     56:            is unavailable or invalid, loadup.el will be automatically loaded.
                     57: 
                     58:         2) Pure data is defined as a shared memory segment and attached
                     59:            automatically as read-only data during initialization.  This
                     60:            allows the pure data to be a shared resource among all Emacs
                     61:            processes.  The shared memory segment size is PURESIZE bytes.
                     62:            If the shared memory segment is unavailable or invalid, a new
                     63:            shared memory segment is created and the impure data save file
                     64:            is destroyed, forcing loadup.el to be reloaded.
                     65: 
                     66:         3) The ipc key used to create and access Emacs shared memory is
                     67:            SHMKEY and can be overridden by the environment symbol EMACSSHMKEY.
                     68:            Only one ipc key is allowed per system.  The environment symbol
                     69:            is provided in case the default ipc key has already been used.
                     70: 
                     71:         4) Impure data is written to the ../bin/.emacs.data file by the
                     72:            dump function.  This file contains the process' impure data
                     73:            at the moment of load completion.  During Emacs initialization,
                     74:            the process' data section is expanded and overwritten
                     75:            with the .emacs.data file contents.
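
Points 2 and 3 above can be pictured with a small C sketch.  It is not the
original IBM code.  The function names are hypothetical, and the text does
not say how EMACSSHMKEY is encoded, so parsing it as an integer with strtol
is an assumption.  The read-only attach uses the standard System V calls
shmget and shmat.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Return the ipc key to use: the value of EMACSSHMKEY if it parses as
   an integer (assumed format), otherwise the compiled-in default.  */
static key_t
resolve_shm_key (key_t default_key)
{
  const char *s = getenv ("EMACSSHMKEY");
  char *end;
  long v;

  if (s == NULL || *s == '\0')
    return default_key;
  errno = 0;
  v = strtol (s, &end, 0);      /* base 0 accepts decimal, octal, hex */
  if (errno != 0 || *end != '\0')
    return default_key;         /* unparsable: fall back to the default */
  return (key_t) v;
}

/* Attach an already existing segment read-only.  Returns NULL if the
   segment is missing or cannot be attached; per the text, the caller
   would then create a new segment and fall back to loading loadup.el.  */
static void *
attach_pure_readonly (key_t key)
{
  int id = shmget (key, 0, 0);  /* look up only, never create */
  void *p;

  if (id == -1)
    return NULL;
  p = shmat (id, NULL, SHM_RDONLY);
  return p == (void *) -1 ? NULL : p;
}
```

A caller would pass SHMKEY as default_key and hand the result to
attach_pure_readonly during initialization.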
                     76: 
                     77:         The following are software notes concerning the GNU Emacs dump function under AIX 3.1:
                     78: 
                     79:         1) All of the new dump/load code is activated by the #ifdef SHMKEY
                     80:            conditional.
                     81: 
                     82:         2) The automatic loading of loadup.el does NOT cause the dump function
                     83:            to be performed.  Therefore once the pure/impure data is discarded,
                     84:            someone must remake Emacs to create the saved data files.  This
                     85:            should only be necessary when Emacs is first installed or whenever
                     86:            AIX is upgraded.
                     87: 
                     88:         3) Emacs will exit with an error if executed in a non-X environment
                     89:            and the dump function was performed within an X window.  Therefore
                     90:            the dump function should always be performed in a non-X
                     91:            environment unless the X environment will ALWAYS be available.
                     92: 
                     93:         4) Emacs only maintains the lower 24 bits of any data address.  The
                     94:            remaining upper 8 bits are reset by the XPNTR macro whenever any
                     95:            Lisp object is referenced.  This poses a serious problem because
                     96:            pure data is stored in segment 3 (shared memory) and impure data
                     97:            is stored in segment 2 (data).  To reset the upper 8 address bits
                     98:            correctly, XPNTR must guess as to which type of data is represented
                     99:            by the lower 24 address bits.  The technique chosen is based upon
                    100:            the fact that pure data offsets in segment 3 range from
                    101:            0 -> PURESIZE-1, which are relatively small offsets.  Impure data
                    102:            offsets in segment 2 are relatively large (> 0x40000) because they
                    103:            must follow all shared library data.  Therefore XPNTR adds segment
                    104:            3 to each data offset which is small (below PURESIZE) and adds
                    105:            segment 2 to all other offsets.  This algorithm will remain valid
                    106:            as long as a) pure data size remains relatively small and b) process
                    107:            data is loaded after shared library data.
                    108: 
                    109:            To eliminate this guessing game, Emacs must preserve the 32-bit
                    110:            address and add additional data object overhead for the object type
                    111:            and garbage collection mark bit.
                    112: 
                    113:         5) The data section written to .emacs.data is divided into three
                    114:            areas as shown below.  The file header contains four character
                    115:            pointers which are used during automatic data loading.  The file's
                    116:            contents will only be used if the first three addresses match
                    117:            their counterparts in the current process.  The fourth address is
                    118:            the new data segment address required to hold all of the preloaded
                    119:            data.
                    120: 
                    121: 
                    122:                         .emacs.data file format
                    123: 
                    124:                 +---------------------------------------+ \
                    125:                 |     address of _data                  |  \
                    126:                 +---------------------------------------+   \
                    127:                 |     address of _end                   |    \
                    128:                 +---------------------------------------+      file header
                    129:                 |     address of initial sbrk(0)        |    /
                    130:                 +---------------------------------------+   /
                    131:                 |     address of final sbrk(0)          |  /
                    132:                 +---------------------------------------+ /
                    133:                 \                                       \
                    134:                 \                                       \
                    135:                       all data to be loaded from
                    136:                       _data to _end
                    137:                 \                                       \
                    138:                 \                                       \
                    139:                 +---------------------------------------+
                    140:                 \                                       \
                    141:                 \                                       \
                    142:                       all data to be loaded from
                    143:                       initial to final sbrk(0)
                    144:                 \                                       \
                    145:                 +---------------------------------------+
                    146: 
                    147: 
                    148:            Sections two and three contain the preloaded data which is
                    149:            restored at locations _data and initial sbrk(0) respectively.
                    150: 
                    151:            The reason two separate sections are needed is that process
                    152:            initialization allocates data (via malloc) prior to main()
                    153:            being called.  Therefore _end is several kbytes lower than
                    154:            the address returned by an initial sbrk(0).  This creates a
                    155:            hole in the process data space and malloc will abort if this
                    156:            region is overwritten during the load function.
                    157: 
                    158:            One further complication with the malloc'd space is that it
                    159:            is partially empty and must be "consumed" so that data space
                    160:            malloc'd in the future is not assigned to this region.  The malloc
                    161:            function distributed with Emacs anticipates this problem but the
                    162:            AIX 3.1 version does not.  Therefore, repeated malloc calls are
                    163:            needed to exhaust this initial malloc space.  How do you know
                    164:            when malloc has exhausted its free memory? You don't!  So the
                    165:            code must repeatedly call malloc for each buffer size and
                    166:            detect when a new memory page has been allocated.  Once the new
                    167:            memory page is allocated, you can calculate the number of free
                    168:            buffers in that page and request exactly that many more.  Future
                    169:            malloc requests will now be added at the top of a new memory page.
                    170: 
                    171:            One final point - the initial sbrk(0) is the value of sbrk(0)
                    172:            after all of the above malloc hacking has been performed.
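
The segment guess described in note 4 can be sketched in C as follows.
This is an illustration, not the actual XPNTR macro.  The PURESIZE value
is made up, and the segment bases assume that the segment number occupies
the top four bits of a 32-bit address, which the text does not state.

```c
#include <stdint.h>

/* All four constants are assumptions made for this sketch.  */
#define SKETCH_PURESIZE 0x30000u     /* hypothetical pure data size */
#define SKETCH_SEG_DATA 0x20000000u  /* assumed base of segment 2 */
#define SKETCH_SEG_SHM  0x30000000u  /* assumed base of segment 3 */
#define SKETCH_OFFSET   0x00FFFFFFu  /* the 24 address bits Emacs keeps */

/* Rebuild a full address from a Lisp object word.  Small offsets are
   taken to be pure data in shared memory, all others impure data.  */
static uint32_t
xpntr_sketch (uint32_t object)
{
  uint32_t offset = object & SKETCH_OFFSET;

  if (offset < SKETCH_PURESIZE)
    return SKETCH_SEG_SHM | offset;
  return SKETCH_SEG_DATA | offset;
}
```

The guess goes wrong as soon as an impure offset falls below PURESIZE,
which is why the note requires process data to follow the shared library
data.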
                    173: 
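
The header check described in note 5 can be sketched in C as well.  The
struct and function names are hypothetical.  Only the four-pointer layout
and the rule that the first three addresses must match come from the text.

```c
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the four character pointers at the start of .emacs.data.  */
struct dump_header
{
  char *data_start;   /* address of _data */
  char *data_end;     /* address of _end */
  char *initial_brk;  /* initial sbrk(0) */
  char *final_brk;    /* final sbrk(0) */
};

/* The saved image is usable only if the first three addresses equal
   their counterparts in the running process.  */
static bool
header_matches (const struct dump_header *saved,
                char *cur_data, char *cur_end, char *cur_brk)
{
  return (saved->data_start == cur_data
          && saved->data_end == cur_end
          && saved->initial_brk == cur_brk);
}

/* Bytes that follow the header: section two (_data to _end) plus
   section three (initial to final sbrk(0)).  */
static size_t
payload_size (const struct dump_header *h)
{
  return (size_t) (h->data_end - h->data_start)
         + (size_t) (h->final_brk - h->initial_brk);
}
```

If header_matches fails, the file contents are ignored and, as described
in the dump process list, loadup.el is loaded instead.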
                    174: 
                    175:         The following Emacs dump/load issues need to be addressed:
                    176: 
                    177:         1) Loadup.el exits with an error message because the xemacs and
                    178:            emacs-xxx files are not created during the dump function.
                    179: 
                    180:            Loadup.el should be changed to check for the new .emacs.data
                    181:            file.
                    182: 
                    183:         2) Dump will only support one .emacs.data file for the entire
                    184:            system.  This precludes the ability to allow each user to
                    185:            define his/her own "dumped" Emacs.
                    186: 
                    187:            Add an environment symbol to override the default .emacs.data
                    188:            path.
                    189: 
                    190:         3) An error message "error in init file" is displayed out of
                    191:            startup.el when the dumped Emacs is invoked by a non-root user.
                    192:            Although all of the preloaded Lisp code is present, the important
                    193:            purify-flag has not been set back to Qnil - precluding the
                    194:            loading of any further Lisp code until the flag is manually
                    195:            reset.
                    196: 
                    197:            The problem appears to be an access violation which will go
                    198:            away if the read-write access modes to all of the files are
                    199:            changed to rw-.
                    200: 
                    201:         4) In general, all file access modes should be changed from
                    202:            rw-r--r-- to rw-rw-rw-.  They are currently set up to match
                    203:            standard AIX access modes.
                    204: 
                    205:         5) The dump function is not invoked when the automatic load of
                    206:            loadup.el is performed.
                    207: 
                    208:            Perhaps the command arguments array should be expanded with
                    209:            "dump" added to force an automatic dump.
                    210: 
                    211:         6) The automatic initialization function alloc_shm will delete
                    212:            the shared memory segment and .emacs.data file if the "dump"
                    213:            command argument is found in ANY argument position.  The
                    214:            dump function will only take place in loadup.el if "dump"
                    215:            is the third or fourth command argument.
                    216: 
                    217:            Change alloc_shm to live by loadup.el rules.
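
One way to make alloc_shm follow the loadup.el rule from issue 6 is a
positional check like the C sketch below.  The function name is
hypothetical.  It assumes that "third or fourth command argument" does not
count the program name, so it looks at argv[3] and argv[4].  The text
leaves that counting open.

```c
#include <stdbool.h>
#include <string.h>

/* Return true only when "dump" appears in one of the two positions
   that loadup.el is assumed to honor, not in any position.  */
static bool
dump_requested (int argc, char **argv)
{
  int i;

  for (i = 3; i <= 4; i++)
    if (i < argc && strcmp (argv[i], "dump") == 0)
      return true;
  return false;
}
```

With this check, a stray "dump" elsewhere on the command line would no
longer cause the shared memory segment and .emacs.data to be deleted.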
                    218: 
