The following text was written by someone at IBM to describe an older
version of the code for dumping on AIX.

I (rms) couldn't understand the code, and I can't fully understand
this text either. I rewrote the code to use the same basic
principles, as far as I understood them, but more cleanly. This
rewritten code does not always work. In fact, the basic method
seems to be intrinsically flawed.

Since then, someone else implemented a different way of dumping on
the RS/6000, which does seem to work. None of the following
applies to the way Emacs now dumps on the 6000. However, the
current method fails to use shared libraries. Anyone who might be
interested in trying to resurrect the previous method might still
find the following information useful.


It seems that the IBM dumping code was simply set up to detect when
the dumped data cannot be used, and in that case to act approximately
as if CANNOT_DUMP had been defined all along. (This is buried in
paragraph 1.) It seems simpler just to define CANNOT_DUMP, since
Emacs is not set up to decide at run time whether there is dumping or
not, and doing so correctly would be a lot of work.

Note that much of the other information, such as the name and format
of the dumped data file, has been changed.


--rms



A different approach has been taken to implement the
"dump/load" feature of GNU Emacs for AIX 3.1. Traditionally the
unexec function creates a new a.out executable file which contains
preloaded Lisp code. Executing the new a.out file (normally called
xemacs) provides rapid startup since the standard suite of Lisp code
is preloaded as part of the executable file.

The AIX 3.1 architecture precludes the use of this technique
because the dynamic loader cannot guarantee a fixed starting location
for the process data section. The loader loads all shared library
data BEFORE process data. When a shared library changes the size of
its data space, the process initial data section address (_data)
changes and all global process variables are automatically relocated
to new addresses. This invalidates the "dumped" Emacs executable,
whose saved data addresses are not relocatable and are therefore now
corrupt. Emacs would fail to execute until rebuilt with the new
libraries.

To circumvent the dynamic loader feature of AIX 3.1, the dump process
has been modified as follows:

  1) A new executable file is NOT created. Instead, both pure and
     impure data are saved by the dump function and automatically
     reloaded during process initialization. If any of the saved data
     is unavailable or invalid, loadup.el will be automatically loaded.

  2) Pure data is defined as a shared memory segment and attached
     automatically as read-only data during initialization. This
     allows the pure data to be a shared resource among all Emacs
     processes. The shared memory segment size is PURESIZE bytes.
     If the shared memory segment is unavailable or invalid, a new
     shared memory segment is created and the impure data save file
     is destroyed, forcing loadup.el to be reloaded.

  3) The ipc key used to create and access Emacs shared memory is
     SHMKEY and can be overridden by the environment symbol EMACSSHMKEY.
     Only one ipc key is allowed per system. The environment symbol
     is provided in case the default ipc key has already been used.

  4) Impure data is written to the ../bin/.emacs.data file by the
     dump function. This file contains the process' impure data
     at the moment of load completion. During Emacs initialization,
     the process' data section is expanded and overwritten
     with the .emacs.data file contents.

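As a rough illustration of points 2 and 3 above, the sketch below shows
one way the key lookup and the read-only attach could be written with the
System V shmget/shmat calls. The function names resolve_shm_key and
attach_pure, and the numeric values given to SHMKEY and PURESIZE, are
placeholders invented for this example; the text does not give the real
ones, and this is not the code that was actually used.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical values: the real SHMKEY and PURESIZE are build-time
   settings that the text does not state.  */
#define SHMKEY   ((key_t) 5305)
#define PURESIZE 200000

/* Return the ipc key to use: the value of EMACSSHMKEY if it is set
   and parses completely as a number, otherwise the default SHMKEY.  */
key_t
resolve_shm_key (void)
{
  const char *s = getenv ("EMACSSHMKEY");
  if (s && *s)
    {
      char *end;
      long v = strtol (s, &end, 0);
      if (*end == '\0')
        return (key_t) v;
    }
  return SHMKEY;
}

/* Try to attach an existing pure-data segment read-only.
   Returns NULL on any failure (no such segment, segment smaller
   than PURESIZE, no permission, attach failed).  */
void *
attach_pure (key_t key)
{
  int id = shmget (key, PURESIZE, 0);   /* flags 0: never create */
  if (id < 0)
    return NULL;
  void *p = shmat (id, NULL, SHM_RDONLY);
  return p == (void *) -1 ? NULL : p;
}
```

A caller would treat a NULL result from attach_pure as the "unavailable
or invalid" case of point 2 and fall back to creating a fresh segment.
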
The following are software notes concerning the GNU Emacs dump
function under AIX 3.1:

  1) All of the new dump/load code is activated by the #ifdef SHMKEY
     conditional.

  2) The automatic loading of loadup.el does NOT cause the dump function
     to be performed. Therefore once the pure/impure data is discarded,
     someone must remake Emacs to create the saved data files. This
     should only be necessary when Emacs is first installed or whenever
     AIX is upgraded.

  3) Emacs will exit with an error if executed in a non-X environment
     and the dump function was performed within an X window. Therefore
     the dump function should always be performed in a non-X
     environment unless the X environment will ALWAYS be available.

  4) Emacs maintains only the lower 24 bits of any data address. The
     remaining upper 8 bits are reset by the XPNTR macro whenever any
     Lisp object is referenced. This poses a serious problem because
     pure data is stored in segment 3 (shared memory) and impure data
     is stored in segment 2 (data). To reset the upper 8 address bits
     correctly, XPNTR must guess which type of data is represented
     by the lower 24 address bits. The technique chosen is based upon
     the fact that pure data offsets in segment 3 range from
     0 to PURESIZE-1, which are relatively small offsets. Impure data
     offsets in segment 2 are relatively large (> 0x40000) because they
     must follow all shared library data. Therefore XPNTR adds segment
     3 to each data offset which is small (below PURESIZE) and adds
     segment 2 to all other offsets. This algorithm will remain valid
     as long as a) pure data size remains relatively small and b) process
     data is loaded after shared library data.

     To eliminate this guessing game, Emacs would have to preserve the
     full 32-bit address and accept additional per-object overhead for
     the object type and garbage collection mark bit.

  5) The data section written to .emacs.data is divided into three
     areas as shown below. The file header contains four character
     pointers which are used during automatic data loading. The file's
     contents will only be used if the first three addresses match
     their counterparts in the current process. The fourth address is
     the new data segment address required to hold all of the preloaded
     data.


                 .emacs.data file format

     +---------------------------------------+ \
     |     address of _data                  |  \
     +---------------------------------------+   \
     |     address of _end                   |    \
     +---------------------------------------+     file header
     |     address of initial sbrk(0)        |    /
     +---------------------------------------+   /
     |     address of final sbrk(0)          |  /
     +---------------------------------------+ /
     \                                       \
      \                                       \
           all data to be loaded from
           _data to _end
      \                                       \
       \                                       \
     +---------------------------------------+
     \                                       \
      \                                       \
           all data to be loaded from
           initial to final sbrk(0)
      \                                       \
     +---------------------------------------+


     Sections two and three contain the preloaded data which is
     restored at locations _data and initial sbrk(0) respectively.

     The reason two separate sections are needed is that process
     initialization allocates data (via malloc) prior to main()
     being called. Therefore _end is several kbytes lower than
     the address returned by an initial sbrk(0). This creates a
     hole in the process data space and malloc will abort if this
     region is overwritten during the load function.

     One further complication with the malloc'd space is that it
     is partially empty and must be "consumed" so that data space
     malloc'd in the future is not assigned to this region. The malloc
     function distributed with Emacs anticipates this problem but the
     AIX 3.1 version does not. Therefore, repeated malloc calls are
     needed to exhaust this initial malloc space. How do you know
     when malloc has exhausted its free memory? You don't! So the
     code must repeatedly call malloc for each buffer size and
     detect when a new memory page has been allocated. Once the new
     memory page is allocated, you can calculate the number of free
     buffers in that page and request exactly that many more. Future
     malloc requests will now be added at the top of a new memory page.

     One final point: the initial sbrk(0) is the value of sbrk(0)
     after all of the above malloc hacking has been performed.

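The XPNTR guess described in note 4 can be sketched in a few lines of C.
This is only an illustration of the rule as stated there, not the macro
itself. The name xpntr_guess is made up, the PURESIZE value is a
placeholder, and the segment base addresses 0x20000000 and 0x30000000 are
assumptions about how "segment 2" and "segment 3" map onto 32-bit
addresses.

```c
#include <stdint.h>

#define PURESIZE      0x30000u      /* hypothetical; must stay below the
                                       lowest impure data offset */
#define ADDR_MASK     0x00ffffffu   /* the 24 address bits Emacs keeps */
#define DATA_SEG_BASE 0x20000000u   /* assumed base of segment 2 (data) */
#define PURE_SEG_BASE 0x30000000u   /* assumed base of segment 3 (shm) */

/* Rebuild a full 32-bit address from a Lisp object word.  The upper
   8 bits (type and mark bits) are discarded.  Offsets below PURESIZE
   are taken to be pure data in segment 3; everything else is taken
   to be impure data in segment 2.  */
uint32_t
xpntr_guess (uint32_t lisp_object)
{
  uint32_t offset = lisp_object & ADDR_MASK;
  return offset < PURESIZE ? (PURE_SEG_BASE | offset)
                           : (DATA_SEG_BASE | offset);
}
```

The guess goes wrong as soon as an impure object lives at an offset
below PURESIZE, which is why the note requires pure data to stay small
and process data to be loaded after shared library data.
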
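The header check from note 5 might look roughly like the following. The
struct and function names are invented for this sketch, and the four
header words are modelled as integers instead of the character pointers
the note mentions, so that the example does not depend on AIX symbols
such as _data and _end.

```c
#include <stddef.h>
#include <stdint.h>

/* The four words at the start of .emacs.data, in file order.  */
struct dump_header
{
  uintptr_t data_start;    /* address of _data */
  uintptr_t data_end;      /* address of _end */
  uintptr_t brk_initial;   /* initial sbrk(0) */
  uintptr_t brk_final;     /* final sbrk(0) */
};

/* Nonzero if the saved data may be loaded into the current process.
   Only the first three words are compared; the fourth says how far
   the data segment must be grown to hold the saved data.  */
int
header_usable (const struct dump_header *saved,
               const struct dump_header *current)
{
  return saved->data_start == current->data_start
         && saved->data_end == current->data_end
         && saved->brk_initial == current->brk_initial;
}

/* Size in bytes of the second file area (_data to _end).  */
size_t
static_area_size (const struct dump_header *h)
{
  return (size_t) (h->data_end - h->data_start);
}

/* Size in bytes of the third file area (initial to final sbrk(0)).  */
size_t
heap_area_size (const struct dump_header *h)
{
  return (size_t) (h->brk_final - h->brk_initial);
}
```

A loader built on this would read the header, reject the file when
header_usable returns zero, and otherwise extend the break to brk_final
before copying the two areas to data_start and brk_initial.
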
The following Emacs dump/load issues need to be addressed:

  1) Loadup.el exits with an error message because the xemacs and
     emacs-xxx files are not created during the dump function.

     Loadup.el should be changed to check for the new .emacs.data
     file.

  2) Dump will only support one .emacs.data file for the entire
     system. This precludes the ability to allow each user to
     define his/her own "dumped" Emacs.

     Add an environment symbol to override the default .emacs.data
     path.

  3) An error message "error in init file" is displayed out of
     startup.el when the dumped Emacs is invoked by a non-root user.
     Although all of the preloaded Lisp code is present, the important
     purify-flag has not been set back to Qnil - precluding the
     loading of any further Lisp code until the flag is manually
     reset.

     The problem appears to be an access violation which will go
     away if the read-write access modes to all of the files are
     changed to rw-.

  4) In general, all file access modes should be changed from
     rw-r--r-- to rw-rw-rw-. They are currently set up to match
     standard AIX access modes.

  5) The dump function is not invoked when the automatic load of
     loadup.el is performed.

     Perhaps the command arguments array should be expanded with
     "dump" added to force an automatic dump.

  6) The automatic initialization function alloc_shm will delete
     the shared memory segment and .emacs.data file if the "dump"
     command argument is found in ANY argument position. The
     dump function will only take place in loadup.el if "dump"
     is the third or fourth command argument.

     Change alloc_shm to live by loadup.el rules.

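The mismatch in issue 6 is easy to see when the two rules are written
side by side. Both functions below are hypothetical illustrations, not
the actual alloc_shm or loadup.el logic, and the second one assumes that
"third or fourth command argument" counts arguments after the program
name, i.e. argv[3] or argv[4].

```c
#include <string.h>

/* Rule attributed to alloc_shm: "dump" in any argument position.  */
int
dump_anywhere (int argc, char **argv)
{
  for (int i = 1; i < argc; i++)
    if (strcmp (argv[i], "dump") == 0)
      return 1;
  return 0;
}

/* Rule attributed to loadup.el: "dump" must be the third or fourth
   argument.  ASSUMPTION: positions are counted after the program
   name, so these are argv[3] and argv[4].  */
int
dump_by_loadup_rules (int argc, char **argv)
{
  return (argc > 3 && strcmp (argv[3], "dump") == 0)
         || (argc > 4 && strcmp (argv[4], "dump") == 0);
}
```

Any command line for which the first function returns nonzero and the
second returns zero would discard the saved data without a new dump
being made, which is the situation the suggested change is meant to
avoid.
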