1: .de Ls
2: .RS
3: .nr L 0 1
4: ..
5: .de Le
6: .RE
7: .LP
8: ..
9: .de Np
10: .IP (\\n+L) .25i
11: ..
12: .ds ar \v'2p'\s18\(->\s0\v'-2p'
13: .ds sd \s8\v'.2m'\h'-0.4n'
14: .ds su \v'-.2m'\s0
15: .ds ex \fIarg\fP
16: .ds e1 \fIarg\*(sd1\*(su\fP
17: .ds e2 \fIarg\*(sd2\*(su\fP
18: .ds e3 \fIarg\*(sd3\*(su\fP
19: .ds e4 \fIarg\*(sd4\*(su\fP
20: .ds e5 \fIarg\*(sd5\*(su\fP
21: .ds ei \fIarg\*(sdi\*(su\fP
22: .ds en \fIarg\*(sdn\*(su\fP
23: .ds e0 \fIarg\*(sd0\*(su\fP
24: .ds xx \fIexpr\fP
25: .ds x1 \fIexpr\*(sd1\*(su\fP
26: .ds x2 \fIexpr\*(sd2\*(su\fP
27: .ds x3 \fIexpr\*(sd3\*(su\fP
28: .ds x4 \fIexpr\*(sd4\*(su\fP
29: .ds x5 \fIexpr\*(sd5\*(su\fP
30: .ds xi \fIexpr\*(sdi\*(su\fP
31: .ds xn \fIexpr\*(sdn\*(su\fP
32: .ds x0 \fIexpr\*(sd0\*(su\fP
33: .ds v0 \fIvar\fP
34: .ds v1 \fIvar\*(sd1\*(su\fP
35: .ds v2 \fIvar\*(sd2\*(su\fP
36: .ds v3 \fIvar\*(sd3\*(su\fP
37: .ds vi \fIvar\*(sdi\*(su\fP
38: .ds vn \fIvar\*(sdn\*(su\fP
39: .ds ax \fIarg\fP
40: .ds a1 \fIarg\*(sd1\*(su\fP
41: .ds a2 \fIarg\*(sd2\*(su\fP
42: .ds a3 \fIarg\*(sd3\*(su\fP
43: .ds a4 \fIarg\*(sd4\*(su\fP
44: .ds a5 \fIarg\*(sd5\*(su\fP
45: .ds ai \fIarg\*(sdi\*(su\fP
46: .ds an \fIarg\*(sdn\*(su\fP
47: .ds a0 \fIarg\*(sd0\*(su\fP
48: .de St
49: .ta 1.0iR +.5i 4i
50: ..
51: .de S1
52: .ta 0.75i
53: ..
54: .de Pt
55: .ta 0.8i +0.8i +0.8i +0.8i +0.8i +0.8i +0.8i +0.8i
56: ..
57: .TR 83-10d
58: .DA "June 1983; Revised July 1983,\^ January 1984,\^ June 1984,\^ and August 1984"
59: .Gr
60: .TL
61: Porting the UNIX Implementation of Icon
62: .AU
63: William H. Mitchell
64: .AB
65: This document explains how to port the UNIX implementation of the
66: Icon programming language. The Icon system is composed of a translator,\^
67: a linker,\^ and an interpreter. Procedures for porting each system
68: component are described in detail. This document is meant to be a
69: companion to the Icon ``tour'' (TR 84-11) and the source code for
70: the system.
71: .AE
72: .SH
73: Introduction
74: .PP
75: This document describes how to port the Version 5 Icon interpreter
76: to a \*U environment.
77: .Un
78: Both an interpreter and a compiler are available for Icon; this
79: document addresses only porting the interpreter.
80: The Icon system has three major components:
81: a translator,\^ a linker,\^ and an interpreter. The translator and
82: the linker are entirely written in C and porting them is merely a
83: matter of setting constant values that are appropriate for the target
84: machine. Portions of the interpreter are written in assembly language and
85: thus must be written anew for each machine. The interpreter also
86: contains a very small amount of C code that must be written on a
87: per-machine basis.
88: .PP
89: The sections of this document that describe the porting of the
90: translator and the linker are straightforward,\^ being merely a
91: description of a process. While porting the translator and the
92: linker is a task of following instructions,\^ porting the interpreter
93: is a task of design and programming. The approach taken is to describe
94: what function each routine must perform and how it is implemented in
95: the VAX\u\(dg\d version of Icon. The porter's job is to determine how to
96: .FS
97: \u\(dg\dVAX is a trademark of Digital Equipment Corporation.
98: .FE
99: implement the various routines on the target machine.
100: .PP
101: In light of
102: the increasing popularity of the C language and the availability of C
103: compilers for non-UNIX environments,\^ it is quite possible that one may
104: desire to port Icon to a non-UNIX environment.
105: Because the matter of porting a UNIX program to
106: a non-UNIX environment is a problem in itself,\^ it is not addressed in
107: this document. Rather,\^ this document assumes that the target
108: environment is UNIX. This is not to say that porting Icon to a non-UNIX
109: environment is not feasible. Icon is not strongly bound to UNIX,\^ the
110: primary association being that Icon is written in C. It is
111: anticipated that most C systems that are available for a non-UNIX
112: environment will provide most of the UNIX-independent C standard
113: functions as part of a library. If such a library is available,\^
114: it should be possible to port Icon without great difficulty.
115: .PP
116: This document is a companion to the Icon
117: ``tour''\^[1] and should be studied with the source code for
118: Version 5.8 of Icon at hand.
119: In particular,\^ the porter should be familiar with the information
120: contained in the tour.
121: .PP
122: The sections of this document that describe the VAX assembly language
123: code attempt to explain the operation of instructions when the
124: operation is not obvious. However,\^ this document does
125: assume that the porter has
126: a rudimentary familiarity with the basic concepts of the VAX-11
127: architecture\^[2].
128: .SH
129: C Compiler Requirements
130: .PP
131: Because there is no
132: standard for the C programming language,\^ it is difficult to say how
133: ``standard'' the usage of C in the system is. The system was
134: developed using the V7 C compiler,\^ often referred to as the Ritchie
135: compiler\^[3]. The system was later ported to the VAX using the \fIPortable
136: C Compiler\fP\^[4] and no serious problems were encountered.
137: .PP
138: In addition to supporting ``full'' C,\^ the C compiler is subject to a few
139: specific requirements and non-requirements:
140: .Ls
141: .Np
142: The compiler must support both assignment and call-by-value for
143: structures.
144: .Np
145: The compiler need not support bit field operations.
146: .Np
147: Arguments to C functions must be stored in consecutive,\^
148: ascending memory locations.
149: .Np
150: There may be problems if \*Msizeof(int)\fP and \*Msizeof(char *)\fP
151: are not the same,\^ but no definite problems are known.
152: .Np
153: It is believed that there are great,\^ perhaps insurmountable,\^
154: problems if \*Msizeof(char *)\fP is not equal to \*Msizeof(int *)\fP.
155: .Le
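.PP
Some of the requirements above can be checked mechanically before any
porting work begins. The following is a minimal sketch in modern C
notation (the target compiler may accept only the older function
declaration style). The structure \*Mpair\fP and all function names are
hypothetical and do not appear in the Icon source.

```c
/* Hypothetical two-member structure, used only to exercise structure
 * assignment and call-by-value (requirement 1). */
struct pair {
    int first;
    int second;
};

/* Receives its argument by value and overwrites the local copy;
 * the caller's structure must be unaffected. */
static int sum_and_clobber(struct pair p)
{
    int s = p.first + p.second;
    p.first = 0;
    p.second = 0;
    return s;
}

/* Returns 1 if structure assignment and call-by-value behave as required. */
int struct_ops_ok(void)
{
    struct pair a = { 3, 4 };
    struct pair b;

    b = a;                          /* structure assignment */
    if (sum_and_clobber(b) != 7)    /* structure passed by value */
        return 0;
    return b.first == 3 && b.second == 4;
}

/* Returns 1 if int and char * have the same size (requirement 4). */
int int_matches_char_ptr(void)
{
    return sizeof(int) == sizeof(char *);
}

/* Returns 1 if char * and int * have the same size (requirement 5). */
int char_ptr_matches_int_ptr(void)
{
    return sizeof(char *) == sizeof(int *);
}
```

A result of 0 from \*Mint_matches_char_ptr\fP signals only possible
trouble, while a result of 0 from \*Mchar_ptr_matches_int_ptr\fP signals
the more serious situation described in the last requirement. The
requirement about argument layout in memory is not covered by this
sketch; it is best checked by inspecting generated assembly code.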
156: .SH
157: System Testing
158: .PP
159: The test programs and testing procedures to be used for porting Icon
160: are described in \^[5].
161: At various points in this document,\^ the porter is directed
162: to test the system component just completed. At such times,\^ the
163: porter should refer to \^[5] to determine what should be done.
164: .NH 1
165: Porting the Icon Translator
166: .NH 2
167: Overview
168: .PP
169: The Icon translator,\^ known as \*Mitran\fR,\^ is the first
170: logical component of the Icon system. The translator takes Icon
171: source files as input and produces two \fIucode\fR output files
172: for each input file.
173: The translator may be run by saying:
174: .Ds
175: itran hello.icn
176: .De
177: This produces two ASCII files,\^
178: \*Mhello.u1\fR and \*Mhello.u2\fR. \*Mhello.u1\fR
179: contains interpretable instructions and data in a printable
180: format. \*Mhello.u2\fR contains information about global symbols and
181: scope.
182: .PP
183: The translator is written entirely in C and is the most machine
184: independent major system component. No serious
185: problems should be encountered in porting it. Any difficulties that
186: are encountered probably indicate major problems
187: with the C compiler being used.
188: .NH 2
189: Porting Procedure
190: .PP
191: The Icon system contains a number of instances of values that must be
192: specified on a per-machine basis. The system also contains assembly
193: code and,\^ of course,\^ such code is different on each machine. Rather
194: than maintaining a source copy of Icon for each machine that Icon runs
195: on,\^ C preprocessor control statements are used to select portions of
196: code specific to a certain machine. The source as distributed can
197: be compiled on either a VAX or a PDP-11* system by defining \*MVAX\fP or
198: .FS
199: *PDP is a trademark of Digital Equipment Corporation.
200: .FE
201: \*MPDP11\fP respectively in \*Mh/config.h\fP. The porting source has
202: neither \*MVAX\fP nor \*MPDP11\fP defined; rather,\^ \*MPORT\fP
203: is defined. Where machine specific code is to appear,\^ along with
204: sections bracketed by \*M#ifdef\fPs for \*MVAX\fP and \*MPDP11\fP,\^
205: there is a skeletal section bracketed by a \*M#ifdef\fP for
206: \*MPORT\fP. The \*MPORT\fP section is to be filled out for the
207: target machine. This convention is followed throughout and porting
208: Icon is nothing more than filling in all the \*MPORT\fP sections.
209: .PP
210: The source for the translator is contained in the directory \*Mtran\fP.
211: Translator machine dependencies are confined to the file \*Mtran/sym.h\fR.
212: A pair of constants define the sizes of two data structures used during the
213: translation process.
214: Edit the file \*Msym.h\fR and search for the string \*MPORT\fR.
215: The code looks something like
216: .Ds
217: .ta 2.0iR 2.5i
218: #ifdef PORT
219: #define TSIZE x /* default size of parse tree space */
220: #define SSIZE x /* default size of string space */
221: #endif PORT
222: #ifdef VAX
223: #define TSIZE 15000 /* size of parse tree space */
224: #define SSIZE 15000 /* default size of string space */
225: #endif VAX
226: #ifdef PDP11
227: #define TSIZE 5000 /* default size of parse tree space */
228: #define SSIZE 5000 /* default size of string space */
229: #endif PDP11
230: .De
231: The values of \*MTSIZE\fP and \*MSSIZE\fP are not critical
232: and the current values have been chosen rather arbitrarily.
233: If you are on a large
234: machine,\^ use the values of \*MTSIZE\fR and \*MSSIZE\fR specified for
235: the VAX; otherwise,\^ use the values specified for the PDP-11.
236: .PP
237: The translator may now be compiled by issuing the \fImake\fR command
238: without any arguments.
239: .PP
240: Note that although Icon programs are used to create
241: some of the translator source files (namely \*Mkeyword.h\fP,\^ \*Mkeyword.c\fP,\^
242: \*Moptab.c\fP,\^ and \*Mtoktab.c\fP),\^ these files are machine independent
243: and do not need to be remade. If for some reason \*Mmake\fP tries to
244: create any of these files,\^ just \*Mtouch\fP the file in question to
245: update the last-modified date. Similarly,\^ \*Mparse.c\fP is generated by
246: \fIyacc\fP and does not need to be regenerated unless the grammar is
247: modified.
248: .PP
249: When the translator has been successfully compiled using \fImake\fP,\^
250: refer to [5] for testing.
251: .PP
252: Porting the translator may seem like a trivial task,\^
253: but its successful completion is a definite milestone because
254: it is a good sign that the C compiler in use is suitable.
255: .nr $1 1
256: .nr $2 0
257: .NH 1
258: Porting the Icon Linker
259: .NH 2
260: Overview
261: .PP
262: The Icon linker,\^ known as \*Milink\fP,\^ is the second logical component
263: of the Icon system. The linker takes \*Mu1\fP and \*Mu2\fP files
264: produced by the translator and binds them together to form an
265: \fIinterpretable\fP file. The interpretable file serves as input
266: for the Icon interpreter.
267: The linker is written entirely in C and is a fairly small and
268: simple program. However,\^ the interpretable files produced by
269: the linker are not machine independent and because of this,\^
270: porting the linker is more troublesome than porting the
271: translator.
272: .PP
273: Interpretable files contain two distinct types of data: opcodes and
274: associated operands that the interpreter ``understands''; and data that
275: is directly mapped into run-time data structures. By ``mapping'' it
276: is meant that the data is loaded into memory and then C structure
277: references are used to access elements of the object at a certain
278: location in memory.
279: The formats of
280: the opcodes and operands must conform to what the interpreter is
281: expecting. The data that is directly mapped must conform to the
282: format of the C data structures used by the run-time system.
283: .PP
284: On the VAX,\^ for example,\^ interpreter opcodes are one byte long
285: and operands are four bytes long. On the PDP-11,\^ opcodes are
286: also one byte long,\^ but operands are only two bytes long.
287: Opcode and operand size are fairly arbitrary,\^ but it is important
288: that the linker and the interpreter be coordinated.
289: .PP
290: The mapped data structures are slightly more complicated because
291: the linker must conform to the format produced by the C compiler.
292: This is not difficult,\^ since the data structures involved have
293: a regular form. All are composed of some number of \fIwords\fP
294: where each word is the same size in every structure.*
295: .FS
296: *Literature about the VAX conventionally uses the term \fIword\fP to refer
297: to 16-bit quantities and the term \fIlongword\fP to refer to 32-bit
298: quantities. In this document,\^ \fIword\fP in a generic context
299: refers to the basic unit of the run-time data structures; \fIword\fP
300: in a VAX-specific context refers to a 32-bit quantity.
301: .FE
302: .PP
303: The opcodes,\^ operands,\^ and mapped data are accumulated in memory during the
304: linking process. This conglomerate is referred to as the \fIcode\fP
305: section. Several routines are used to add data to the code
306: section. These routines are parameterized so that porting the linker
307: to a new machine is merely a matter of setting the parameters
308: correctly. Four primitive data units
309: compose the code section. These are \fIopcodes\fP,\^ \fIoperands\fP,\^
310: \fIwords\fP,\^ and \fIblocks\fP.
311: .IP opcodes
312: .br
313: are instructions for the
314: interpreter. An opcode may direct the interpreter to push a value
315: on the stack,\^ branch to a location,\^ perform an arithmetic operation,\^
316: etc.
317: .IP operands
318: .br
319: are associated with some opcodes. For
320: example,\^ the \*Mgoto\fP instruction has a location to branch to as
321: its single operand.
322: .IP words
323: .br
324: compose mapped data structures. A word is the basic
325: unit of the run-time data structures and should consist
326: of \*Msizeof(int *)\fP bytes.
327: .IP blocks
328: .br
329: are merely some number of bytes. For example,\^ a \*Mcset\fP constant
330: is loaded into the code section as a block of 32 8-bit bytes (256 bits).
331: .PP
332: Routines in \*Mlink/lcode.c\fP are used to add a unit
333: of data of one of the preceding types to the code section. These
334: routines are \*Moutop\fP,\^ \*Moutopnd\fP,\^ \*Moutword\fP,\^ and
335: \*Moutblock\fP. Each routine adds the appropriate data into
336: the code section at the current location (maintained as a pointer),\^
337: and then the location pointer is advanced to the next free location.
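.PP
The behavior described above can be sketched as follows. This is an
illustration only, not the code in \*Mlink/lcode.c\fP: the argument
lists, the buffer and pointer names, the byte order, and the sizing
values (taken from the VAX figures quoted in this document) are all
assumptions, and bounds checking is omitted for brevity.

```c
#include <string.h>

#define OPSIZE   1      /* bytes per opcode (value recommended in the text) */
#define OPNDSIZE 4      /* bytes per operand (assumed, as on the VAX) */
#define WORDSIZE 4      /* bytes per word (assumed, as on the VAX) */
#define MAXCODE  10000  /* size of the code section (VAX value) */

static unsigned char codeb[MAXCODE];   /* the code section (hypothetical name) */
static unsigned char *codep = codeb;   /* current location pointer */

/* Store the low-order n bytes of v, least significant byte first
 * (the byte order is an assumption), advancing the location pointer. */
static void emit(long v, int n)
{
    unsigned long u = (unsigned long)v;

    while (n-- > 0) {
        *codep++ = (unsigned char)(u & 0xff);
        u >>= 8;
    }
}

void outop(int op)      { emit((long)op, OPSIZE); }   /* add an opcode */
void outopnd(long opnd) { emit(opnd, OPNDSIZE); }     /* add an operand */
void outword(long w)    { emit(w, WORDSIZE); }        /* add a word */

/* Add a block of count bytes, copied verbatim. */
void outblock(const char *addr, int count)
{
    memcpy(codep, addr, (size_t)count);
    codep += count;
}

/* Offset of the next free location in the code section. */
long code_offset(void) { return (long)(codep - codeb); }

/* Byte stored at offset i, for inspection. */
int code_byte(long i) { return codeb[i]; }
```

With these definitions, emitting one opcode, one operand, one word, and
a 32-byte \*Mcset\fP block advances the location pointer by
1 + 4 + 4 + 32 = 41 bytes.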
338: .NH 2
339: Porting Procedure
340: .PP
341: Edit \*Milink.h\fP and search for the string \*MPORT\fP. Define
342: the following constants as described.
343: .IP \*MINTSIZE\fP
344: .br
345: The number of bits in an \*Mint\fP.
346: .IP \*MLOGINTSIZE\fP
347: .br
348: The base 2 log of \*MINTSIZE\fP.
349: That is,\^ \*MLOGINTSIZE\fP answers
350: the question ``\fIWhat power of 2 is \*MINTSIZE\fR\^?''.
351: .IP \*MLONGS\fP
352: .br
353: Icon has an integer data type. On the VAX and the PDP-11 the range
354: of integer values is \-2\u\s-231\s0\d to 2\u\s-231\s0\d-1. On the VAX,\^
355: C \*Mint\fPs and \*Mlong\fPs are both 32 bits wide. On the PDP-11,\^
356: C \*Mint\fPs are 16 bits wide while \*Mlong\fPs are 32 bits wide.
357: The PDP-11 Icon system makes an internal distinction between integers
358: that ``fit'' in 16 bits and integers that require 32 bits.
359: The former are stored in two-word descriptors (the actual value being
360: in the second of the two 16-bit words),\^ while the latter have a
361: value descriptor that points to a block in the heap that holds the
362: two-word,\^ 32-bit value. On the other hand,\^ the VAX uses two 32-bit words
363: for descriptors and thus the second word of a descriptor can hold
364: the largest possible integer value used by Icon.
365: Rather than having an internal distinction between integer
366: types on the VAX,\^ integers are always represented by two-word
367: integer descriptors. There are places in the code where special
368: provisions must be made if C \*Mint\fPs are not the same size
369: as C \*Mlong\fPs.
370: .sp
371: If \*Msizeof(int) != sizeof(long)\fP for the C compiler in use,\^
372: define \*MLONGS\fP. (\*MLONGS\fP need not be given a value;
373: \*M#define LONGS\fP is sufficient.)
374: If \*MLONGS\fP must be defined,\^ the minimum
375: and maximum values that can be represented by an \*Mint\fP must also
376: be defined. Define \*MMINSHORT\fP to be the smallest value that an
377: \*Mint\fP can hold and define \*MMAXSHORT\fP to be the largest value that
378: an \*Mint\fP can hold.
379: .IP \*MMAXCODE\fP
380: .br
381: This is the maximum size in bytes of the code that can be generated for each
382: procedure. This value is not critical; 10,\^000 is used for the VAX,\^
383: while 2000 is used for the PDP-11.
384: .IP \*Mstrchr\fP\ and\ \*Mstrrchr\fP
385: .br
386: If you are on a USG UNIX system,\^ \*M#define\fP \*Mindex\fP to be
387: \*Mstrchr\fP and \*Mrindex\fP to be \*Mstrrchr\fP.
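.PP
Candidate values for several of these constants can be obtained from
the compiler itself. The sketch below assumes that the header
\*Mlimits.h\fP is available, which may not be true of older compilers;
in that case the values must be taken from the compiler documentation.
All function names are hypothetical.

```c
#include <limits.h>

/* Number of bits in an int: a candidate value for INTSIZE. */
int int_size_bits(void)
{
    return (int)(sizeof(int) * CHAR_BIT);
}

/* Base 2 logarithm of n when n is an exact power of 2, otherwise -1.
 * Applied to INTSIZE, this gives a candidate value for LOGINTSIZE. */
int log2_exact(int n)
{
    int log = 0;

    if (n <= 0 || (n & (n - 1)) != 0)
        return -1;
    while (n > 1) {
        n >>= 1;
        log++;
    }
    return log;
}

/* Returns 1 if LONGS must be defined, that is, if int and long differ
 * in size. */
int needs_longs(void)
{
    return sizeof(int) != sizeof(long);
}

/* Candidate values for MINSHORT and MAXSHORT: the smallest and largest
 * values that an int can hold. */
long min_short(void) { return (long)INT_MIN; }
long max_short(void) { return (long)INT_MAX; }
```

For a 16-bit \*Mint\fP, \*Mlog2_exact(16)\fP yields 4; for a 32-bit
\*Mint\fP, \*Mlog2_exact(32)\fP yields 5. A result of \-1 means that
\*MINTSIZE\fP is not a power of 2, a case this document does not
address.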
388: .PP
389: Edit \*Mdatatype.h\fP and
390: search for the \*MPORT\fP section. This section contains
391: \*M#define\fPs that are used to set and test flags contained in the
392: first word of descriptors. The basic idea in forming these
393: constants is to set some bits at the high end of the word,\^ and set some
394: other bits at the low end. The number of unused bits in the middle
395: depends on the size of a word.
396: .PP
397: \*MF_NQUAL\fP,\^ \*MF_VAR\fP,\^ \*MF_TVAR\fP,\^ \*MF_PTR\fP,\^ \*MF_NUM\fP,\^
398: \*MF_INT\fP,\^ and \*MF_AGGR\fP should be set to mask values with one
399: bit set to 1 in each. For \*MF_NQUAL\fP,\^ the leftmost bit should be set;
400: for \*MF_VAR\fP,\^ the next to leftmost bit should be set,\^ and so
401: forth. The values for the VAX and PDP-11 should be suitable for
402: machines with 32-bit and 16-bit words,\^ respectively.
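.PP
For a word size other than 16 or 32 bits,\^ the masks can be derived
from the word size rather than written out by hand. The sketch below
assumes that an \*Munsigned int\fP occupies one descriptor word and
that the flags take successive bit positions in the order listed above;
the macros \*MWORDBITS\fP and \*MFLAGBIT\fP and the function
\*Mhas_flag\fP are hypothetical. The values actually supplied in
\*Mdatatype.h\fP for the VAX and PDP-11 remain the authoritative
reference.

```c
#include <limits.h>

/* Bits in a descriptor word; unsigned int is assumed to be one word. */
#define WORDBITS ((int)(sizeof(unsigned int) * CHAR_BIT))

/* Mask with only bit k set, counting from the left (k = 0 is leftmost). */
#define FLAGBIT(k) (1u << (WORDBITS - 1 - (k)))

#define F_NQUAL FLAGBIT(0)   /* leftmost bit */
#define F_VAR   FLAGBIT(1)   /* next to leftmost bit */
#define F_TVAR  FLAGBIT(2)
#define F_PTR   FLAGBIT(3)
#define F_NUM   FLAGBIT(4)
#define F_INT   FLAGBIT(5)
#define F_AGGR  FLAGBIT(6)

/* Returns 1 if the given flag is set in the first word of a descriptor. */
int has_flag(unsigned int word, unsigned int flag)
{
    return (word & flag) != 0;
}
```

Because each mask has exactly one bit set,\^ flags can be combined with
bitwise OR and tested independently of whatever occupies the low end of
the word.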
403: .PP
404: The constants
405: \*MOPSIZE\fP,\^ \*MOPNDSIZE\fP,\^ and \*MWORDSIZE\fP control the
406: sizes of opcodes,\^ operands,\^ and words in the code section.
407: Before setting these constants to values appropriate for the
408: target machine,\^ a ``standard'' linker should be built and tested using
409: the supplied values (under \*MPORT\fP) for these constants.
410: This allows the linker to be checked against output files that
411: are known to be correct. The purpose of this is to attempt to
412: discover C compiler problems. Compile the linker using \*Mmake\fP and
413: refer to
414: [5] for the testing procedure.
415: .PP
416: Once the ``standard'' linker has been checked out,\^ the following
417: ``sizing'' parameters in \*Milink.h\fP should be set to values
418: appropriate for the target machine.
419: .IP \*MOPSIZE\fP
420: .br
421: This is the size in bytes of interpreter opcodes. The interpreter treats
422: opcodes as unsigned quantities. One byte (8 bits) is currently
423: large enough to accommodate all opcodes and a value of 1 is recommended
424: for \*MOPSIZE\fP. The \*Moutop\fP routine in \*Mlcode.c\fP assumes that
425: opcodes are one byte. If a larger size is desired,\^ \*Moutop\fP
426: will have to be recoded. It might be wise to use a value other than
427: 1 for \*MOPSIZE\fP on machines that are not byte-addressable and have
428: ample memory.
429: .IP \*MOPNDSIZE\fP
430: .br
431: This is the size in bytes of operands for interpreter instructions.
432: For some instructions,\^ the operand value represents an offset
433: from the interpreter program counter and thus,\^ the maximum possible
434: offset is limited by the magnitude of values that can be represented
435: in \*MOPNDSIZE\fP bytes. Because larger operands occupy more code
436: space and smaller operands limit addressing ``distance'',\^ a trade-off
437: is involved. On the VAX,\^ operands are four bytes because memory
438: space is not very critical. On the PDP-11,\^ operands are two bytes
439: because of the limited memory. While it is easy to change the
440: value of \*MOPNDSIZE\fP in the linker,\^ the operand size is pervasive
441: in the interpreter. If the target machine has a large,\^ perhaps
442: virtual address space,\^ use a value such as 4 for \*MOPNDSIZE\fP. A
443: value such as 2 may be appropriate for a smaller machine. A value
444: of 1 is not advisable under any circumstances. The suggested value
445: for \*MOPNDSIZE\fP is \*Msizeof(int)\fP.
446: .IP \*MWORDSIZE\fP
447: .br
448: This should be set to \*Msizeof(int *)\fP on the target machine. The
449: various run-time data structures are all composed of a number
450: of words,\^ each of which contains \*MWORDSIZE\fP bytes. For example,\^
451: the data blocks for user-defined procedures are built in the code
452: section by a sequence of calls to \*Moutword\fP.
453: .PP
454: The \*Mbackpatch\fP routine in \*Mlcode.c\fP needs some
455: machine-specific modifications. This routine backpatches forward
456: references to ucode labels. In the \fIwhile\fP loop,\^ the operand (which is
457: \*MOPNDSIZE\fP bytes long) that is pointed at by \*Mq\fP is loaded
458: into the variable \*Mp\fP. Then,\^ the operand is replaced by the
459: value of \*Mr\fP. On the VAX,\^ this can be expressed as:
460: .Ds
461: p = *q;
462: *q = r;
463: .De
464: where \*Mq\fP is an \*Mint *\fP. This is possible because the VAX allows
465: word references on an arbitrary boundary. On the PDP-11,\^ such
466: references are illegal and the assignments must be made on a byte-wise
467: basis. If the target machine allows word accesses on arbitrary
468: boundaries,\^ the VAX code may be used (assuming \*MOPNDSIZE\fP is equal to
469: \*Msizeof(int)\fP).
470: If not,\^ but operands are the same size as \*Mint\fPs,\^
471: the PDP-11 code may be used.
472: Other situations may require ingenuity. Be sure to alter the first \*MPORT\fP
473: section in \*Mbackpatch\fP to contain an appropriate declaration for
474: \*Mq\fP (that section currently contains a declaration for \*Mq\fP and
475: a \*Mreturn\fP).
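.PP
For a machine that does not allow word references on arbitrary
boundaries,\^ the byte-wise approach can be sketched as follows. This is
not the PDP-11 code from \*Mlcode.c\fP. The helper names,\^ the byte
order,\^ the value of \*MOPNDSIZE\fP,\^ and the chain convention (each
unresolved operand holds the code-section offset of the next reference,\^
with 0 ending the chain) are assumptions made for illustration.

```c
#define OPNDSIZE 4   /* bytes per operand (assumed) */

/* Fetch an operand one byte at a time, least significant byte first.
 * Works at any alignment. No sign extension is performed, so this
 * sketch handles only non-negative operand values. */
long getopnd(const unsigned char *q)
{
    unsigned long v = 0;
    int i;

    for (i = OPNDSIZE - 1; i >= 0; i--)
        v = (v << 8) | q[i];
    return (long)v;
}

/* Store an operand one byte at a time, least significant byte first. */
void putopnd(unsigned char *q, long r)
{
    unsigned long u = (unsigned long)r;
    int i;

    for (i = 0; i < OPNDSIZE; i++) {
        q[i] = (unsigned char)(u & 0xff);
        u >>= 8;
    }
}

/* Walk a chain of forward references starting at offset first,
 * replacing every operand in the chain with r. */
void backpatch_chain(unsigned char *code, long first, long r)
{
    long q = first;
    long p;

    while (q != 0) {
        p = getopnd(code + q);    /* load the old operand: next link */
        putopnd(code + q, r);     /* replace it with the resolved value */
        q = p;
    }
}
```

The loop body corresponds to the two VAX assignments shown above,\^ with
each word reference replaced by a byte-wise helper.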
476: .PP
477: When the linker has been compiled,\^
478: refer to [5] for directions on
479: testing.
480: .nr $1 2
481: .nr $2 0
482: .NH 1
483: Porting the Icon Interpreter
484: .NH 2
485: Introduction
486: .PP
487: The Icon interpreter,\^ known as \fIiconx\fP,\^ is the third major logical
488: component of the system. The interpreter takes
489: interpretable files produced by the linker and ``executes'' them.
490: The interpreter is run by:
491: .Ds
492: iconx hello
493: .De
494: where \*Mhello\fP has been produced by the linker.
495: .PP
496: Due to the stack manipulations that the interpreter performs,\^ it is
497: necessary for a small portion of the interpreter to be written in
498: assembly language rather than in C. On the VAX,\^ about 550 lines of
499: assembly instructions are required. The coding of these assembly
500: instructions is the most difficult part of the
501: port.
502: .NH 2
503: Source File Layout
504: .LP
505: The interpreter is divided into four parts:
506: .DS
507: .ft R
508: start-up code
509: the main loop
510: primary subroutines
511: support subroutines
512: .DE
513: .LP
514: The start-up code initializes the interpreter and passes control
515: to the main loop. The main loop,\^ referred to as \*Minterp\fP,\^
516: fetches interpreter instructions
517: and executes them. An interpreter instruction may be entirely
518: performed by \*Minterp\fP or \*Minterp\fP may call a \fIprimary
519: subroutine\fP to perform the operation. In turn,\^ a primary subroutine
520: may call a number of \fIsupport subroutines\fP.
521: Each primary subroutine has a direct correspondence to a source language
522: operation of some type or to a stack manipulation.
523: .PP
524: While the translator and linker source files are in their own
525: directories,\^ the interpreter source files are segregated into several
526: directories.
527: .nr a \w'\*Moperators\fR'+1m
528: .IP \*Miconx\fP (\na)u
529: The start-up code and the main interpreter loop reside in this
530: directory. Files of particular interest are: \*Mstart.s\fP,\^ which
531: is entered when the interpreter is run and does some low-level
532: initialization; \*Minit.c\fP,\^ which is called from \*Mstart.s\fP and
533: completes initialization of the interpreter; and \*Minterp.s\fP,\^ which
534: is the interpreter loop itself.
535: .IP \*Mfunctions\fP (\na)u
536: This directory contains code for the built-in procedures.
537: For example,\^ \*Mwrite.c\fP contains
538: the source for the \*Mwrite\fP function. The source for each
539: built-in procedure appears in a file of its own.
540: .IP \*Moperators\fP (\na)u
541: This directory contains code for the Icon operators. The routines in
542: this directory
543: implement the various Icon source level operators. For
544: example,\^ \*Mplus.c\fP is called to perform the \*M+\fP (addition)
545: operation,\^ and \*Mbang.c\fP is called to perform the \*M!\fP (element
546: generation) operation. As with the built-in procedures,\^ there is
547: one operator per file.
548: .IP \*Mlib\fP (\na)u
549: This directory contains routines that
550: do not fit anywhere else. First of all,\^ there is
551: code for routines that perform actions similar in nature to those in
552: \*Mfunctions\fP and \*Moperators\fP,\^ but that do not have a functional or
553: operator syntax. For example,\^ \*Mllist.c\fP creates a list
554: that is specified syntactically as
555: \*M\^[\*(e0,\^\*(e1,\^\*(El,\^\*(en]\fR,\^ and \*Mfield.c\fP handles record element
556: accesses that arise from \*M\*(e1.\*(e2\fR.
557: .sp .8
558: \*Mlib\fP also contains routines such as \*Mesusp.s\fP and \*Mefail.s\fP
559: that handle stack manipulations during expression evaluation. The
560: routines \*Mpret.s\fP and \*Mpfail.s\fP handle procedure return and
561: failure respectively.
562: .sp .8
563: The directories \*Mfunctions\fP,\^ \*Moperators\fP,\^ and \*Mlib\fP
564: compose the primary subroutines mentioned above.
565: .IP \*Mrt\fP (\na)u
566: The support subroutines are contained in the \*Mrt\fP directory.
567: The primary subroutines are autonomous with respect to each other and
568: use the \*Mrt\fP routines for common operations. For example,\^
569: \*Mcvstr.c\fP is used to convert a value to a string,\^ \*Mtrace.c\fP
570: produces various types of tracing messages,\^ and \*Mgc.c\fP is the
571: garbage collector.
572: .IP \*Mh\fP (\na)u
573: This directory contains a number of header files that are
574: \*M#include\fPd in the other files that compose the interpreter.
575: Of particular interest is \*Mrt.h\fP,\^ which defines a number of
576: constants and data structures.
577: .NH 2
578: Overview of the Porting Process
579: .PP
580: The following steps are to be followed when porting the interpreter.
581: .Ls
582: .Np
583: Determination of layout of procedure,\^ generator,\^ and expression
584: markers and associated frame pointers.
585: .Np
586: Setting of implementation specific constants in \*Mh/rt.h\fP and
587: creation of \*Mh/defs.s\fP from \*Mrt.h\fP.
588: .Np
589: Complete system compilation.
590: .Np
591: Coding of a ``basis'' of routines for the interpreter,\^ consisting of
592: \*Miconx/start.s\fP,\^ \*Mrt/setbound.s\fP,\^ \*Mlib/invoke.s\fP,\^
593: \*Miconx/interp.s\fP,\^ \*Mlib/efail.s\fP,\^ and
594: \*Mlib/pfail.s\fP.
595: .Np
596: Testing of the basis routines for the interpreter.
597: .Np
598: Coding and testing of
599: .Ds
600: rt/arith.s
601: rt/fail.s
602: lib/pret.s
603: lib/esusp.s
604: lib/lsusp.s
605: lib/psusp.s
606: rt/suspend.s
607: functions/display.c
608: .De
609: in an incremental fashion. Test programs are provided
610: to test the system after adding each routine.
611: .Np
612: Coding of \*Mrt/gcollect.s\fP and \*Mrt/sweep.c\fP.
613: Testing of garbage collection.
614: .Np
615: Complete system testing.
616: .Le
617: .PP
618: This document does not explain how to port the sections of the system
619: that are related to co-expressions. The involved files are
620: \*Mlib/coact.s\fP,\^ \*Mlib/cofail.s\fP,\^ \*Mlib/coret.s\fP,\^
621: \*Mlib/create.c\fP,\^ and \*Moperators/refresh.c\fP. Icon
622: works properly with these sections of code left unimplemented,\^
623: provided no attempt is made to use co-expressions; any such attempt
624: is reported by the system as a fatal error.
625: .NH 2
626: Porting Procedure
627: .SH
628: Determination of Frame Layouts
629: .PP
630: Unfortunately,\^ one of the most far-reaching decisions that must be
631: made during the porting process is also one of the first decisions
632: that must be made. The decision (actually,\^ a number of decisions) is
633: how to lay out the procedure,\^ generator,\^ and
634: expression frames and what registers should be used as frame
635: pointers. The various frames and their usages are explained in
636: detail in \^[1] and the portions of this document that describe
637: routines that manipulate a particular frame also provide further
638: explanations. The porter should have a good
639: understanding of what the frames are used for before setting frame
640: layouts as they are pervasive throughout the assembly language
641: portions of the system.
642: .PP
643: This document is rather tightly bound to the VAX implementation of
644: Icon. Because of this,\^ the stack model that is used is that
645: of the VAX. Specifically,\^ the VAX stack starts in high memory and
646: grows downward. Thus,\^ when something is pushed on the stack,\^ the
647: stack pointer goes down. When something is removed,\^ the stack pointer
648: goes up.
649: The only time that this convention is departed from is in the use of
650: the phrase ``the top of the stack''. The top of the stack is the
651: stack word that has the \fIlowest\fP memory address.
652: .PP
653: The procedure frame layout is the first to be determined. The layout
654: is somewhat fixed by the C compiler and target machine,\^
655: so the task is a combination
656: of making a decision and also recognizing what has been pre-determined.
657: On most machines,\^ the
658: task of the porter is more one of recognition than of design.
659: .PP
660: The first thing to determine is the
661: frame layout imposed by the target machine and the C compiler.
662: Create a file containing the following
663: .Ds
664: f()
665: {
666: x(1,\^2);
667: }
668: .De
669: Compile the file using \fIcc\fP
670: so that the generated assembly code is captured
671: in a file. The \*M\-S\fP option of \fIcc\fP should cause
672: assembly code to be placed in a file. On the VAX,\^ the code
673: generated for \*Mx(1,\^2)\fP is
674: .Ds
675: .ta .7i
676: pushl $2
677: pushl $1
678: calls $2,\^_x
679: .De
680: From this it can be seen that arguments are pushed on the stack
681: using the \*Mpushl\fP instruction,\^ and that the \*Mcalls\fP
682: instruction does the actual procedure call. The first argument
683: to \*Mcalls\fP is the number of arguments that are on the stack.
684: When a return is made from a procedure called with a \*Mcalls\fP
685: instruction,\^ the arguments are removed from the stack by the return
686: mechanism. On some machines,\^ the removal of arguments after a subroutine
687: call is left to the programmer (or code generator,\^ in this case).
688: This is usually done by adding a value to the stack pointer or
689: incrementing the stack pointer several times.
690: .PP
691: Examine the assembly code produced on the target machine by the
692: given C statements. Determine what actions are taken by the machine
693: when the appropriate call instruction is performed. It is important
694: to understand completely what the target machine does
695: when a call is performed.
696: Next,\^ determine what sort of procedure frame is used by C routines.
697: Compile the following C function using \*M\-S\fP.
698: .Ds
699: .ta .7i
700: f(a,\^b,\^c)
701: int a; char b; char *c;
702: {
703: int x,\^y;
704:
705: x = a;
706: a = 1;
707: y = 2;
708: }
709: .De
710: Look at the generated
711: code and try to get a feel for what is going on. The things that
712: need to be determined are:
713: .Ds
714: .ft R
715: how arguments are accessed
716: the format of the C call frame
717: register saving and restoring conventions
718: .De
719: For example,\^ on the VAX,\^ the following code is generated
720: for the test procedure.
721: .Ds
722: .Pt
723: .word L12 register save mask,\^ filled in later
724: jbr L14 jump to end to make stack space
725: L15:
726: movl 4(ap),\^-4(fp) x = a
727: movl $1,\^4(ap) a = 1
728: movl $2,\^-8(fp) y = 2
729: ret return
730: .set L12,\^0x0 set register mask
731: L14:
732: subl2 $8,\^sp make room for two local variables
733: of four bytes each
734: jbr L15 jump to start of routine
735: .De
736: Several inferences can be made. First of all,\^ arguments are accessed
737: relative to \*Map\fP,\^ the argument pointer. Secondly,\^ local variables
738: are accessed relative to \*Mfp\fP,\^ the frame pointer. On the VAX,\^
739: because of the hardware register save and restoration based on the
740: entry mask (the first word of the routine),\^ no subroutine calls are
741: required to save registers.
742: .PP
743: The Icon procedure frame must have the following attributes:
744: .Ls
745: .Np
746: The values on the stack at the time of call to the procedure appear
747: as arguments to the procedure. Furthermore,\^ the values must be
748: accessible in a deterministic fashion.
749: .Np
750: Register values are saved in the frame and can be accessed deterministically.
751: .Np
752: \*M_line\fP and \*M_file\fP appear in the procedure frame just below
753: the last word pushed on the stack as part of the C procedure calling
754: protocol.
755: .Np
756: The region for local variables begins at the lower end of the
757: ``constant'' portion of the frame. Local variables must
758: be accessible via deterministic means.
759: .Np
760: The procedure frame created by a C procedure call must be a subset
761: of the procedure frame selected. That is,\^ the Icon procedure frame
762: must be an augmentation of the C procedure frame.
763: .Le
764: .LP
765: The VAX uses this procedure frame layout:
766: .Ds
767: .St
768: .ft R
769: arguments
770: 4 number of arguments (\*Mnargs\fR)
771: \*Map\fR \*(ar 0 number of words in argument list (\*Mnwords\fR)
772: saved \*Mr11\fR (\*Mefp\fR)
773: saved \*Mr10\fR (\*Mgfp\fR)
774: \*(El
775: last saved register
776: 16 saved \*Mpc\fR
777: 12 saved \*Mfp\fR
778: 8 saved \*Map\fR
779: 4 program status word and register mask
780: \*Mfp\fR \*(ar 0 0 (condition handler address)
781: -4 saved value of \*M_line\fR
782: -8 saved value of \*M_file\fR
783: \*Msp\fR \*(ar Icon local variables
784: .De
785: .PP
786: Actually,\^ on the VAX,\^ most of the decisions are predetermined by
787: the VAX architecture. The arguments are present on the stack,\^ so they
788: are the high end of the frame. The registers are saved on the stack
789: by the \*Mcalls\fP instruction. The values of \*M_line\fP and \*M_file\fP
790: naturally fit after the saved registers. The locals then appear on
791: the lower end and extend for a variable distance (on a per-procedure
792: basis). Note that the first local is at \*M\-16(fp)\fP and the \fIlast\fP
793: argument is at \*M8(ap)\fP.
794: .PP
795: The VAX hardware takes care of saving and restoring registers upon
796: subroutine entry and exit. It is quite possible that the target
797: machine will not have this capability and the task must be delegated
798: to software. This is usually evidenced by a call to a routine with a
799: name such as \*Mcsave\fP as the very first thing in the routine and a
800: call to a routine with a name such as \*Mcrestore\fP at the end of a
801: routine. If this is the case,\^ the actions of
802: the saving and restoring routines must be taken into account when determining
803: the procedure frame layout.
804: .PP
805: In addition to determining the procedure frame layout,\^ a procedure
806: frame pointer must also be selected. On the VAX,\^
807: the \*Mfp\fP stays constant throughout execution of a C procedure;
808: it is used as the procedure frame pointer. For the target machine,\^
809: there should be some register on which references to local variables
810: (and perhaps parameters) are based. That register should be used
811: as the procedure frame pointer (sometimes referred to as the
812: \*Mpfp\fP).
813: The \*Mpfp\fP need not point at the lowest word pushed on the stack
814: as part of the procedure call; it only needs to be constant while
815: a procedure is executing. Of course,\^ the \*Mpfp\fP changes while the
816: program is executing; by ``pointing at'' a particular word,\^ it is
817: meant that the \*Mpfp\fP always references a certain word in the
818: procedure frame marker. An \*Mrt.h\fP constant,\^ \*MFRAMELIMIT\fP,\^ is
819: dependent on the number of words between the lowest word of the
820: procedure marker and the word that the \*Mpfp\fP points to. Setting
821: \*MFRAMELIMIT\fP is described below.
822: .PP
823: A point about terminology should be stressed. The procedure frame marker is
824: bounded by arguments on one end and the Icon local variables on the other.
825: A procedure marker,\^ the arguments,\^ the Icon
826: local variables,\^ and the stack below
827: the local variables compose a procedure frame.
828: .\"\^[Note in here about variable size of marker,\^ forced saving of efp,\^ gfp.,\^
829: .\"ap being needed]
830: .PP
831: Determining the procedure frame layout
832: is by no means a mechanical process. It takes work,\^ but once
833: it is successfully set,\^ the single hardest task of the port is complete.
834: .PP
835: Once the procedure frame has been set,\^ the generator frame layout follows
836: rather easily. A generator frame is merely an augmented procedure
837: frame. The generator frame has two additional pieces of information:
838: a saved value of \*M_k_level\fP and a saved value for the boundary.
839: It is recommended that the generator frame
840: be identical to a procedure frame except that the two extra words
841: required be located between the lowest word that is pushed on the
842: stack by the procedure call mechanism and the saved value of \*M_line\fP.
843: Thus,\^ on the VAX,\^ the generator frame \fImarker\fP is
844: .Ds
845: .ft R
846: .St
847: saved \*Mr11\fR
848: saved \*Mr10\fR
849: \*(El
850: last saved register
851: 20 reactivation address
852: 16 saved \*Mfp\fR
853: 12 saved \*Map\fR
854: 8 program status word and register mask
855: 4 0 (condition handler address)
856: \*Mgfp\fR \*(ar 0 saved value of the boundary
857: -4 saved value of \*M_k_level\fR
858: -8 saved value of \*M_line\fR
859: -12 saved value of \*M_file\fR
860: .De
861: Note that instead of a saved \*Mpc\fR value,\^ the generator frame marker
862: holds a reactivation address. Control passes to this address when
863: the generator is reactivated. Reactivation is fully explained in
864: later sections.
865: .PP
866: A generator frame pointer (\*Mgfp\fP) is associated with a generator
867: frame. On the VAX,\^ \*Mr10\fP is the \*Mgfp\fP. The choice of a \*Mgfp\fP
868: is indirectly determined by the machine architecture and is
869: intertwined with the selection of an expression frame pointer.
870: The selection of the register to use as the \*Mgfp\fP is discussed below.
871: It is recommended that the \*Mgfp\fP point
872: at the word containing the saved boundary value.
873: .PP
874: The third type of frame marker is the expression frame marker.
875: Expression frame markers are totally machine independent and contain
876: three pieces of information: a saved expression marker address,\^
877: a saved generator marker address,\^ and a failure label that is to
878: be given control in certain circumstances. On the VAX,\^ the
879: expression marker layout is
880: .Ds
881: .ft R
882: .St
883: \*Mefp\fR \*(ar 0 saved \*Mefp\fR value
884: -4 saved \*Mgfp\fR value
885: -8 failure address
886: .De
887: This same format should be used on the target machine; there is no
888: apparent reason for needing an alternative format. The expression
889: frame pointer (\*Mefp\fP) should point at the high word of the
890: expression marker.
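.PP
As a rough illustration of the layout above,\^ the following C sketch
pushes an expression marker onto a stack that grows toward lower
addresses. It is not taken from the Icon source: the names
\*Mword\fP and \*Mpush_marker\fP are hypothetical,\^ and the saved
values are passed as plain words to keep the example small.
.Ds
typedef long word;

/* Push an expression marker.  *sp points at the last word pushed;
   the stack grows toward lower addresses.  Returns the new efp. */
word *push_marker(word **sp, word old_efp, word old_gfp, word failure)
{
    word *efp;

    *--(*sp) = old_efp;   /* high word: saved efp value */
    efp = *sp;            /* efp points at the high word */
    *--(*sp) = old_gfp;   /* next word down: saved gfp value */
    *--(*sp) = failure;   /* low word: failure address */
    return efp;
}
.De
With four-byte words,\^ \*Mefp[\-1]\fP and \*Mefp[\-2]\fP in this sketch
correspond to the offsets \*M\-4\fP and \*M\-8\fP shown in the VAX layout.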
891: .PP
892: The registers that should be used for the \*Mgfp\fP and \*Mefp\fP are
893: indirectly dependent on the procedure call mechanism. The primary
894: requirement for the registers used as the \*Mefp\fP and \*Mgfp\fP is that
895: they are saved across procedure calls. The secondary requirement is
896: that the \*Mgfp\fP and \*Mefp\fP always be saved in a procedure frame.
897: If the target machine has two general purpose registers that are
898: always saved in a procedure frame,\^ those two registers are quite
899: suitable for the \*Mgfp\fP and \*Mefp\fP.
900: .PP
901: If the procedure call mechanism does not always save a pair of general
902: purpose registers,\^ the problem is more complicated.
903: Some of the stack manipulations that are performed
904: \fIrequire\fP saved values of \*Mefp\fP and \*Mgfp\fP to be present
905: in procedure and generator frames.
906: For built-in procedures and Icon procedures
907: this is no problem because \*Minvoke\fP creates the procedure frame
908: for them and can ensure that the registers are saved. On the VAX,\^ for the C
909: routines that are directly called from \*Minterp\fP,\^ no such
910: assurances can be made because the VAX C compiler directs
911: only the registers used in a routine to be saved in the C procedure
912: frame. This creates a problem because Icon counts on the registers
913: being saved. The problem is countered by making the C compiler think
914: that certain registers are used in certain routines. Specifically,\^
915: declarations for a pair of \*Mregister int\fP variables are placed at
916: the start of appropriate routines. On the VAX,\^ the first two local
917: variables declared in a C routine \fIalways\fP get allocated to \*Mr10\fP and
918: \*Mr11\fP. Thus,\^ \*Mr10\fP and \*Mr11\fP are used for the \*Mgfp\fP
919: and the \*Mefp\fP respectively. If the target machine is like
920: the VAX in that it doesn't always save certain registers,\^ a similar
921: tactic may need to be used. If this is the case,\^ try compiling a
922: routine with a pair of \*Mregister int\fP variables declared and see
923: what the compiler does. If the compiler saves the two registers
924: assigned to the variables,\^ use those registers for the \*Mgfp\fP and
925: the \*Mefp\fP. It is wise to verify that the compiler is
926: deterministic in its choice of registers to allocate to the
927: variables. Routines that require this ruse to be employed have a line
928: containing the string \*MDclSave\fP as the first line of the
929: declarations. \*MDclSave\fP is defined in \*Mrt.h\fP and should be
930: set to an appropriate value. It may be the case that no registers
931: need to be saved. If so,\^ define \*MDclSave\fP,\^ but specify no value.
932: This is done for the PDP-11.
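.PP
The shape of this ruse can be sketched as follows. The variable names
\*Msaveefp\fP and \*Msavegfp\fP and the routine \*Mexample\fP are
assumptions made for illustration; the value actually needed for
\*MDclSave\fP depends on the target compiler,\^ and a compiler that
ignores \*Mregister\fP declarations will not be influenced by it.
.Ds
/* Hypothetical definition: declare two register variables in
   every routine whose declarations begin with DclSave. */
#define DclSave register int saveefp, savegfp;

int example(int n)
{
    DclSave               /* first line of the declarations */
    int result;

    result = n + 1;       /* ordinary body; the two variables are unused */
    return result;
}
.De
Where no registers need to be forced into the frame,\^ the same source
compiles unchanged if \*MDclSave\fP is defined with no value.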
933: .PP
934: It is also necessary to select a register to use as the interpreter
935: program counter (\*Mipc\fP). Any general register that is preserved
936: across procedure calls is suitable. The VAX uses \*Mr9\fP for
937: the \*Mipc\fP.
938: .NH 2
939: Machine and System Specific Values
940: .PP
941: Edit \*Mh/rt.h\fP and search for the first \*MPORT\fP section.
942: Define the various constant values as outlined below.
943: .IP \*MMAXHEAPSIZE\fP
944: .br
945: The size of the heap storage region in bytes. The VAX uses 50k and
946: the PDP-11 uses 10k. If you have a small machine,\^ use 10k. Larger
947: machines should use larger values,\^ such as that for the VAX.
948: .IP \*MMAXSTRSPACE\fP
949: .br
950: The size of the string storage region in bytes. As with
951: \*MMAXHEAPSIZE\fP,\^ this value is somewhat arbitrary. A value similar
952: to that used for the heap size should be used.
953: .IP \*MSTACKSIZE\fP
954: .br
955: The size of co-expression stacks in words. Use 1000 for smaller
956: machines,\^ 2000 for larger ones.
957: .IP \*MMAXSTACKS\fP
958: .br
959: The number of co-expression stacks initially allocated. Use 2 for
960: smaller machines,\^ 4 for larger ones.
961: .IP \*MNUMBUF\fP
962: .br
963: The number of i/o buffers available. When a file is opened,\^ a buffer
964: is assigned to the file if one is available. A value from 5 to
965: 10 is recommended.
966: .IP \*MINTSIZE\fP
967: .IP \*MLOGINTSIZE\fP
968: .IP \*MLONGS\fP
969: .IP \*MMINSHORT\fP
970: .IP \*MMAXSHORT\fP
971: .br
972: These constants must be set to the values they were given (if any) in
973: \*Mlink/ilink.h\fP.
974: .IP \*MMINLONG\fP
975: .br
976: The smallest value that can be represented in a \*Mlong\fP.
977: .IP \*MMAXLONG\fP
978: .br
979: The largest value that can be represented in a \*Mlong\fP.
980: .IP \*MLGHUGE\fP
981: .br
982: The highest base-10 exponent plus 1 representable by a \*Mfloat\fP.
983: For example,\^ on the VAX,\^ the highest number representable by a \*Mfloat\fP
984: is about 1.7x10\u38\d. Thus,\^ \*MLGHUGE\fP is 39 on the VAX.
985: .IP \*MFRAMELIMIT\fP
986: .br
987: As discussed above,\^ set \*MFRAMELIMIT\fP to the number of words
988: between the low word of the procedure frame marker and the word that
989: the procedure frame pointer references.
990: .IP \*MSTKBASE\fP
991: .br
992: This value represents the approximate base of the stack when
993: execution begins. On machines such as the VAX,\^ where the stack grows
994: down from high memory,\^ \*MSTKBASE\fP should have a high value,\^ whereas
995: on machines where the stack grows up from low memory,\^ \*MSTKBASE\fP
996: should have a low value. The \fIman\fP page for \fIexec(2)\fP usually
997: specifies the initial value for the stack pointer when program
998: execution begins. If uncertain,\^ be extreme with the value.
999: .IP \*MGRANSIZE\fP
1000: .br
1001: The granularity of memory allocations. Calls to \fIbrk(2)\fP are
1002: used to expand the main memory that is being used. When \fIbrk\fP
1003: is given an address to expand to,\^ it rounds it to a multiple of
1004: a certain number. That number should be used for \*MGRANSIZE\fP.
1005: The \fIman\fP page for \fIbrk(2)\fP should state what value is used
1006: on a particular system.
1007: .IP \*MDclSave\fP
1008: .br
1009: Give \*MDclSave\fP the value needed as previously described.
1010: .IP \*MEntryPoint(x)\fP
1011: .br
1012: \*MEntryPoint\fP is a macro that is used to yield the address of the
1013: first instruction of the C routine \*Mx\fP that is past any procedure
1014: entry protocol instructions. On the VAX,\^ the register mask is two
1015: bytes long and thus the first executable instruction of a routine \*Mx\fP
1016: is at \*M(char *)x + 2\fP. On the PDP-11,\^ there is a four-byte instruction
1017: at the start of each routine that calls the routine \*Mcsv\fP to
1018: save registers and establish the procedure frame. Thus for the
1019: PDP-11,\^ \*MEntryPoint(x)\fP is \*M(char *)x + 4\fP. Values calculated
1020: by \*MEntryPoint\fP are used in \*Minvoke\fP.
1021: .IP \*MDummyFcn(name)\fP
1022: .br
1023: Initially,\^ each of the assembly language
1024: portions of the system that must be filled in
1025: consists of a single line of the form \*MDummyFcn(name)\fP.
1026: \*MDummyFcn\fP should be defined to generate \fIassembly\fP language
1027: statements that
1028: form a dummy routine with the label \*Mname\fP. This can be as
1029: simple as a label and a global declaration. It is advisable to include
1030: as part of the definition something that will cause a program abort.
1031: A halt instruction usually does the job. Thus,\^ the system can be
1032: built and will function normally unless an incomplete routine is
1033: called.
1034: .IP \*MDummyDcl(x)\fP
1035: .br
1036: A macro that should expand into an assembly language declaration that
1037: allocates a word of storage for a variable named \*Mx\fP.
1038: .IP \*MDummyRef(x)\fP
1039: .br
1040: A macro that should expand into an assembly language reference to the
1041: variable \*Mx\fP.
1042: .IP \*MGlobal(x)\fP
1043: .br
1044: A macro that should expand into an assembly language
1045: declaration of \*Mx\fP as a global symbol.
1046: .IP \*Mfp\fP,\^\ \*Mefp\fP,\^\ \*Mgfp\fP,\^\ \*Mipc\fP
1047: .br
1048: It is advisable to use \*M#define\fPs for these registers rather than
1049: explicitly name them in the code.
1050: .IP \*Mcset_display\fP
1051: .br
1052: This is a rather complicated macro that is used to initialize the
1053: values of \*Mcset\fPs such as \*M&cset\fP and \*M&lcase\fP. If the
1054: target machine has \*Mint\fPs with 32 or 16 bits,\^ then the VAX or
1055: PDP-11 definition (respectively) of \*Mcset_display\fP may be used.
1056: If this is not the case,\^ \*Mcset_display\fP will have to be
1057: hand-crafted and the various uses of it will have to be altered for
1058: the machine in question. Briefly,\^ a \*Mcset_display\fP specifies
1059: which of the 256 bits that make up a cset are to be set to 1.
1060: For example,\^ the \*Mcset_display\fP for \*M&cset\fP has all the bits
1061: set to 1,\^ while \*M&ascii\fP has the first 128 bits set to 1.
1062: \*Mcset\fPs are accessed using the \*Msetb\fP and \*Mtstb\fP macros,\^
1063: which are also defined in \*Mrt.h\fP.
1064: \*Mcset_display\fPs appear in \*Miconx/init.c\fP,\^
1065: \*Mfunctions/bal.c\fP,\^ and \*Mfunctions/trim.c\fP. It may also be
1066: necessary to modify the definitions of \*MCSETSIZE\fP,\^ \*Msetb\fP,\^
1067: and \*Mtstb\fP.
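.LP
To make the bit-vector idea behind \*Mcset_display\fP concrete,\^ here
is a minimal sketch of cset access for a machine with 32-bit \*Mint\fPs.
The macro bodies are assumptions written for this illustration,\^ not
copies of the definitions in \*Mrt.h\fP; in particular,\^ the bit order
within a word is a guess,\^ and \*Mmake_ascii\fP is a hypothetical helper.
.Ds
#define INTSIZE  32                  /* assumed number of bits in an int */
#define CSETSIZE (256 / INTSIZE)     /* ints needed to hold 256 bits */

/* Set bit b of cset c, and test bit b of cset c (yields 0 or 1). */
#define setb(b, c)  ((c)[(b) / INTSIZE] |= 1u << ((b) % INTSIZE))
#define tstb(b, c)  (((c)[(b) / INTSIZE] >> ((b) % INTSIZE)) & 1u)

/* Build a cset with the first 128 bits set, as described for &ascii. */
void make_ascii(unsigned int cs[CSETSIZE])
{
    int i;

    for (i = 0; i < CSETSIZE; i++)
        cs[i] = 0;
    for (i = 0; i < 128; i++)
        setb(i, cs);
}
.De
On a machine with a different \*Mint\fP size,\^ only \*MINTSIZE\fP would
change in this sketch,\^ provided that 256 is a multiple of it.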
1068: .PP
1069: Search for the second \*MPORT\fP section.
1070: \*MF_NQUAL\fP,\^ \*MF_VAR\fP,\^ \*MF_TVAR\fP,\^ \*MF_PTR\fP,\^ \*MF_NUM\fP,\^
1071: \*MF_INT\fP,\^ and \*MF_AGGR\fP should be given the
1072: same values they have in \*Mlink/datatype.h\fP.
1073: .PP
1074: Once \*Mrt.h\fP has been completed,\^ an analogous file,\^ \*Mh/defs.s\fP,\^
1075: must be tailored. \*Mdefs.s\fP is a subset of \*Mrt.h\fP that is
1076: included in assembly language files. The \*MPORT\fP section of \*Mdefs.s\fP
1077: lists a number of constants that must be defined. Use the appropriate
1078: values from \*Mrt.h\fP for each constant. If all assemblers used a
1079: default radix of 10 for constants,\^ it would be possible to tailor
1080: \*Mdefs.s\fP mechanically,\^ but since this is not the case,\^
1081: \*Mdefs.s\fP must be modified by hand.
1082: .NH 2
1083: Complete System Compilation
1084: .PP
1085: In order to determine if there are serious C compiler problems with
1086: the interpreter source,\^ the entire system should be made at this
1087: point. Do a
1088: .Ds
1089: make Icon
1090: .De
1091: in the root directory of the Icon distribution.
1092: The entire
1093: system should compile without any problems. The resulting
1094: interpreter will be completely dysfunctional,\^ but if it is built without
1095: any problems,\^ it provides further evidence that the C compiler is
1096: up to the task.
1097: .NH 2
1098: Porting the Assembly Language Routines
1099: .PP
1100: The porting of the assembly language routines is the most difficult part of
1101: porting Icon. This document has a section for each assembly language routine
1102: and each routine is described in three ways:
1103: .Ds
1104: .ft R
1105: overview
1106: generic operation
1107: the routine on the VAX
1108: .De
1109: .PP
1110: The overview section briefly describes the action of the routine
1111: and how the routine may be encountered during the course of execution.
1112: The generic operation section tells what steps the routine
1113: takes to perform its given task. Each major step that the routine takes
1114: is described. These steps should be very similar from machine to
1115: machine.
1116: The section about the routine on the VAX details the
1117: operation of the routine on the VAX. This section complements
1118: the comments contained in the source code for the routine and should
1119: be read with the source code at hand.
1120: This section is very machine specific. (Ideally there would be one
1121: such section for each existing Icon implementation.)
1122: .PP
1123: Each routine must be formulated for the target machine. For the most
1124: part,\^ the best approach is to take the same steps that are
1125: taken on the VAX. It is important to select the right level for
1126: modeling the VAX routines. Try to recognize the steps that are
1127: made rather than following the operations on a per-instruction basis.
1128: The most important thing is to have a good understanding of what actions
1129: are performed and how these can be done on the target machine.
1130: .PP
1131: The first goal is to get a very simple Icon program working. This
1132: first program is \*Mtest/hello.icn\fP. It is quite short:
1133: .Ds
1134: procedure main()
1135: write("hello world")
1136: end
1137: .De
1138: The basic routines mentioned above
1139: (\*Mstart.s\fP,\^ \*Msetbound.s\fP,\^ \*Minvoke.s\fP,\^ \*Minterp.s\fP,\^
1140: \*Mefail.s\fP,\^ and \*Mpfail.s\fP)
1141: must be implemented for even a very simple Icon program to work.
1142: However,\^ not all of these
1143: routines need to be written to make \*Mhello\fP \fIbegin\fP to work.
1144: .PP
1145: Translate and link \*Mhello\fP by running the translator and the
1146: linker:
1147: .Ds
1148: tran/itran hello.icn
1149: link/ilink hello.u1
1150: .De
1151: This creates an interpretable file named \*Mhello\fP. Just to
1152: get the feel of things,\^ run the interpreter on the file:
1153: .Ds
1154: iconx/iconx hello
1155: .De
1156: A message of some type and a core dump should be produced.
1157: .PP
1158: The files \*Mtran/itran\fP,\^ \*Mlink/ilink\fP,\^ and \*Miconx/iconx\fP
1159: are copied into the \*Mbin\fP directory as the last action of
1160: .Ds
1161: make Icon
1162: .De
1163: in the root directory.
1164: The porter may find it convenient to link
1165: these files to the \*Mbin\fP directory and then place its full pathname
1166: in the search path. It is necessary to
1167: remove \*Mitran\fP,\^ \*Milink\fP,\^ and \*Miconx\fP in the \*Mbin\fP
1168: directory before linking them. Also,\^ if the files are linked,\^ the last
1169: step of \fImake\fP in the root directory will fail. This failure is
1170: inconsequential.
1171: .PP
1172: As \*Mstart.s\fP et al. are written,\^ try stepping through them to be
1173: sure the correct actions are being performed. Most of the assembly
1174: language source
1175: files are straight line code with a branch or two and it is possible to
1176: do a large amount of verification of the assembly code by single stepping
1177: through it with a debugger.
1178: .PP
1179: When a routine has been completed,\^ it may be added to the interpreter
1180: by doing a \fImake\fP in the directory containing the routine and
1181: then doing another \fImake\fP in the directory \*Miconx\fP.