Annotation of 43BSD/contrib/icon/docs/tr83-10a.roff, revision 1.1.1.1

1.1       root        1: .de Ls
                      2: .RS
                      3: .nr L 0 1
                      4: ..
                      5: .de Le
                      6: .RE
                      7: .LP
                      8: ..
                      9: .de Np
                     10: .IP (\\n+L) .25i
                     11: ..
                     12: .ds ar \v'2p'\s18\(->\s0\v'-2p'
                     13: .ds sd \s8\v'.2m'\h'-0.4n'
                     14: .ds su \v'-.2m'\s0
                     15: .ds ex \fIarg\fP
                     16: .ds e1 \fIarg\*(sd1\*(su\fP
                     17: .ds e2 \fIarg\*(sd2\*(su\fP
                     18: .ds e3 \fIarg\*(sd3\*(su\fP
                     19: .ds e4 \fIarg\*(sd4\*(su\fP
                     20: .ds e5 \fIarg\*(sd5\*(su\fP
                     21: .ds ei \fIarg\*(sdi\*(su\fP
                     22: .ds en \fIarg\*(sdn\*(su\fP
                     23: .ds e0 \fIarg\*(sd0\*(su\fP
                     24: .ds xx \fIexpr\fP
                     25: .ds x1 \fIexpr\*(sd1\*(su\fP
                     26: .ds x2 \fIexpr\*(sd2\*(su\fP
                     27: .ds x3 \fIexpr\*(sd3\*(su\fP
                     28: .ds x4 \fIexpr\*(sd4\*(su\fP
                     29: .ds x5 \fIexpr\*(sd5\*(su\fP
                     30: .ds xi \fIexpr\*(sdi\*(su\fP
                     31: .ds xn \fIexpr\*(sdn\*(su\fP
                     32: .ds x0 \fIexpr\*(sd0\*(su\fP
                     33: .ds v0 \fIvar\fP
                     34: .ds v1 \fIvar\*(sd1\*(su\fP
                     35: .ds v2 \fIvar\*(sd2\*(su\fP
                     36: .ds v3 \fIvar\*(sd3\*(su\fP
                     37: .ds vi \fIvar\*(sdi\*(su\fP
                     38: .ds vn \fIvar\*(sdn\*(su\fP
                     39: .ds ax \fIarg\fP
                     40: .ds a1 \fIarg\*(sd1\*(su\fP
                     41: .ds a2 \fIarg\*(sd2\*(su\fP
                     42: .ds a3 \fIarg\*(sd3\*(su\fP
                     43: .ds a4 \fIarg\*(sd4\*(su\fP
                     44: .ds a5 \fIarg\*(sd5\*(su\fP
                     45: .ds ai \fIarg\*(sdi\*(su\fP
                     46: .ds an \fIarg\*(sdn\*(su\fP
                     47: .ds a0 \fIarg\*(sd0\*(su\fP
                     48: .de St
                     49: .ta 1.0iR +.5i 4i
                     50: ..
                     51: .de S1
                     52: .ta 0.75i
                     53: ..
                     54: .de Pt
                     55: .ta 0.8i +0.8i +0.8i +0.8i +0.8i +0.8i +0.8i +0.8i
                     56: ..
                     57: .TR 83-10d
                     58: .DA "June 1983; Revised July 1983,\^ January 1984,\^ June 1984,\^ and August 1984"
                     59: .Gr
                     60: .TL
                     61: Porting the UNIX Implementation of Icon
                     62: .AU
                     63: William H. Mitchell
                     64: .AB
                     65: This document explains how to port the UNIX implementation of the
                     66: Icon programming language.  The Icon system is composed of a translator,\^
                     67: a linker,\^ and an interpreter.  Procedures for porting each system
                     68: component are described in detail.  This document is meant to be a
                     69: companion to the Icon ``tour'' (TR 84-11) and the source code for
                     70: the system.
                     71: .AE
                     72: .SH
                     73: Introduction
                     74: .PP
                     75: This document describes how to port the Version 5 Icon interpreter
                     76: to a \*U environment.
                     77: .Un
                     78: Both an interpreter and a compiler are available for Icon; this
                     79: document addresses only the porting of the interpreter.
                     80: The Icon system has three major components:
                     81: a translator,\^ a linker,\^ and an interpreter.  The translator and
                     82: the linker are entirely written in C and porting them is merely a
                     83: matter of setting constant values that are appropriate for the target
                     84: machine.  Portions of the interpreter are written in assembly language and
                     85: thus must be written anew for each machine.  The interpreter also
                     86: contains a very small amount of C code that must be written on a
                     87: per-machine basis.
                     88: .PP
                     89: The sections of this document that describe the porting of the
                     90: translator and the linker are straightforward,\^ being merely a
                     91: description of a process.  While porting the translator and the
                     92: linker is a task of following instructions,\^ porting the interpreter
                     93: is a task of design and programming.  The approach taken is to describe
                     94: what function each routine must perform and how it is implemented in
                     95: the VAX\u\(dg\d version of Icon.  The porter's job is to determine how to
                     96: .FS
                     97: \u\(dg\dVAX is a trademark of Digital Equipment Corporation.
                     98: .FE
                     99: implement the various routines on the target machine.
                    100: .PP
                    101: In light of
                    102: the increasing popularity of the C language and the availability of C
                    103: compilers for non-UNIX environments,\^ it is quite possible that one may
                    104: desire to port Icon to a non-UNIX environment.
                    105: Because the matter of porting a UNIX program to
                    106: a non-UNIX environment is a problem in itself,\^ it is not addressed in
                    107: this document.  Rather,\^ this document assumes that the target
                    108: environment is UNIX.  This is not to say that porting Icon to a non-UNIX
                    109: environment is not feasible.  Icon is not strongly bound to UNIX,\^ the
                    110: primary association being that Icon is written in C.  It is
                    111: anticipated that most C systems that are available for a non-UNIX
                    112: environment will provide most of the UNIX-independent C standard
                    113: functions as part of a library.  If such a library is available,\^
                    114: it should be possible to port Icon without great difficulty.
                    115: .PP
                    116: This document is a companion to the Icon
                    117: ``tour''\^[1] and should be studied with the source code for
                    118: Version 5.8 of Icon at hand.
                    119: In particular,\^ the porter should be familiar with the information
                    120: contained in the tour.
                    121: .PP
                    122: The sections of this document that describe the VAX assembly language
                    123: code attempt to explain the operation of instructions when the
                    124: operation is not obvious.  However,\^ this document does
                    125: assume that the porter has
                    126: a rudimentary familiarity with the basic concepts of the VAX-11
                    127: architecture\^[2].
                    128: .SH
                    129: C Compiler Requirements
                    130: .PP
                    131: Because there is no
                    132: standard for the C programming language,\^ it is difficult to say how
                    133: ``standard'' the usage of C in the system is.  The system was
                    134: developed using the V7 C compiler,\^ often referred to as the Ritchie
                    135: compiler\^[3].  The system was later ported to the VAX using the \fIPortable
                    136: C Compiler\fP\^[4] and no serious problems were encountered.
                    137: .PP
                    138: In addition to requiring support for ``full'' C,\^ the system places a few
                    139: specific requirements and non-requirements on the C compiler:
                    140: .Ls
                    141: .Np
                    142: The compiler must support both assignment and call-by-value for
                    143: structures.
                    144: .Np
                    145: The compiler need not support bit field operations.
                    146: .Np
                    147: Arguments to C functions must be stored in consecutive,\^
                    148: ascending memory locations.
                    149: .Np
                    150: There may be problems if \*Msizeof(int)\fP and \*Msizeof(char *)\fP
                    151: are not the same,\^ but no definite problems are known.
                    152: .Np
                    153: It is believed that there are great,\^ perhaps insurmountable
                    154: problems,\^ if \*Msizeof(char *)\fP is not equal to \*Msizeof(int *)\fP.
                    155: .Le
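.PP
Several of the requirements listed above can be checked mechanically before any Icon source is compiled.
The following is a minimal sketch, not part of the Icon distribution; the structure \*Mpair\fP and all function names are hypothetical.
It exercises structure assignment and call-by-value and compares the sizes mentioned in the list.
It does not test the layout of function arguments in memory, which cannot be checked portably from C.

```c
#include <assert.h>

/* Hypothetical two-member structure, used only to exercise the compiler. */
struct pair {
    int first;
    int second;
};

/* Receives a structure by value; changes made here must not reach the caller. */
static int sum_and_clobber(struct pair p)
{
    int sum = p.first + p.second;
    p.first = 0;
    p.second = 0;
    return sum;
}

/* Returns 1 if structure assignment and call-by-value behave as required. */
int struct_support_ok(void)
{
    struct pair a = { 3, 4 };
    struct pair b;

    b = a;                                  /* structure assignment */
    if (b.first != 3 || b.second != 4)
        return 0;
    if (sum_and_clobber(b) != 7)            /* structure passed by value */
        return 0;
    return b.first == 3 && b.second == 4;   /* caller's copy is untouched */
}

/* Returns 1 if char pointers and int pointers have the same size. */
int pointer_sizes_agree(void)
{
    return sizeof(char *) == sizeof(int *);
}

/* Returns 1 if an int and a char pointer have the same size. */
int int_matches_pointer(void)
{
    return sizeof(int) == sizeof(char *);
}

/* Aborts if one of the firm requirements is not met. */
void check_compiler(void)
{
    assert(struct_support_ok());
    assert(pointer_sizes_agree());
}
```

A result of 0 from \*Mint_matches_pointer\fP is only a warning, since the text reports no definite problems in that case.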
                    156: .SH
                    157: System Testing
                    158: .PP
                    159: The test programs and testing procedures to be used for porting Icon
                    160: are described in \^[5].
                    161: At various points in this document,\^ the porter is directed
                    162: to test the system component just completed.  At such times,\^ the
                    163: porter should refer to \^[5] to determine what should be done.
                    164: .NH 1
                    165: Porting the Icon Translator
                    166: .NH 2
                    167: Overview
                    168: .PP
                    169: The Icon translator,\^ known as \*Mitran\fR,\^ is the first
                    170: logical component of the Icon system.  The translator takes Icon
                    171: source files as input and produces two \fIucode\fR output files
                    172: for each input file.
                    173: The translator may be run by saying:
                    174: .Ds
                    175: itran hello.icn
                    176: .De
                    177: This produces two ASCII files,\^
                    178: \*Mhello.u1\fR and \*Mhello.u2\fR.  \*Mhello.u1\fR
                    179: contains interpretable instructions and data in a printable
                    180: format.  \*Mhello.u2\fR contains information about global symbols and
                    181: scope.
                    182: .PP
                    183: The translator is written entirely in C and is the most machine
                    184: independent major system component.  No serious
                    185: problems should be encountered in porting it.  If difficulties are
                    186: encountered,\^ they probably indicate that there are major problems
                    187: with the C compiler being used.
                    188: .NH 2
                    189: Porting Procedure
                    190: .PP
                    191: The Icon system contains a number of instances of values that must be
                    192: specified on a per-machine basis.  The system also contains assembly
                    193: code and,\^ of course,\^ such code is different on each machine.  Rather
                    194: than maintaining a source copy of Icon for each machine that Icon runs
                    195: on,\^ C preprocessor control statements are used to select portions of
                    196: code specific to a certain machine.  The source as distributed can
                    197: be compiled on either a VAX or a PDP-11* system by defining \*MVAX\fP or
                    198: .FS
                    199: *PDP is a trademark of Digital Equipment Corporation.
                    200: .FE
                    201: \*MPDP11\fP respectively in \*Mh/config.h\fP.  The porting source has
                    202: neither \*MVAX\fP nor \*MPDP11\fP defined; rather,\^ \*MPORT\fP
                    203: is defined.  Where machine specific code is to appear,\^ along with
                    204: sections bracketed by \*M#ifdef\fPs for \*MVAX\fP and \*MPDP11\fP,\^
                    205: there is a skeletal section bracketed by an \*M#ifdef\fP for
                    206: \*MPORT\fP.  The \*MPORT\fP section is to be filled out for the
                    207: target machine.  This convention is followed throughout and porting
                    208: Icon is nothing more than filling in all the \*MPORT\fP sections.
                    209: .PP 
                    210: The source for the translator is contained in the directory \*Mtran\fP.
                    211: Translator machine dependencies are confined to the file \*Mtran/sym.h\fR.
                    212: A pair of constants define the sizes of two data structures used during the
                    213: translation process.
                    214: Edit the file \*Msym.h\fR and search for the string \*MPORT\fR.
                    215: The code looks something like
                    216: .Ds
                    217: .ta 2.0iR 2.5i
                    218: #ifdef PORT
                    219: #define TSIZE  x       /* default size of parse tree space */
                    220: #define SSIZE  x       /* default size of string space */
                    221: #endif PORT
                    222: #ifdef VAX
                    223: #define TSIZE  15000   /* size of parse tree space */
                    224: #define SSIZE  15000   /* default size of string space */
                    225: #endif VAX
                    226: #ifdef PDP11
                    227: #define TSIZE  5000    /* default size of parse tree space */
                    228: #define SSIZE  5000    /* default size of string space */
                    229: #endif PDP11
                    230: .De
                    231: The values of \*MTSIZE\fP and \*MSSIZE\fP are not critical
                    232: and current values have been chosen rather arbitrarily.
                    233: If you are on a large
                    234: machine,\^ use the values of \*MTSIZE\fR and \*MSSIZE\fR specified for
                    235: the VAX; otherwise,\^ use the values specified for the PDP-11.
                    236: .PP
                    237: The translator may now be compiled by issuing the \fImake\fR command
                    238: without any arguments.
                    239: .PP
                    240: Although Icon programs are used to create
                    241: some of the translator source files (namely \*Mkeyword.h\fP,\^ \*Mkeyword.c\fP,\^
                    242: \*Moptab.c\fP,\^ and \*Mtoktab.c\fP),\^ these files are machine independent
                    243: and do not need to be remade.  If for some reason \*Mmake\fP tries to
                    244: create any of these files,\^ just \*Mtouch\fP the file in question to
                    245: update the last-modified date.  Similarly,\^ \*Mparse.c\fP is generated by
                    246: \fIyacc\fP and does not need to be regenerated unless the grammar is
                    247: modified.
                    248: .PP
                    249: When the translator has been successfully compiled using \fImake\fP,\^
                    250: refer to [5] for testing.
                    251: .PP
                    252: Porting the translator may seem like a trivial task,\^
                    253: but its successful completion is a definite milestone because
                    254: it is a good sign that the C compiler in use is suitable.
                    255: .nr $1 1
                    256: .nr $2 0
                    257: .NH 1
                    258: Porting the Icon Linker
                    259: .NH 2
                    260: Overview
                    261: .PP
                    262: The Icon linker,\^ known as \*Milink\fP,\^ is the second logical component
                    263: of the Icon system.  The linker takes \*Mu1\fP and \*Mu2\fP files
                    264: produced by the translator and binds them together to form an
                    265: \fIinterpretable\fP file.  The interpretable file serves as input
                    266: for the Icon interpreter.
                    267: The linker is written entirely in C and is a fairly small and
                    268: simple program.  However,\^ the interpretable files produced by
                    269: the linker are not machine independent and because of this,\^
                    270: porting the linker is more troublesome than porting the
                    271: translator.
                    272: .PP
                    273: Interpretable files contain two distinct types of data: opcodes and
                    274: associated operands that the interpreter ``understands''; and data that
                    275: is directly mapped into run-time data structures.  By ``mapping'' it
                    276: is meant that the data is loaded into memory and then C structure
                    277: references are used to access elements of the object at a certain
                    278: location in memory.
                    279: The formats of
                    280: the opcodes and operands must conform to what the interpreter is
                    281: expecting.  The data that is directly mapped must conform to the
                    282: format of the C data structures used by the run-time system.
                    283: .PP
                    284: On the VAX,\^ for example,\^ interpreter opcodes are one byte long
                    285: and operands are four bytes long.  On the PDP-11,\^ opcodes are
                    286: also one byte long,\^ but operands are only two bytes long.
                    287: Opcode and operand size are fairly arbitrary,\^ but it is important
                    288: that the linker and the interpreter be coordinated.  
                    289: .PP
                    290: The mapped data structures are slightly more complicated because
                    291: the linker must conform to the format produced by the C compiler.
                    292: This is not difficult,\^ since the data structures involved have
                    293: a regular form.  All are composed of some number of \fIwords\fP
                    294: where each word is the same size in every structure.*
                    295: .FS
                    296: *Literature about the VAX conventionally uses the term \fIword\fP to refer
                    297: to 16-bit quantities and the term \fIlongword\fP to refer to 32-bit
                    298: quantities.  In this document,\^ \fIword\fP in a generic context
                    299: refers to the basic unit of the run-time data structures; \fIword\fP
                    300: in a VAX-specific context refers to a 32-bit quantity.
                    301: .FE
                    302: .PP
                    303: The opcodes,\^ operands,\^ and mapped data are accumulated in memory during the
                    304: linking process.  This conglomerate is referred to as the \fIcode\fP
                    305: section.  Several routines are used to add data to the code
                    306: section.  These routines are parameterized so that porting the linker
                    307: to a new machine is merely a matter of setting the parameters
                    308: correctly.  Four primitive data units
                    309: compose the code section.  These are \fIopcodes\fP,\^ \fIoperands\fP,\^
                    310: \fIwords\fP,\^ and \fIblocks\fP.
                    311: .IP opcodes
                    312: .br
                    313: are instructions for the
                    314: interpreter.  An opcode may direct the interpreter to push a value
                    315: on the stack,\^ branch to a location,\^ perform an arithmetic operation,\^
                    316: etc.
                    317: .IP operands
                    318: .br
                    319: are associated with some opcodes.  For
                    320: example,\^ the \*Mgoto\fP instruction has a location to branch to as
                    321: its single operand.
                    322: .IP words
                    323: .br
                    324: compose mapped data structures.  A word is the basic
                    325: unit of the run-time data structures and should consist
                    326: of \*Msizeof(int *)\fP bytes.
                    327: .IP blocks
                    328: .br
                    329: are merely some number of bytes.  For example,\^ a \*Mcset\fP constant
                    330: is loaded into the code section as a block of 32 8-bit bytes (256 bits).
                    331: .PP
                    332: Routines in \*Mlink/lcode.c\fP are used to add a unit
                    333: of data of one of the preceding types to the code section.  These
                    334: routines are \*Moutop\fP,\^ \*Moutopnd\fP,\^ \*Moutword\fP,\^ and
                    335: \*Moutblock\fP.  Each routine adds the appropriate data into
                    336: the code section at the current location (maintained as a pointer),\^
                    337: and then the location pointer is advanced to the next free location.
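.PP
The behavior described above (store a unit at the current location, then advance the location pointer) can be sketched as follows.
This is an illustration only and is not the code in \*Mlink/lcode.c\fP: the names \*Memit_op\fP, \*Memit_opnd\fP, \*Memit_word\fP, \*Memit_block\fP, \*Mcodebuf\fP, and \*MCODEMAX\fP are hypothetical, the sizes chosen are assumptions, and data is stored in the host's native byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define OPSIZE    1                       /* bytes per opcode */
#define OPNDSIZE  ((int)sizeof(int))      /* bytes per operand (assumed) */
#define WORDSIZE  ((int)sizeof(int *))    /* bytes per word */
#define CODEMAX   256                     /* hypothetical buffer capacity */

static unsigned char codebuf[CODEMAX];    /* the code section */
static unsigned char *codep = codebuf;    /* current location pointer */

/* Abort if fewer than n bytes remain in the code section. */
static void room_for(int n)
{
    assert(n >= 0 && n <= CODEMAX - (int)(codep - codebuf));
}

/* Add a one-byte opcode and advance the location pointer. */
void emit_op(int op)
{
    room_for(OPSIZE);
    *codep++ = (unsigned char)op;
}

/* Add an OPNDSIZE-byte operand in native byte order. */
void emit_opnd(int opnd)
{
    room_for(OPNDSIZE);
    memcpy(codep, &opnd, sizeof opnd);
    codep += OPNDSIZE;
}

/* Add one word of a mapped data structure in native byte order. */
void emit_word(uintptr_t w)
{
    assert(sizeof w == sizeof(int *));    /* assumption about the host */
    room_for(WORDSIZE);
    memcpy(codep, &w, sizeof w);
    codep += WORDSIZE;
}

/* Add n raw bytes, such as the 32 bytes of a cset constant. */
void emit_block(const unsigned char *src, int n)
{
    room_for(n);
    memcpy(codep, src, (size_t)n);
    codep += n;
}

/* Number of bytes placed in the code section so far. */
long code_used(void)
{
    return (long)(codep - codebuf);
}
```

Using \*Mmemcpy\fP rather than a pointer cast avoids any alignment requirement on the current location, at the cost of a byte-wise copy.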
                    338: .NH 2
                    339: Porting Procedure
                    340: .PP
                    341: Edit \*Milink.h\fP and search for the string \*MPORT\fP.  Define
                    342: the following constants as described.
                    343: .IP \*MINTSIZE\fP
                    344: .br
                    345: The number of bits in an \*Mint\fP.
                    346: .IP \*MLOGINTSIZE\fP
                    347: .br
                    348: The base 2 log of \*MINTSIZE\fP.
                    349: That is,\^ \*MLOGINTSIZE\fP answers
                    350: the question ``\fIWhat power of 2 is \*MINTSIZE\fR\^?''.
                    351: .IP \*MLONGS\fP
                    352: .br
                    353: Icon has an integer data type.  On the VAX and the PDP-11 the range
                    354: of integer values is \-2\u\s-231\s0\d to 2\u\s-231\s0\d-1.  On the VAX,\^
                    355: C \*Mint\fPs and \*Mlong\fPs are both 32 bits wide.  On the PDP-11,\^
                    356: C \*Mint\fPs are 16 bits wide while \*Mlong\fPs are 32 bits wide.
                    357: The PDP-11 Icon system makes an internal distinction between integers
                    358: that ``fit'' in 16 bits and integers that require 32 bits.
                    359: The former are stored in two-word descriptors (the actual value being
                    360: in the second of the two 16-bit words),\^ while the latter have a
                    361: value descriptor that points to a block in the heap that holds the
                    362: two-word,\^ 32-bit value.  On the other hand,\^ the VAX uses two 32-bit words
                    363: for descriptors and thus the second word of a descriptor can hold
                    364: the largest possible integer value used by Icon.
                    365: Rather than having an internal distinction between integer
                    366: types on the VAX,\^ integers are always represented by two-word
                    367: integer descriptors.  There are places in the code where special
                    368: provisions must be made if C \*Mint\fPs are not the same size
                    369: as C \*Mlong\fPs.
                    370: .sp
                    371: If \*Msizeof(int) != sizeof(long)\fP for the C compiler in use,\^
                    372: define \*MLONGS\fP.  (\*MLONGS\fP need not be given a value;
                    373: \*M#define LONGS\fP is sufficient.)
                    374: If \*MLONGS\fP must be defined,\^ the minimum
                    375: and maximum values that can be represented by an \*Mint\fP must also
                    376: be defined.  Define \*MMINSHORT\fP to be the smallest value that an
                    377: \*Mint\fP can hold and define \*MMAXSHORT\fP to be the largest value that
                    378: an \*Mint\fP can hold.
                    379: .IP \*MMAXCODE\fP
                    380: .br
                    381: This is the maximum size in bytes of the code that can be generated for each
                    382: procedure.  This value is not critical; 10,\^000 is used for the VAX,\^
                    383: while 2000 is used for the PDP-11.
                    384: .IP \*Mstrchr\fP\ and\ \*Mstrrchr\fP
                    385: .br
                    386: If you are on a USG UNIX system,\^ \*M#define\fP \*Mindex\fP to be
                    387: \*Mstrchr\fP and \*Mrindex\fP to be \*Mstrrchr\fP.
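.PP
Candidate values for several of the constants above can be computed on the target machine rather than worked out by hand.
The sketch below assumes a compiler that supplies \*M<limits.h>\fP, which may not be true of older compilers; the function names are hypothetical.
Where \*MLONGS\fP is needed, \*MINT_MIN\fP and \*MINT_MAX\fP from the same header are natural candidates for \*MMINSHORT\fP and \*MMAXSHORT\fP.

```c
#include <assert.h>
#include <limits.h>

/* Number of bits in an int: a candidate value for INTSIZE. */
int int_bits(void)
{
    return (int)(sizeof(int) * CHAR_BIT);
}

/* Base 2 logarithm of n, which must be a power of two:
   applied to INTSIZE, this gives a candidate for LOGINTSIZE. */
int log2_exact(int n)
{
    int log = 0;

    assert(n > 0 && (n & (n - 1)) == 0);
    while (n > 1) {
        n >>= 1;
        log++;
    }
    return log;
}

/* Returns 1 if LONGS should be defined for this compiler. */
int needs_longs(void)
{
    return sizeof(int) != sizeof(long);
}

/* Smallest and largest int values: candidates for MINSHORT and MAXSHORT. */
long min_short(void) { return (long)INT_MIN; }
long max_short(void) { return (long)INT_MAX; }
```

For a 32-bit \*Mint\fP, \*Mlog2_exact(32)\fP yields 5; for a 16-bit \*Mint\fP, \*Mlog2_exact(16)\fP yields 4.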
                    388: .PP
                    389: Edit \*Mdatatype.h\fP and
                    390: search for the \*MPORT\fP section.  This section contains
                    391: \*M#define\fPs that are used to set and test flags contained in the
                    392: first word of descriptors.  The basic idea in forming these
                    393: constants is to set some bits at the high end of the word,\^ and set some
                    394: other bits at the low end.  The number of unused bits in the middle
                    395: depends on the size of a word.
                    396: .PP
                    397: \*MF_NQUAL\fP,\^ \*MF_VAR\fP,\^ \*MF_TVAR\fP,\^ \*MF_PTR\fP,\^ \*MF_NUM\fP,\^
                    398: \*MF_INT\fP,\^ and \*MF_AGGR\fP should be set to mask values with one
                    399: bit set to 1 in each.  For \*MF_NQUAL\fP,\^ the leftmost bit should be set;
                    400: for \*MF_VAR\fP,\^ the next to leftmost bit should be set,\^ and so
                    401: forth.  The values for the VAX and PDP-11 should be suitable for
                    402: machines with 32-bit and 16-bit words,\^ respectively.
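.PP
For a word size other than 16 or 32 bits, the masks can be derived from the bit position alone.
The helper below is a hypothetical sketch, not code from \*Mdatatype.h\fP.
It assumes that the flags take successive bit positions from the left in the order in which they are listed above, which the text states explicitly only for \*MF_NQUAL\fP and \*MF_VAR\fP.

```c
#include <assert.h>
#include <limits.h>

/* Bit positions counted from the left; the order after POS_VAR is assumed. */
enum flag_pos {
    POS_NQUAL, POS_VAR, POS_TVAR, POS_PTR, POS_NUM, POS_INT, POS_AGGR
};

/* Mask with only the bit `pos` places from the left set, for a word of
   `wordbits` bits; position 0 is the leftmost (high-order) bit. */
unsigned long flag_mask(int wordbits, int pos)
{
    assert(wordbits > 0 &&
           wordbits <= (int)(sizeof(unsigned long) * CHAR_BIT));
    assert(pos >= 0 && pos < wordbits);
    return 1UL << (wordbits - 1 - pos);
}
```

With a 32-bit word, \*Mflag_mask(32, POS_NQUAL)\fP is 0x80000000 and \*Mflag_mask(32, POS_VAR)\fP is 0x40000000; with a 16-bit word, \*Mflag_mask(16, POS_NQUAL)\fP is 0x8000.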
                    403: .PP
                    404: The constants
                    405: \*MOPSIZE\fP,\^ \*MOPNDSIZE\fP,\^ and \*MWORDSIZE\fP control the
                    406: sizes of opcodes,\^ operands,\^ and words in the code section.
                    407: Before setting these constants to values appropriate for the
                    408: target machine,\^ a ``standard'' linker should be built and tested using
                    409: the supplied values (under \*MPORT\fP) for these constants.
                    410: This allows the linker to be checked against output files that
                    411: are known to be correct.  The purpose of this is to attempt to
                    412: discover C compiler problems.  Compile the linker using \*Mmake\fP and
                    413: refer to
                    414: [5] for the testing procedure.
                    415: .PP
                    416: Once the ``standard'' linker has been checked out,\^ the following
                    417: ``sizing'' parameters in \*Milink.h\fP should be set to values
                    418: appropriate for the target machine.
                    419: .IP \*MOPSIZE\fP
                    420: .br
                    421: This is the size in bytes of interpreter opcodes.  The interpreter treats
                    422: opcodes as unsigned quantities.  One byte (8 bits) is currently
                    423: large enough to accommodate all opcodes and a value of 1 is recommended
                    424: for \*MOPSIZE\fP.  The \*Moutop\fP routine in \*Mlcode.c\fP assumes that
                    425: opcodes are one byte.  If a larger size is desired,\^ \*Moutop\fP
                    426: will have to be recoded.  It might be wise to use a value other than
                    427: 1 for \*MOPSIZE\fP on machines that are not byte-addressable and have
                    428: ample memory.
                    429: .IP \*MOPNDSIZE\fP
                    430: .br
                    431: This is the size in bytes of operands for interpreter instructions.
                    432: For some instructions,\^ the operand value represents an offset
                    433: from the interpreter program counter and thus,\^ the maximum possible
                    434: offset is limited by the magnitude of values that can be represented
                    435: in \*MOPNDSIZE\fP bytes.  Because larger operands occupy more code
                    436: space and smaller operands limit addressing ``distance'',\^ a trade-off
                    437: is involved.  On the VAX,\^ operands are four bytes because memory
                    438: space is not very critical.  On the PDP-11,\^ operands are two bytes
                    439: because of the limited memory.  While it is easy to change the
                    440: value of \*MOPNDSIZE\fP in the linker,\^ the operand size is pervasive
                    441: in the interpreter.  If the target machine has a large,\^ perhaps
                    442: virtual address space,\^ use a value such as 4 for \*MOPNDSIZE\fP.  A
                    443: value such as 2 may be appropriate for a smaller machine.  A value
                    444: of 1 is not advisable under any circumstances.  The suggested value
                    445: for \*MOPNDSIZE\fP is \*Msizeof(int)\fP.
                    446: .IP \*MWORDSIZE\fP
                    447: .br
                    448: This should be set to \*Msizeof(int *)\fP on the target machine.  The
                    449: various run-time data structures are all composed of a number
                    450: of words,\^ each of which contains \*MWORDSIZE\fP bytes.  For example,\^
                    451: the data blocks for user-defined procedures are built in the code
                    452: section by a sequence of calls to \*Moutword\fP.
                    453: .PP
                    454: The \*Mbackpatch\fP routine in \*Mlcode.c\fP needs some
                    455: machine-specific modifications.  This routine backpatches forward
                    456: references to ucode labels.  In the \fIwhile\fP loop,\^ the operand (which is
                    457: \*MOPNDSIZE\fP bytes long) that is pointed at by \*Mq\fP is loaded
                    458: into the variable \*Mp\fP.  Then,\^ the operand is replaced by the
                    459: value of \*Mr\fP.  On the VAX,\^ this can be expressed as:
                    460: .Ds
                    461: p = *q;
                    462: *q = r;
                    463: .De
                    464: where \*Mq\fP is an \*Mint *\fP.  This is possible because the VAX allows
                    465: word references on an arbitrary boundary.  On the PDP-11,\^ such
                    466: references are illegal and the assignments must be made on a byte-wise
                    467: basis.  If the target machine allows word accesses on arbitrary
                    468: boundaries,\^ the VAX code may be used (assuming \*MOPNDSIZE\fP is equal to
                    469: \*Msizeof(int)\fP).
                    470: If not,\^ but operands are the same size as \*Mint\fPs,\^
                    471: the PDP-11 code may be used.
                    472: Other situations may require ingenuity.  Be sure to alter the first \*MPORT\fP
                    473: section in \*Mbackpatch\fP to contain an appropriate declaration for
                    474: \*Mq\fP (that section currently contains a declaration for \*Mq\fP and
                    475: a \*Mreturn\fP).
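.PP
For a machine that does not allow word references on arbitrary boundaries, the load and store in the loop can be done with byte copies.
The sketch below is not the \*Mbackpatch\fP routine itself.
It assumes that operands are the size of an \*Mint\fP, and it invents a chain representation for illustration: each unresolved operand holds the byte offset of the next one, and an offset of 0 ends the chain.
The names \*Mload_opnd\fP, \*Mstore_opnd\fP, and \*Mpatch_chain\fP are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define OPNDSIZE ((int)sizeof(int))   /* assumed: operands are int-sized */

/* Load the operand at q; q need not be aligned. */
int load_opnd(const unsigned char *q)
{
    int v;

    memcpy(&v, q, (size_t)OPNDSIZE);
    return v;
}

/* Store v as the operand at q; q need not be aligned. */
void store_opnd(unsigned char *q, int v)
{
    memcpy(q, &v, (size_t)OPNDSIZE);
}

/* Replace every operand on the chain starting at offset `head` with r.
   Returns the number of operands patched. */
int patch_chain(unsigned char *code, int head, int r)
{
    int count = 0;
    int p = head;

    while (p != 0) {
        unsigned char *q = code + p;

        p = load_opnd(q);     /* p = *q; */
        store_opnd(q, r);     /* *q = r; */
        count++;
    }
    return count;
}
```

The two statements in the loop body correspond to the VAX assignments shown above; only the means of access differs.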
                    476: .PP
                    477: When the linker has been compiled,\^
                    478: refer to [5] for directions on
                    479: testing.
                    480: .nr $1 2
                    481: .nr $2 0
                    482: .NH 1
                    483: Porting the Icon Interpreter
                    484: .NH 2
                    485: Introduction
                    486: .PP
                    487: The Icon interpreter,\^ known as \fIiconx\fP,\^ is the third major logical
                    488: component of the system.  The interpreter takes 
                    489: interpretable files produced by the linker and ``executes'' them.
                    490: The interpreter is run by:
                    491: .Ds
                    492: iconx hello
                    493: .De
                    494: where \*Mhello\fP has been produced by the linker.
                    495: .PP
                    496: Due to the stack manipulations that the interpreter performs,\^ it is
                    497: necessary for a small portion of the interpreter to be written in
                    498: assembly language rather than in C.  On the VAX,\^ about 550 lines of
                    499: assembly instructions are required.  The coding of these assembly
                    500: instructions is the most difficult part of the
                    501: port.
                    502: .NH 2
                    503: Source File Layout
                    504: .LP
                    505: The interpreter is divided into four parts:
                    506: .DS
                    507: .ft R
                    508: start-up code
                    509: the main loop
                    510: primary subroutines
                    511: support subroutines
                    512: .DE
                    513: .LP
                    514: The start-up code initializes the interpreter and passes control
                    515: to the main loop.  The main loop,\^ referred to as \*Minterp\fP,\^
                    516: fetches interpreter instructions
                    517: and executes them.  An interpreter instruction may be entirely
                    518: performed by \*Minterp\fP or \*Minterp\fP may call a \fIprimary
                    519: subroutine\fP to perform the operation.  In turn,\^ a primary subroutine
                    520: may call a number of \fIsupport subroutines\fP.
                    521: Each primary subroutine has a direct correspondence to a source language
                    522: operation of some type or to a stack manipulation.
                    523: .PP
                    524: While the translator and linker source files are in their own
                    525: directories,\^ the interpreter source files are segregated into several
                    526: directories.
                    527: .nr a  \w'\*Moperators\fR'+1m
                    528: .IP \*Miconx\fP (\na)u
                    529: The start-up code and the main interpreter loop reside in this
                    530: directory.  Files of particular interest are: \*Mstart.s\fP,\^ which
                    531: is entered when the interpreter is run and does some low-level
                    532: initialization; \*Minit.c\fP,\^ which is called from \*Mstart.s\fP and
                    533: completes initialization of the interpreter; and \*Minterp.s\fP,\^ which
                    534: is the interpreter loop itself.
                    535: .IP \*Mfunctions\fP (\na)u
                    536: This directory contains code for the built-in procedures.
                    537: For example,\^ \*Mwrite.c\fP contains
                    538: the source for the \*Mwrite\fP function.  The source for each
                    539: built-in procedure appears in a file of its own.
                    540: .IP \*Moperators\fP (\na)u
                    541: This directory contains code for the Icon operators.  The routines in
                    542: this directory
                    543: implement the various Icon source level operators.  For
                    544: example,\^ \*Mplus.c\fP is called to perform the \*M+\fP (addition)
                    545: operation,\^ and \*Mbang.c\fP is called to perform the \*M!\fP (element
                    546: generation) operation.  As with the built-in procedures,\^ there is
                    547: one operator per file.
                    548: .IP \*Mlib\fP (\na)u
                    549: This directory contains routines that
                    550: do not fit anywhere else.  First of all,\^ there is
                    551: code for routines that perform actions similar in nature to those in
                    552: \*Mfunctions\fP and \*Moperators\fP,\^ but that do not have a functional or
                    553: operator syntax.  For example,\^ \*Mllist.c\fP creates a list
                    554: that is specified syntactically as
                    555: \*M\^[\*(e0,\^\*(e1,\^\*(El,\^\*(en]\fR,\^ and \*Mfield.c\fP handles record element
                    556: accesses that arise from \*M\*(e1.\*(e2\fR.
                    557: .sp .8
                    558: \*Mlib\fP also contains routines such as \*Mesusp.s\fP and \*Mefail.s\fP
                    559: that handle stack manipulations during expression evaluation.  The
                    560: routines \*Mpret.s\fP and \*Mpfail.s\fP handle procedure return and
                    561: failure,\^ respectively.
                    562: .sp .8
                    563: The directories \*Mfunctions\fP,\^ \*Moperators\fP,\^ and \*Mlib\fP
                    564: compose the primary subroutines mentioned above.
                    565: .IP \*Mrt\fP (\na)u
                    566: The support subroutines are contained in the \*Mrt\fP directory.
                    567: The primary subroutines are autonomous with respect to each other and
                    568: use the \*Mrt\fP routines for common operations.  For example,\^
                    569: \*Mcvstr.c\fP is used to convert a value to a string,\^ \*Mtrace.c\fP
                    570: produces various types of tracing messages,\^ and \*Mgc.c\fP is the
                    571: garbage collector.
                    572: .IP \*Mh\fP (\na)u
                    573: This directory contains a number of header files that are
                    574: \*M#include\fPd in the other files that compose the interpreter.
                    575: Of particular interest is \*Mrt.h\fP,\^ which defines a number of
                    576: constants and data structures.
                    577: .NH 2
                    578: Overview of the Porting Process
                    579: .PP
                    580: Follow these steps when porting the interpreter.
                    581: .Ls
                    582: .Np
                    583: Determination of layout of procedure,\^ generator,\^ and expression
                    584: markers and associated frame pointers.
                    585: .Np
                    586: Setting of implementation specific constants in \*Mh/rt.h\fP and
                    587: creation of \*Mh/defs.s\fP from \*Mrt.h\fP.
                    588: .Np
                    589: Complete system compilation.
                    590: .Np
                    591: Coding of a ``basis'' of routines for the interpreter,\^ consisting of
                    592: \*Miconx/start.s\fP,\^ \*Mrt/setbound.s\fP,\^ \*Mlib/invoke.s\fP,\^
                    593: \*Miconx/interp.s\fP,\^ \*Mlib/efail.s\fP,\^
                    594: \*Mlib/pfail.s\fP.
                    595: .Np
                    596: Testing of the basis routines for the interpreter.
                    597: .Np
                    598: Coding and testing of
                    599: .Ds
                    600: rt/arith.s
                    601: rt/fail.s
                    602: lib/pret.s
                    603: lib/esusp.s
                    604: lib/lsusp.s
                    605: lib/psusp.s
                    606: rt/suspend.s
                    607: functions/display.c
                    608: .De
                    609: in an incremental fashion.  Test programs are provided
                    610: to test the system after adding each routine.
                    611: .Np
                    612: Coding of \*Mrt/gcollect.s\fP and \*Mrt/sweep.c\fP.
                    613: Testing of garbage collection.
                    614: .Np
                    615: Complete system testing.
                    616: .Le
                    617: .PP
                    618: This document does not explain how to port the sections of the system
                    619: that are related to co-expressions.  The involved files are
                    620: \*Mlib/coact.s\fP,\^ \*Mlib/cofail.s\fP,\^ \*Mlib/coret.s\fP,\^
                    621: \*Mlib/create.c\fP,\^ and \*Moperators/refresh.c\fP.  Icon
                    622: works properly with these sections of code left unimplemented,\^
                    623: provided no attempt is made to use co-expressions.  Any such attempt
                    624: is reported as a fatal error.
                    625: .NH 2
                    626: Porting Procedure
                    627: .SH
                    628: Determination of Frame Layouts
                    629: .PP
                    630: Unfortunately,\^ one of the most far-reaching decisions that must be
                    631: made during the porting process is also one of the first decisions
                    632: that must be made.  The decision (actually,\^ a number of decisions) is
                    633: how to lay out the procedure,\^ generator,\^ and
                    634: expression frames and what registers should be used as frame
                    635: pointers.  The various frames and their usages are explained in
                    636: detail in \^[1] and the portions of this document that describe
                    637: routines that manipulate a particular frame also provide further
                    638: explanations.  The porter should have a good
                    639: understanding of what the frames are used for before setting frame
                    640: layouts as they are pervasive throughout the assembly language
                    641: portions of the system.
                    642: .PP
                    643: This document is rather tightly bound to the VAX implementation of
                    644: Icon.  Because of this,\^ the stack model that is used is that
                    645: of the VAX.  Specifically,\^ the VAX stack starts in high memory and
                    646: grows downward.  Thus,\^ when something is pushed on the stack,\^ the
                    647: stack pointer goes down.  When something is removed,\^ the stack pointer
                    648: goes up.
                    649: The only time that this convention is departed from is in the use of
                    650: the phrase ``the top of the stack''.  The top of the stack is the
                    651: stack word that has the \fIlowest\fP memory address.
                    652: .PP
                    653: The procedure frame layout is the first to be determined.  The layout
                    654: is somewhat fixed by the C compiler and target machine,\^
                    655: so the task is a combination
                    656: of making a decision and also recognizing what has been pre-determined.
                    657: On most machines,\^ the
                    658: task of the porter is more one of recognition than of design.
                    659: .PP
                    660: The first thing to determine is the
                    661: frame layout imposed by the target machine and the C compiler.
                    662: Create a file containing the following
                    663: .Ds
                    664: f()
                    665: {
                    666:    x(1,\^2);
                    667: }
                    668: .De
                    669: Compile the file using \fIcc\fP
                    670: in such a manner as to catch the assembly code that is
                    671: generated in a file.  The \*M\-S\fP option of \fIcc\fP should cause
                    672: assembly code to be placed in a file.  On the VAX,\^ the code
                    673: generated by \*Mx(1,\^2)\fP is
                    674: .Ds
                    675: .ta .7i
                    676: pushl  $2
                    677: pushl  $1
                    678: calls  $2,\^_x
                    679: .De
                    680: From this it can be seen that arguments are pushed on the stack
                    681: using the \*Mpushl\fP instruction,\^ and that the \*Mcalls\fP
                    682: instruction does the actual procedure call.  The first argument
                    683: to \*Mcalls\fP is the number of arguments that are on the stack.
                    684: When a return is made from a procedure called with a \*Mcalls\fP
                    685: instruction,\^ the arguments are removed from the stack by the return
                    686: mechanism.  On some machines,\^ the removal of arguments after a subroutine
                    687: call is left to the programmer (or code generator,\^ in this case).
                    688: This is usually done by adding a value to the stack pointer or
                    689: incrementing the stack pointer several times.
                    690: .PP
                    691: Examine the assembly code produced on the target machine by the
                    692: given C statements.  Determine what actions are taken by the machine
                    693: when the appropriate call instruction is performed.  It is important
                    694: to understand completely what the target machine does
                    695: when a call is performed.
                    696: Next,\^ determine what sort of procedure frame is used by C routines.
                    697: Compile the following C function using \*M\-S\fP.
                    698: .Ds
                    699: .ta .7i
                    700: f(a,\^b,\^c)
                    701: int a; char b; char *c;
                    702: {
                    703:    int x,\^y;
                    704: 
                    705:    x = a;
                    706:    a = 1;
                    707:    y = 2;
                    708: }
                    709: .De
                    710: Look at the generated
                    711: code and try to get a feel for what is going on.  The things that
                    712: need to be determined are:
                    713: .Ds
                    714: .ft R
                    715: how arguments are accessed
                    716: the format of the C call frame
                    717: register saving and restoring conventions
                    718: .De
                    719: For example,\^ on the VAX,\^ the following code is generated
                    720: for the test procedure.
                    721: .Ds
                    722: .Pt
                    723:        .word   L12             register save mask,\^ filled in later
                    724:        jbr     L14             jump to end to make stack space
                    725: L15:
                    726:        movl    4(ap),\^-4(fp)          x = a
                    727:        movl    $1,\^4(ap)              a = 1   
                    728:        movl    $2,\^-8(fp)             y = 2
                    729:        ret                     return
                    730:        .set    L12,\^0x0               set register mask
                    731: L14:
                    732:        subl2   $8,\^sp         make room for two local variables
                    733:                                   of four bytes each
                    734:        jbr     L15             jump to start of routine
                    735: .De
                    736: Several inferences can be made.  First of all,\^ arguments are accessed
                    737: relative to \*Map\fP,\^ the argument pointer.  Secondly,\^ local variables
                    738: are accessed relative to \*Mfp\fP,\^ the frame pointer.  On the VAX,\^
                    739: because of the hardware register save and restoration based on the
                    740: entry mask (the first word of the routine),\^ no subroutine calls are
                    741: required to save registers.
                    742: .PP
                    743: The Icon procedure frame must have the following attributes:
                    744: .Ls
                    745: .Np
                    746: The values on the stack at the time of call to the procedure appear
                    747: as arguments to the procedure.  Furthermore,\^ the values must be
                    748: accessible in a deterministic fashion.
                    749: .Np
                    750: Register values are saved in the frame and can be accessed deterministically.
                    751: .Np
                    752: \*M_line\fP and \*M_file\fP appear in the procedure frame just below
                    753: the last word pushed on the stack as part of the C procedure calling
                    754: protocol.
                    755: .Np
                    756: The region for local variables begins at the lower end of the
                    757: ``constant'' portion of the frame.  Local variables must be
                    758: accessible via deterministic means.
                    759: .Np
                    760: The procedure frame created by a C procedure call must be a subset
                    761: of the procedure frame selected.  That is,\^ the Icon procedure frame
                    762: must be an augmentation of the C procedure frame.
                    763: .Le
                    764: .LP
                    765: The VAX uses this procedure frame layout:
                    766: .Ds
                    767: .St
                    768: .ft R
                    769:                arguments
                    770:        4       number of arguments (\*Mnargs\fR)
                    771: \*Map\fR \*(ar 0       number of words in argument list (\*Mnwords\fR)
                    772:                saved \*Mr11\fR (\*Mefp\fR)
                    773:                saved \*Mr10\fR (\*Mgfp\fR)
                    774:                \*(El
                    775:                last saved register
                    776:        16      saved \*Mpc\fR
                    777:        12      saved \*Mfp\fR
                    778:        8       saved \*Map\fR
                    779:        4       program status word and register mask
                    780: \*Mfp\fR \*(ar 0       0 (condition handler address)
                    781:        -4      saved value of \*M_line\fR
                    782:        -8      saved value of \*M_file\fR
                    783: \*Msp\fR \*(ar         Icon local variables
                    784: .De
                    785: .PP
                    786: Actually,\^ on the VAX,\^ most of the decisions are predetermined by
                    787: the VAX architecture.  The arguments are present on the stack,\^ so they
                    788: are the high end of the frame.  The registers are saved on the stack
                    789: by the \*Mcalls\fP instruction.  The values of \*M_line\fP and \*M_file\fP
                    790: naturally fit after the saved registers.  The locals then appear on
                    791: the lower end and extend for a variable distance (on a per-procedure
                    792: basis).  Note that the first local is at \*M\-16(fp)\fP and the \fIlast\fP
                    793: argument is at \*M8(ap)\fP.
                    794: .PP
                    795: The VAX hardware takes care of saving and restoring registers upon
                    796: subroutine entry and exit.  It is quite possible that the target
                    797: machine will not have this capability and the task must be delegated
                    798: to software.  This is usually evidenced by a call to a routine with a
                    799: name such as \*Mcsave\fP as the very first thing in the routine and a
                    800: call to a routine with a name such as \*Mcrestore\fP at the end of a
                    801: routine.  If this is the case,\^ the actions of
                    802: the saving and restoring routines must be taken into account when determining
                    803: the procedure frame layout.
                    804: .PP
                    805: In addition to determining the procedure frame layout,\^ a procedure
                    806: frame pointer must also be selected.  On the VAX,\^
                    807: the \*Mfp\fP stays constant throughout execution of a C procedure;
                    808: it is used as the procedure frame pointer.  For the target machine,\^
                    809: there should be some register on which references to local variables
                    810: (and perhaps parameters) are based.  That register should be used
                    811: as the procedure frame pointer (sometimes referred to as the
                    812: \*Mpfp\fP).
                    813: The \*Mpfp\fP need not point at the lowest word pushed on the stack
                    814: as part of the procedure call; it only needs to be constant while
                    815: a procedure is executing.  Of course,\^ the \*Mpfp\fP changes while the
                    816: program is executing; by ``pointing at'' a particular word,\^ it is
                    817: meant that the \*Mpfp\fP always references a certain word in the
                    818: procedure frame marker.  An \*Mrt.h\fP constant,\^ \*MFRAMELIMIT\fP,\^ is
                    819: dependent on the number of words between the lowest word of the
                    820: procedure marker and the word that the \*Mpfp\fP points to.  Setting
                    821: \*MFRAMELIMIT\fP is described below.
                    822: .PP
                    823: A point about terminology should be stressed.  The procedure frame marker is
                    824: bounded by arguments on one end and the Icon local variables on the other.
                    825: A procedure marker,\^ the arguments,\^ the Icon
                    826: local variables,\^ and the stack below
                    827: the local variables compose a procedure frame.
                    828: .\"\^[Note in here about variable size of marker,\^ forced saving of efp,\^ gfp.,\^
                    829: .\"ap being needed]
                    830: .PP
                    831: Determining the procedure frame layout
                    832: is by no means a deterministic process.  It takes work,\^ but once
                    833: it's successfully set,\^ the single hardest task of the port is complete.
                    834: .PP
                    835: Once the procedure frame has been set,\^ the generator frame layout follows
                    836: rather easily.  A generator frame is merely an augmented procedure
                    837: frame.  The generator frame has two additional pieces of information,\^
                    838: a saved value of \*M_k_level\fP,\^ and a saved value for the boundary.
                    839: It is recommended that the generator frame
                    840: be identical to a procedure frame except that the two extra words
                    841: required be located between the lowest word that is pushed on the
                    842: stack by the procedure call mechanism and the saved value of \*M_line\fP.
                    843: Thus,\^ on the VAX,\^ the generator frame \fImarker\fP is
                    844: .Ds
                    845: .ft R
                    846: .St
                    847:                saved \*Mr11\fR
                    848:                saved \*Mr10\fR
                    849:                \*(El
                    850:                last saved register
                    851:        20      reactivation address
                    852:        16      saved \*Mfp\fR
                    853:        12      saved \*Map\fR
                    854:        8       program status word and register mask
                    855:        4       0 (condition handler address)
                    856: \*Mgfp\fR \*(ar        0       saved value of the boundary
                    857:        -4      saved value of \*M_k_level\fR
                    858:        -8      saved value of \*M_line\fR
                    859:        -12     saved value of \*M_file\fR
                    860: .De
                    861: Note that instead of a saved \*Mpc\fR value,\^ the generator frame marker
                    862: holds a reactivation address.  Control passes to this address when
                    863: the generator is reactivated.  Reactivation is fully explained in
                    864: later sections.
                    865: .PP
                    866: A generator frame pointer (\*Mgfp\fP) is associated with a generator
                    867: frame.  On the VAX,\^ \*Mr10\fP is the \*Mgfp\fP.  The choice of a \*Mgfp\fP
                    868: is indirectly determined by the machine architecture and is
                    869: intertwined with the selection of an expression frame pointer.
                    870: The selection of the register to use as the \*Mgfp\fP is discussed below.
                    871: It is recommended that the \*Mgfp\fP point
                    872: at the word containing the saved boundary value.
                    873: .PP
                    874: The third type of frame marker is the expression frame marker.
                    875: Expression frame markers are totally machine independent and contain
                    876: three pieces of information: a saved expression marker address,\^
                    877: a saved generator marker address,\^ and a failure label that is to
                    878: be given control in certain circumstances.  On the VAX,\^ the
                    879: expression marker layout is
                    880: .Ds
                    881: .ft R
                    882: .St
                    883: \*Mefp\fR \*(ar        0       saved \*Mefp\fR value
                    884:        -4      saved \*Mgfp\fR value
                    885:        -8      failure address
                    886: .De
                    887: This same format should be used on the target machine and there is no
                    888: apparent reason for needing an alternative format.  The expression
                    889: frame pointer (\*Mefp\fP) should point at the high word of the
                    890: expression marker.
                    891: .PP
                    892: The registers that should be used for the \*Mgfp\fP and \*Mefp\fP are
                    893: indirectly dependent on the procedure call mechanism.  The primary
                    894: requirement for the registers used as the \*Mefp\fP and \*Mgfp\fP is that
                    895: they are saved across procedure calls.  The secondary requirement is
                    896: that the \*Mgfp\fP and \*Mefp\fP always be saved in a procedure frame.
                    897: If the target machine has two general purpose registers that are
                    898: always saved in a procedure frame,\^ those two registers are quite
                    899: suitable for the \*Mgfp\fP and \*Mefp\fP.
                    900: .PP
                    901: If the procedure call mechanism does not always save a pair of general
                    902: purpose registers,\^ the problem is more complicated.
                    903: There are stack manipulations that are performed that
                    904: \fIrequire\fP saved values of \*Mefp\fP and \*Mgfp\fP to be present
                    905: in procedure and generator frames.
                    906: For built-in procedures and Icon procedures
                    907: this is no problem because \*Minvoke\fP creates the procedure frame
                    908: for them and can ensure that the registers are saved.  On the VAX,\^ for the C
                    909: routines that are directly called from \*Minterp\fP,\^ no such
                    910: assurances can be made because the VAX C compiler directs
                    911: only the registers used in a routine to be saved in the C procedure
                    912: frame.  This creates a problem because Icon counts on the registers
                    913: being saved.  The problem is countered by making the C compiler think
                    914: that certain registers are used in certain routines.  Specifically,\^
                    915: declarations for a pair of \*Mregister int\fP variables are placed at
                    916: the start of appropriate routines.  On the VAX,\^ the first two \*Mregister\fP
                    917: variables declared in a C routine \fIalways\fP get allocated to \*Mr10\fP and
                    918: \*Mr11\fP.  Thus,\^ \*Mr10\fP and \*Mr11\fP are used for the \*Mgfp\fP
                    919: and the \*Mefp\fP respectively.  If the target machine is like
                    920: the VAX in that it doesn't always save certain registers,\^ a similar
                    921: tactic may need to be used.  If this is the case,\^ try compiling a
                    922: routine with a pair of \*Mregister int\fP variables declared and see
                    923: what the compiler does.  If the compiler saves the two registers
                    924: assigned to the variables,\^ use those registers for the \*Mgfp\fP and
                    925: the \*Mefp\fP.  It is wise to verify that the compiler is
                    926: deterministic in its choice of registers to allocate to the
                    927: variables.  Routines that require this ruse to be employed have a line
                    928: containing the string \*MDclSave\fP as the first line of the
                    929: declarations.  \*MDclSave\fP is defined in \*Mrt.h\fP and should be
                    930: set to an appropriate value.  It may be the case that no registers
                    931: need to be saved.  If so,\^ define \*MDclSave\fP,\^ but specify no value.
                    932: This is done for the PDP-11.
                    933: .PP
                    934: It is also necessary to select a register to use as the interpreter
                    935: program counter (\*Mipc\fP).  Any general register that is preserved
                    936: across procedure calls is suitable.  The VAX uses \*Mr9\fP for
                    937: the \*Mipc\fP.
                    938: .NH 2
                    939: Machine and System Specific Values
                    940: .PP
                    941: Edit \*Mh/rt.h\fP and search for the first \*MPORT\fP section.
                    942: Define the various constant values as outlined below.
                    943: .IP \*MMAXHEAPSIZE\fP
                    944: .br
                    945: The size of the heap storage region in bytes.  The VAX uses 50k and
                    946: the PDP-11 uses 10k.  If you have a small machine,\^ use 10k.  Larger
                    947: machines should use larger values,\^ such as that for the VAX.
                    948: .IP \*MMAXSTRSPACE\fP
                    949: .br
                    950: The size of the string storage region in bytes.  As with
                    951: \*MMAXHEAPSIZE\fP,\^ this value is somewhat arbitrary.  A value similar
                    952: to that used for the heap size should be used.
                    953: .IP \*MSTACKSIZE\fP
                    954: .br
                    955: The size of co-expression stacks in words.  Use 1000 for smaller
                    956: machines,\^ 2000 for larger ones.
                    957: .IP \*MMAXSTACKS\fP
                    958: .br
                    959: The number of co-expression stacks initially allocated.  Use 2 for
                    960: smaller machines,\^ 4 for larger ones.
                    961: .IP \*MNUMBUF\fP
                    962: .br
                    963: The number of i/o buffers available.  When a file is opened,\^ a buffer
                    964: is assigned to the file if one is available.  A value from 5 to
                    965: 10 is recommended.
                    966: .IP \*MINTSIZE\fP
                    967: .IP \*MLOGINTSIZE\fP
                    968: .IP \*MLONGS\fP
                    969: .IP \*MMINSHORT\fP
                    970: .IP \*MMAXSHORT\fP
                    971: .br
                    972: These constants must be set to the values they were given (if any) in
                    973: \*Mlink/ilink.h\fP.
                    974: .IP \*MMINLONG\fP
                    975: .br
                    976: The smallest value that can be represented in a \*Mlong\fP.
                    977: .IP \*MMAXLONG\fP
                    978: .br
                    979: The largest value that can be represented in a \*Mlong\fP.
                    980: .IP \*MLGHUGE\fP
                    981: .br
                    982: The highest base-10 exponent plus 1 representable by a \*Mfloat\fP.
                    983: For example,\^ on the VAX,\^ the highest number representable by a \*Mfloat\fP
                    984: is about 1.7x10\u38\d.  Thus,\^ \*MLGHUGE\fP is 39 on the VAX.
                    985: .IP \*MFRAMELIMIT\fP
                    986: .br
                    987: As discussed above,\^ set \*MFRAMELIMIT\fP to the number of words
                    988: between the low word of the procedure frame marker and the word that
                    989: the procedure frame pointer references.
                    990: .IP \*MSTKBASE\fP
                    991: .br
                    992: This value represents the approximate base of the stack when
                    993: execution begins.  On machines such as the VAX,\^ where the stack grows
                    994: down from high memory,\^ \*MSTKBASE\fP should have a high value,\^ whereas
                    995: on machines where the stack grows up from low memory,\^ \*MSTKBASE\fP
                    996: should have a low value.  The \fIman\fP page for \fIexec(2)\fP usually
                    997: specifies the initial value for the stack pointer when program
                    998: execution begins.  If uncertain,\^ be extreme with the value.
                    999: .IP \*MGRANSIZE\fP
                   1000: .br
                   1001: The granularity of memory allocations.  Calls to \fIbrk(2)\fP are
                   1002: used to expand the main memory that is being used.  When \fIbrk\fP
                   1003: is given an address to expand to,\^ it rounds it to a multiple of
                   1004: a certain number.  That number should be used for \*MGRANSIZE\fP.
                   1005: The \fIman\fP page for \fIbrk(2)\fP should state what value is used
                   1006: on a particular system.
                   1007: .IP \*MDclSave\fP
                   1008: .br
                   1009: Give \*MDclSave\fP the value needed as previously described.
                   1010: .IP \*MEntryPoint(x)\fP
                   1011: .br
                   1012: \*MEntryPoint\fP is a macro that is used to yield the address of the
                   1013: first instruction of the C routine \*Mx\fP that is past any procedure
                   1014: entry protocol instructions.  On the VAX,\^ the register mask is two
                   1015: bytes long and thus the first executable instruction of a routine \*Mx\fP
                   1016: is at \*M(char *)x + 2\fP.  On the PDP-11,\^ there is a four-byte instruction
                   1017: at the start of each routine that calls the routine \*Mcsv\fP to
                   1018: save registers and establish the procedure frame.  Thus for the
                   1019: PDP-11,\^ \*MEntryPoint(x)\fP is \*M(char *)x + 4\fP.  Values calculated
                   1020: by \*MEntryPoint\fP are used in \*Minvoke\fP.
                   1021: .IP \*MDummyFcn(name)\fP
                   1022: .br
                   1023: Initially,\^ each of the assembly language
                   1024: portions of the system that must be filled in
                   1025: consists of a single line of the form \*MDummyFcn(name)\fP.
                   1026: \*MDummyFcn\fP should be defined to generate \fIassembly\fP language
                   1027: statements that
                   1028: form a dummy routine with the label \*Mname\fP.  This can be as
                   1029: simple as a label and a global declaration.  It is advisable to include
                   1030: as part of the definition something that will cause a program abort.
                   1031: A halt instruction usually does the job.  Thus,\^ the system can be
                   1032: built and will function normally unless an incomplete routine is
                   1033: called.
                   1034: .IP \*MDummyDcl(x)\fP
                   1035: .br
                   1036: A macro that should expand into an assembly language declaration that
                   1037: allocates a word of storage for a variable named \*Mx\fP.
                   1038: .IP \*MDummyRef(x)\fP
                   1039: .br
                   1040: A macro that should expand into an assembly language reference to the
                   1041: variable \*Mx\fP.
                   1042: .IP \*MGlobal(x)\fP
                   1043: .br
                   1044: A macro that should expand into an assembly language
                   1045: declaration of \*Mx\fP as a global symbol.
                   1046: .IP \*Mfp\fP,\^\ \*Mefp\fP,\^\ \*Mgfp\fP,\^\ \*Mipc\fP
                   1047: .br
                   1048: It is advisable to use \*M#define\fPs for these registers rather than
                   1049: naming them explicitly in the code.
                   1050: .IP \*Mcset_display\fP
                   1051: .br
                   1052: This is a rather complicated macro that is used to initialize the
                   1053: values of \*Mcset\fPs such as \*M&cset\fP and \*M&lcase\fP.  If the
                   1054: target machine has \*Mint\fPs with 32 or 16 bits,\^ then the VAX or
                   1055: PDP-11 definition (respectively) of \*Mcset_display\fP may be used.
                   1056: If this is not the case,\^ \*Mcset_display\fP will have to be
                   1057: hand-crafted and the various uses of it will have to be altered for
                   1058: the machine in question.  Briefly,\^ a \*Mcset_display\fP specifies
                   1059: which of the 256 bits that comprise a cset are to be set to 1.
                   1060: For example,\^ the \*Mcset_display\fP for \*M&cset\fP has all the bits
                   1061: set to 1,\^ while \*M&ascii\fP has the first 128 bits set to 1.
                   1062: \*Mcset\fPs are accessed using the \*Msetb\fP and \*Mtstb\fP macros
                   1063: which are also defined in \*Mrt.h\fP.
                   1064: \*Mcset_display\fPs appear in \*Miconx/init.c\fP,\^
                   1065: \*Mfunctions/bal.c\fP,\^ and \*Mfunctions/trim.c\fP.  It may also be
                   1066: necessary to modify the definitions of \*MCSETSIZE\fP,\^ \*Msetb\fP,\^
                   1067: and \*Mtstb\fP.
                   1068: .PP
                   1069: Search for the second \*MPORT\fP section.
                   1070: \*MF_NQUAL\fP,\^ \*MF_VAR\fP,\^ \*MF_TVAR\fP,\^ \*MF_PTR\fP,\^ \*MF_NUM\fP,\^
                   1071: \*MF_INT\fP,\^ and \*MF_AGGR\fP should be given the
                   1072: same values they have in \*Mlink/datatype.h\fP.
                   1073: .PP
                   1074: Once \*Mrt.h\fP has been completed,\^ an analogous file,\^ \*Mh/defs.s\fP,\^
                   1075: must be tailored.  \*Mdefs.s\fP is a subset of \*Mrt.h\fP that is
                   1076: included in assembly language files.  The \*MPORT\fP section of \*Mdefs.s\fP
                   1077: lists a number of constants that must be defined.  Use the appropriate
                   1078: values from \*Mrt.h\fP for each constant.  If all assemblers used a
                   1079: default radix of 10 for constants,\^ it would be possible to tailor
                   1080: \*Mdefs.s\fP mechanically,\^ but since this is not the case,\^
                   1081: \*Mdefs.s\fP must be modified by hand.
                   1082: .NH 2
                   1083: Complete System Compilation
                   1084: .PP
                   1085: In order to determine if there are serious C compiler problems with
                   1086: the interpreter source,\^ the entire system should be made at this
                   1087: point.  Do a
                   1088: .Ds
                   1089: make Icon
                   1090: .De
                   1091: in the root directory of the Icon distribution.
                   1092: The entire
                   1093: system should compile without any problems.  The resulting
                   1094: interpreter will be completely dysfunctional,\^ but if it is built without
                   1095: any problems,\^ it provides further evidence that the C compiler is
                   1096: up to the task.
                   1097: .NH 2
                   1098: Porting the Assembly Language Routines
                   1099: .PP
                   1100: The porting of the assembly language routines is the most difficult part of
                   1101: porting Icon.  This document has a section for each assembly language routine
                   1102: and each routine is described in three ways:
                   1103: .Ds
                   1104: .ft R
                   1105: overview
                   1106: generic operation
                   1107: the routine on the VAX
                   1108: .De
                   1109: .PP
                   1110: The overview section briefly describes the action of the routine
                   1111: and how the routine may be encountered during the course of execution.
                   1112: The generic operation section tells what steps the routine
                   1113: takes to perform its given task.  Each major step that the routine takes
                   1114: is described.  These steps should be very similar from machine to
                   1115: machine.
                   1116: The section about the routine on the VAX details the
                   1117: operation of the routine on the VAX.  This section complements
                   1118: the comments contained in the source code for the routine and should
                   1119: be read with the source code at hand.
                   1120: This section is very machine specific. (Ideally there would be one
                   1121: such section for each existing Icon implementation.)
                   1122: .PP
                   1123: Each routine must be formulated for the target machine.  For the most
                   1124: part,\^ the best approach is to take the same steps that are
                   1125: taken on the VAX.  It is important to select the right level for
                   1126: modeling the VAX routines.  Try to recognize the steps that are
                   1127: made rather than following the operations on a per-instruction basis.
                   1128: The most important thing is to have a good understanding of what actions
                   1129: are performed and how these can be done on the target machine.
                   1130: .PP
                   1131: The first goal is to get a very simple Icon program working.  This
                   1132: first program is \*Mtest/hello.icn\fP.  It is quite short:
                   1133: .Ds
                   1134: procedure main()
                   1135:    write("hello world")
                   1136: end
                   1137: .De
                   1138: The basic routines mentioned above
                   1139: (\*Mstart.s\fP,\^ \*Msetbound.s\fP,\^ \*Minvoke.s\fP,\^ \*Minterp.s\fP,\^
                   1140: \*Mefail.s\fP,\^ and \*Mpfail.s\fP)
                   1141: must be implemented for even a very simple Icon program to work.
                   1142: However,\^ not all of these
                   1143: routines need to be written to make \*Mhello\fP \fIbegin\fP to work.
                   1144: .PP
                   1145: Translate and link \*Mhello\fP by running the translator and the
                   1146: linker:
                   1147: .Ds
                   1148: tran/itran hello.icn
                   1149: link/ilink hello.u1
                   1150: .De
                   1151: This creates an interpretable file named \*Mhello\fP.  Just to
                   1152: get the feel of things,\^ run the interpreter on the file:
                   1153: .Ds
                   1154: iconx/iconx hello
                   1155: .De
                   1156: A message of some type and a core dump should be produced.
                   1157: .PP
                   1158: The files \*Mtran/itran\fP,\^ \*Mlink/ilink\fP,\^ and \*Miconx/iconx\fP
                   1159: are copied into the \*Mbin\fP directory as the last action of
                   1160: .Ds
                   1161: make Icon
                   1162: .De
                   1163: in the root directory.
                   1164: The porter may find it convenient to link
                   1165: these files to the \*Mbin\fP directory and then place the full pathname
                   1166: of that directory in the search path.  It is necessary to
                   1167: remove \*Mitran\fP,\^ \*Milink\fP,\^ and \*Miconx\fP in the \*Mbin\fP
                   1168: directory before linking them.  Also,\^ if the files are linked,\^ the last
                   1169: step of \fImake\fP in the root directory will fail.  This failure is
                   1170: inconsequential.
                   1171: .PP
                   1172: As \*Mstart.s\fP et al. are written,\^ try stepping through them to be
                   1173: sure the correct actions are being performed.  Most of the assembly
                   1174: language source
                   1175: files are straight line code with a branch or two and it is possible to
                   1176: do a large amount of verification of the assembly code by single stepping
                   1177: through it with a debugger.
                   1178: .PP
                   1179: When a routine has been completed,\^ it may be added to the interpreter
                   1180: by doing a \fImake\fP in the directory containing the routine and
                   1181: then doing another \fImake\fP in the directory \*Miconx\fP.
