Annotation of researchv10no/cmd/sml/doc/papers/compiler/paper.ms, revision 1.1

1.1     ! root        1: .EQ
        !             2: delim $$
        !             3: define |-> ' down 3 size -3 | back 25 -> '
        !             4: .EN
        !             5: .nr PS 11
        !             6: .nr VS 13
        !             7: .fp 5 CW
        !             8: .ND August 5, 1987
        !             9: .TL
        !            10: A Standard ML Compiler
        !            11: .AU
        !            12: Andrew W. Appel \fR*\fP
        !            13: .FS
        !            14: * Supported by NSF Grant DCR-8603453 and by
        !            15: a Digital Equipment Corporation Faculty Incentive Grant.
        !            16: .FE
        !            17: .AI
        !            18: Dept. of Computer Science
        !            19: Princeton University
        !            20: Princeton, NJ 08544
        !            21: .AU
        !            22: David B. MacQueen \fR\(dg\fP
        !            23: .FS
        !            24: \(dg Part of this author's work was done while an SERC Senior Visiting
        !            25: Fellow at the University of Edinburgh.
        !            26: .FE
        !            27: .AI
        !            28: AT&T Bell Laboratories
        !            29: Murray Hill, NJ 07974
        !            30: .AB
        !            31: .LP
        !            32: Standard ML is a major revision of earlier dialects of the functional
        !            33: language ML.  We describe the first compiler written for Standard ML
        !            34: in Standard ML.  The compiler incorporates a number of novel features
        !            35: and techniques, and is probably the largest system written to date in
        !            36: Standard ML.
        !            37: .LP
        !            38: Great attention was paid to modularity in the construction of the
        !            39: compiler, leading to a successful large-scale test of the modular
        !            40: capabilities of Standard ML.  The front end 
        !            41: is useful for purposes other than compilation, and the back end is
        !            42: easily retargetable (we have code generators for the VAX and MC68020).
        !            43: The module facilities of Standard ML were taken into account early in
        !            44: the design of the compiler, and they particularly influenced the
        !            45: environment management component of the front end.  For example, the
        !            46: symbol table structure is designed for fast access to opened
        !            47: structures.
        !            48: .LP
        !            49: The front end of the compiler is a single phase that integrates
        !            50: parsing, environment management, and type checking.  The middle end
        !            51: uses a sophisticated decision tree scheme to produce efficient pattern
        !            52: matching code for functions and case expressions.  
        !            53: The abstract syntax produced by the front end is translated into a simple
        !            54: lambda-calculus-based intermediate representation that lends itself to
        !            55: easy case analysis and optimization in the code generator.  Special
        !            56: care was taken in designing the runtime data structures for fast
        !            57: allocation and garbage collection.
        !            58: .LP
        !            59: We describe the overall organization of the compiler and present some
        !            60: of the data representations and algorithms used in its various phases.
        !            61: We conclude with some lessons learned about the ML language itself and
        !            62: about compilers for modern functional languages.
        !            63: .AE
        !            64: .pn 1
        !            65: .nr VS 17
        !            66: .np
        !            67: .NH
        !            68: Introduction
        !            69: .LP
        !            70: The ML language is a typed functional language roughly based on
        !            71: Landin's ISWIM [1].
        !            72: It was originally designed in the
        !            73: mid-1970s as the metalanguage of Edinburgh LCF, a machine-assisted
        !            74: reasoning system, and its features were in part motivated by its
        !            75: intended use to express proof tactics.  However, these features made
        !            76: it an attractive vehicle for general purpose symbolic programming, and
        !            77: it wasn't long before free-standing implementations appeared, such as
        !            78: Cardelli's compiler[2],
        !            79: which was written in Pascal.  In 1983 a
        !            80: group of interested parties began work on an extensive revision of the
        !            81: language design that lead to Standard ML [3,\|4,\|5,\|6].
        !            82: Standard ML
        !            83: extended the earlier versions in certain ways and incorporated ideas
        !            84: from Hope[7],
        !            85: another language developed in Edinburgh in the late 1970s.
        !            86: It also included a module facility that significantly extends the
        !            87: basic polymorphic type system of earlier ML versions and supports
        !            88: large-scale program development.
        !            89: 
        !            90: Several implementations of Standard ML have been under development in
        !            91: recent years.  
        !            92: Luca Cardelli's compiler was modified to be partially
        !            93: compatible with Standard ML.  Meanwhile, Cardelli's original compiler
        !            94: had been reimplemented in its own variant of ML by Kevin Mitchell and
        !            95: Alan Mycroft at Edinburgh, and this compiler was also modified with
        !            96: the help of Robert Harper to make it conform fairly closely to the
        !            97: Standard ML definition.  A new compiler was developed in Cambridge by
        !            98: David Matthews, using his Poly language [8,\|9]
        !            99: as the implementation
        !           100: language, and sharing the Poly back end.
        !           101: At INRIA, a group headed by Gerard Huet and Guy Cousineau
        !           102: have been implementing an ML variant called CAML[10]
        !           103: that is intermediate
        !           104: between the LCF version and Standard ML.
        !           105: 
        !           106: Yet there was still justification for another Standard ML
        !           107: compiler.  One of the main motivations was to build a compiler that
        !           108: would itself be implemented in Standard ML.  Such a compiler would
        !           109: have several advantages.  It would be a good basis for building
        !           110: meta-level tools (pretty-printers, program analyzers, debuggers, etc.)
        !           111: that could share data types and code with the compiler itself and
        !           112: might be used in ML application programs or in an ML programming environment.
        !           113: It would also serve as a large-scale test of the Standard ML design,
        !           114: particularly the module facility, which was the most novel and untested
        !           115: part of the language.  Finally, a new compiler provides an
        !           116: opportunity to try out new implementation strategies (for code
        !           117: generation, optimization, type checking, and modules, for instance).
        !           118: A simple lambda-calculus intermediate representation
        !           119: makes the compiler much cleaner, and could also
        !           120: provide a base for
        !           121: research in evaluation algorithms (call-by-value, normal-order, lazy, 
        !           122: combinator-based, parallel, and others) for lambda-calculus.
        !           123: 
        !           124: As a call-by-value, higher-order functional language, many of the
        !           125: conventional compilation techniques for functional languages apply to
        !           126: Standard ML.  The distinguishing features of ML are its polymorphic
        !           127: type system, the notion of concrete data types for expressing disjoint
        !           128: unions and recursion, the use of pattern matching based on the
        !           129: constructor-functions associated with data types, the use of typed,
        !           130: value-carrying exceptions, the fact that all updateable variables are
        !           131: introduced explicitly by application of the 
        !           132: .B ref
        !           133: constructor, and
        !           134: modular programming based on signatures (interface specifications),
        !           135: structures (implementations of interfaces) and functors (functions
        !           136: from structures to structures).  These features, singly or in combination,
        !           137: produce novel challenges in compiler design.  They also, it turns
        !           138: out, can be effectively exploited in the writing of a compiler.
        !           139: 
        !           140: The compiler was written using the Edinburgh Standard ML compiler and
        !           141: then bootstrapped to compile itself.  However, the compiler shares
        !           142: no code with previous compilers; and the runtime system,
        !           143: written in C with the help of Peter Weinberger, is also completely
        !           144: new.
        !           145: .NH
        !           146: Architecture
        !           147: .LP
        !           148: The ML compiler described here consists of three phases.  The front
        !           149: end consists of a recursive descent parser into which environment
        !           150: management (determining scopes, associating binding and applied
        !           151: occurrences of variables) and type checking are integrated.  The
        !           152: output of the front end is a fully typed abstract syntax tree.
        !           153: The front end makes no decisions about runtime
        !           154: locations of variables.  It simply assigns them unique identifiers,
        !           155: and the back end makes representation decisions.
        !           156: 
        !           157: The middle end performs a simple translation of this abstract tree
        !           158: into an untyped lambda calculus representation,
        !           159: and produces an optimized representation of pattern-matching as one-dimensional
        !           160: switch expressions.
        !           161: 
        !           162: The back end transforms the lambda
        !           163: form into target machine code using a largely machine-independent code
        !           164: generator based on an ``abstract machine'' interface.  The expressions
        !           165: given to the back end never contain free variables; this simplifies
        !           166: the interface between the code generator and the runtime system.
        !           167: .NH
        !           168: The Front End
        !           169: .LP
        !           170: The front end of the compiler consists of a conventional lexical
        !           171: analyzer, abstract syntax datatypes, environment management machinery,
        !           172: a type checker, and a recursive descent parser that drives the lexical
        !           173: analysis, environment management and type checking.
        !           174: .NH 2
        !           175: Abstract syntax
        !           176: .LP
        !           177: The abstract syntax is implemented in two layers.  There is a
        !           178: collection of data types defining the ``bare'' abstract syntax
        !           179: that forms a minimal kernel of the language.  For instance, the
        !           180: \fIexpression\fP data type is defined as follows:
        !           181: .DS 
        !           182: .ft CW
        !           183: .vs 13
        !           184:    datatype exp = VARexp of var ref
        !           185:                 | CONexp of datacon
        !           186:                 | INTexp of int
        !           187:                 | REALexp of real
        !           188:                 | STRINGexp of string
        !           189:                 | RECORDexp of (numberedLabel * exp) list
        !           190:                 | SEQexp of exp list            
        !           191:                 | APPexp of exp * exp           
        !           192:                 | CONSTRAINTexp of exp * ty
        !           193:                 | HANDLEexp of exp * handler
        !           194:                 | RAISEXexp of exp
        !           195:                 | LETexp of dec * exp
        !           196:                 | CASEexp of exp * rule list
        !           197:                 | FNexp of rule list
        !           198: .vs 17
        !           199: .ft R
        !           200: .DE   
        !           201: The full abstract syntax consists of these bare syntax types augmented
        !           202: with a small number of derived forms, such as clausal function definitions.
        !           203: These derived forms are implemented as functions that generate the
        !           204: appropriate expansion into bare syntax forms.
        !           205: .NH 2
        !           206: Environment management
        !           207: .LP
        !           208: We separate the idea of symbol-table manipulation
        !           209: from the details of the kinds of bindings found in ML.
        !           210: There is a generic environment mechanism that performs binding,
        !           211: lookup, and scope management functions; it
        !           212: is implemented as a functor that takes the
        !           213: binding type as a parameter.
        !           214: The binding type is a disjoint union of the specific categories of identifiers
        !           215: (variables, constructors, type constructors, etc.):
        !           216: .DS
        !           217: .ft CW
        !           218: .vs 13
        !           219:     datatype binding 
        !           220:       = VARbind of var          \fI(* variables *)\fP
        !           221:       | CONbind of datacon      \fI(* data constructors *)\fP
        !           222:       | EXNbind of datacon      \fI(* exceptions *)\fP
        !           223:       | TYCbind of tycon ref    \fI(* type constructors (patchable) *)\fP
        !           224:       | TYVbind of tyvar        \fI(* type variables *)\fP
        !           225:       | SIGbind of signatureVar \fI(* signatures *)\fP
        !           226:       | STRbind of structureVar \fI(* structures *)\fP
        !           227:       | FCTbind of functorVar   \fI(* functors *)\fP
        !           228:       | FIXbind of fixity       \fI(* infix attributes of variables *)\fP
        !           229: .vs 17
        !           230: .ft R
        !           231: .DE
        !           232: The generic environment uses hash tables, which we call binding tables
        !           233: or simply tables, to map symbols to bindings.  The environment
        !           234: consists roughly of a stack of tables to hold the actual bindings and
        !           235: a stack of ``remarks'' that records information about bindings and
        !           236: scopes.  Only the stack of tables is used when looking up symbols,
        !           237: while the remarks are used when entering and leaving scopes and when
        !           238: binding symbols.  There are two types of tables in the stack of
        !           239: tables: those representing the bindings in a previously defined
        !           240: structure that has been ``opened'', and those representing bindings in
        !           241: currently open scopes (i.e. top-level, let-bound, and function
        !           242: parameter bindings).
        !           243: 
        !           244: As indicated above, the binding type is a union of all the
        !           245: various kinds of bindings a symbol may have.  The tables therefore
        !           246: hold bindings of all kinds, and the specialized access functions for
        !           247: variables, constructors, etc. search only for bindings of the desired
        !           248: kind.  Thus it is possible for a symbol to simultaneously have several
        !           249: bindings of different kinds, e.g. as a variable, a type constructor,
        !           250: and an exception.  However, variables and constructors share the same
        !           251: name space, so it is not possible for a symbol to be simultaneously
        !           252: bound as a variable and a constructor, despite the fact that these are
        !           253: two disjoint kinds of bindings.  The information associated with a
        !           254: symbol, such as the type of a variable or the signature of a
        !           255: structure, is contained in the projected value of its binding (e.g. the
        !           256: var or structureVar).
        !           257: .NH 2
        !           258: Parsing
        !           259: .LP
        !           260: The lexical analyzer is conventional, turning the input
        !           261: character stream into a stream of tokens, recognizing keywords,
        !           262: identifiers, and string and numeric literals.
        !           263: The parser reads tokens from the lexical analyzer and produces an
        !           264: abstract syntax tree containing type and binding information.
        !           265: Parsing is done by recursive descent, with a precedence parser
        !           266: for infix operators in expressions.  Unlike previous ML compilers, but
        !           267: like most compilers for Algol-like languages, management of the
        !           268: compile-time environment is done at the same time as parsing:  identifiers
        !           269: are entered into the environment as they are defined, and looked up
        !           270: in the environment as they are used.
        !           271: 
        !           272: Two particular problems crop up in parsing ML: identifiers may be declared
        !           273: infix, or have their precedence changed, in lexically-scoped declarations;
        !           274: and constructors cannot be syntactically distinguished from identifiers.
        !           275: To handle the infix-operator problem, the lexical analyzer makes no distinction
        !           276: between ordinary identifiers and identifiers that are declared as infix operators.
        !           277: Instead, the operator-precedence expression-parser looks up each identifier
        !           278: in the symbol table to determine if it has a fixity binding, and if it does,
        !           279: what its precedence is.  Recognizing constructors can be done similarly
        !           280: by looking up identifiers
        !           281: in the environment.
        !           282: This scheme works nicely in the presence of
        !           283: modules and
        !           284: .B open
        !           285: declarations:  if a module $A$ is opened in a scope (even a local scope),
        !           286: and an infix identifier $i$ is used from $A$, the precedence parser will
        !           287: automatically find the precedence declaration of $i$ in the environment.
        !           288: 
        !           289: Forward reference to identifiers is legal in mutually recursive definitions.
        !           290: For example, in the function declaration:
        !           291: .DS
        !           292: .ft CW
        !           293: .vs 13
        !           294: fun f(a,b) = if a=0 then b else g(a-1)
        !           295: and g(x) = f(x,x)
        !           296: .vs 17
        !           297: .ft R
        !           298: .DE
        !           299: the first occurrence of the identifier \f5g\fP is in the body of \f5f\fP,
        !           300: which is before the function \f5g\fP is declared.  To handle this problem,
        !           301: all free identifiers found in the body of a recursive function definition are
        !           302: kept on a ``backpatch list;'' when the mutually recursive declarations
        !           303: are completely parsed, any references to identifiers defined in those
        !           304: declarations are patched.  This extends straightforwardly to nested
        !           305: mutually recursive definitions.  Note that constructors are not treated
        !           306: this way; constructors cannot be redefined by
        !           307: .B fun
        !           308: or
        !           309: .B
        !           310: val rec
        !           311: .R
        !           312: definitions, so they are not put on the backpatch list.  This means, of course,
        !           313: that all identifiers must be looked up just to determine if they are constructors.
        !           314: .NH 2
        !           315: Type Checking
        !           316: .LP
        !           317: The conventional polymorphic type checking algorithm[11,\|12]
        !           318: is used at present.
        !           319: The basis of the algorithm
        !           320: is the unification of type terms by destructively instantiating type
        !           321: variables so that they become indirections to other types.  The
        !           322: administration of generic or bound type variables is done more
        !           323: systematically than in the Cardelli and Edinburgh compilers, but
        !           324: generic types are still copied for each applied occurrence of a
        !           325: variable.  A new type representation and type checking algorithm based
        !           326: on structure sharing (in the Boyer-Moore sense [13])
        !           327: has been prototyped with the help of Nick Rothwell.  In this new scheme,
        !           328: a polymorphic type is represented as a scheme whose bound variables are
        !           329: indices into an environment vector; different instantiations share the
        !           330: scheme but have different environment vectors.
        !           331: 
        !           332: Overloading is accommodated to a limited extent.  Currently only
        !           333: certain predefined primitives such as ``+'' are overloaded, and there
        !           334: are no facilities provided by the language for defining new overloaded
        !           335: identifiers.  Overloading is implemented as in the Edinburgh compiler,
        !           336: where a type scheme is associated with the overloaded identifier and
        !           337: each overloading (variant) has a type that is an instance of that
        !           338: scheme.  The contextual type of an occurrence of the identifier is
        !           339: matched against the scheme and the resulting instantiations of the
        !           340: scheme type variables is used to choose the appropriate variant.
        !           341: This is simpler though more restrictive than the technique used in Hope[7],
        !           342: where variants could have arbitrary types, as long as they
        !           343: were incomparable (i.e. one was not an instance of another).
        !           344: .NH 2
        !           345: Reference and exception types
        !           346: .LP
        !           347: The treatment of references and exceptions with ``open'' types
        !           348: is new, and is based on the fact that the contents of a reference
        !           349: cell cannot be constrained to be polymorphic, and so must be considered
        !           350: to be monomorphic.  The following example illustrates the problem.
        !           351: .DS
        !           352: .ft CW
        !           353: .vs 13
        !           354: let val s = ref (fn x => x)
        !           355:  in s := (fn x => x+1); (!s) true
        !           356: end
        !           357: .vs 17
        !           358: .ft R
        !           359: .DE
        !           360: If s were given the polymorphic type 
        !           361: $roman "ALL"^ alpha "." ( alpha  -> alpha ) bold roman "ref"$, then this
        !           362: expression would type check, permitting an obvious type error.  To prevent
        !           363: this, we insist that the type of an applied occurrence of the ref
        !           364: constructor should always be given a ``ground'' type (one with no locally-bound
        !           365: type variables).
        !           366: 
        !           367: However, functions whose application can create
        !           368: reference variables can still have polymorphic types of a restricted
        !           369: kind.  Consider the declaration
        !           370: .DS
        !           371: .ft CW
        !           372: .vs 13
        !           373: val F = fn x => let val r = ref x
        !           374:                  in !r
        !           375:                 end
        !           376: .vs 17
        !           377: .ft R  
        !           378: .DE
        !           379: Here the function F can be given polymorphic type
        !           380: $roman "ALL"^ alpha sup 1 roman "."
        !           381: alpha sup 1 -> alpha sup 1$ where $alpha sup 1$
        !           382: is a special kind of type variable
        !           383: called a \fIweak\fP type variable (the superscript ``1'' indicates that
        !           384: there is one lambda abstraction suspending the creation of the
        !           385: ref cell).  When F is applied to an argument, a reference value
        !           386: of type $alpha sup 1$ is created, and hence this weak type
        !           387: variable must be instantiated to a ground type.  This means that
        !           388: an expression like (F nil) would not be properly typed.  In contrast,
        !           389: the type 
        !           390: $alpha sup 1 bold roman "ref"$
        !           391: assigned to r is permissible because $alpha sup 1$ is implicitly
        !           392: bound in an outer scope and within the scope of its binding is treated as
        !           393: a constant type.
        !           394: 
        !           395: Exception declarations raise similar problems, which are handled
        !           396: by an analogous use of weak type variables.
        !           397: 
        !           398: The Cardelli and Edinburgh compilers used an earlier form of this
        !           399: treatment proposed by Damas[14].
        !           400: This earlier version was looser in one respect (it allowed unbound
        !           401: weak type variables in types) and more restrictive
        !           402: in another (multiple levels of lambda abstraction were not 
        !           403: allowed), and it had some rather counterintuitive effects.
        !           404: .NH 2
        !           405: Modules
        !           406: .LP
        !           407: The simplest kind of module in Standard ML is called a structure.  It is
        !           408: most simply defined as an encapsulated set of declarations, as in
        !           409: .DS
        !           410: .ft CW
        !           411: .vs 13
        !           412:    structure S =
        !           413:      struct
        !           414:         type t = int * int
        !           415:         val x = (13, 18)
        !           416:      end
        !           417: .vs 17
        !           418: .ft R
        !           419: .DE
        !           420: The compile-time symbol table representation of such a structure is
        !           421: a binding table containing a binding for each identifier
        !           422: declared in the structure.  There is also an abstract syntax tree for
        !           423: the structure, from which code is generated.
        !           424: 
        !           425: A signature is an interface specification for a structure.  For
        !           426: instance, the structure above will match the following signature:
        !           427: .DS
        !           428: .ft CW
        !           429: .vs 13
        !           430:    signature SS =
        !           431:      sig
        !           432:         type t
        !           433:         val x : t
        !           434:      end
        !           435: .vs 17
        !           436: .ft R
        !           437: .DE
        !           438: Signatures are also represented by binding tables, but an abstract
        !           439: syntax tree is not needed since there is no code generated for
        !           440: signatures.
        !           441: 
        !           442: A functor is a structure abstracted with respect to one or more
        !           443: structure parameters, which are characterized by signatures.  It
        !           444: follows that the representation of a functor consists of a structure
        !           445: representation (the body) and a parameter specification.  As part of
        !           446: the parameter specification one can impose sharing constraints among
        !           447: the parameters; these constraints are represented by transforming the
        !           448: parameter specification into a directed acyclic graph of parameter
        !           449: structures.
        !           450: 
        !           451: Functors introduce some additional forms of structures, namely formal
        !           452: parameter structures and structures defined by functor applications.
        !           453: For formal parameters, the parameter signature serves as a virtual
        !           454: template defining the components of the structure.  In the case of 
        !           455: functor applications, the functor body (which may itself be a functor
        !           456: application) is closed with respect to the environment formed by
        !           457: binding actual parameters to the formals.
        !           458: 
        !           459: Functor application in principle involves a beta-reduction in which
        !           460: the actual structure parameters are substituted for the formal
        !           461: parameters in the functor body to create the result structure.  This
        !           462: beta-reduction acts at two levels\(emthe static level, involving the
        !           463: type aspects of the structures involved, and the dynamic level,
        !           464: involving the value and exception components.  The dynamic aspect of
        !           465: the reduction is realized by generating code that performs an ordinary
        !           466: function application.  The static aspect is performed at compile time,
        !           467: producing a static representation of the result structure by
        !           468: ``instantiating'' the body of the functor using the formal/actual
        !           469: parameter binding environment.
        !           470: 
        !           471: .NH 3
        !           472: Naive copying
        !           473: .LP
        !           474: This static instantiation of functor bodies can be implemented in
        !           475: several ways.  The simplest and most direct is to perform a
        !           476: straightforward nondestructive substitution, copying the
        !           477: representation of the functor body in the process.  This is roughly
        !           478: the approach followed by Harper and Matthews in their implementations,
        !           479: but the space consumed has been found to be excessive.
        !           480: .NH 3
        !           481: Functor application closures
        !           482: .LP
        !           483: A second
        !           484: alternative initially employed in our compiler was to represent
        !           485: structures formed by functor applications as closures, consisting of
        !           486: the unmodified functor body structure together with the parameter
        !           487: binding environment.  This approach saves some copying, but it turns
        !           488: out to be rather unwieldy because functor applications can appear
        !           489: within the body of a functor definition, and consequently the actual
        !           490: parameters may be expressions that themselves refer to other formal
        !           491: parameter variables.  For instance, consider
        !           492: .DS
        !           493: .ft CW
        !           494: .vs 13
        !           495: functor F(X:sigX) = bodyF
        !           496: 
        !           497: functor G(Y:sigY) = 
        !           498:   struct
        !           499:     structure S = F(Y)
        !           500:     ...
        !           501:   end
        !           502: 
        !           503: structure B = G(A)
        !           504: .vs 17
        !           505: .ft R
        !           506: .DE   
        !           507: In this example, the interpretation of \f5B.S\fP is given in terms of the
        !           508: closure of \f5F(X)\fP, which is a pair $< bodyF , ~ roman "{" X |->
        !           509:  Y roman "}" > $.  Since the
        !           510: binding of \f5X\fP involves \f5Y\fP, this must be interpreted in the additional
        !           511: context of the binding environment 
        !           512: $ roman "{" Y |-> A roman "}" $ produced by the
        !           513: application \f5G(A)\fP.  In general, a functor application closure may need
        !           514: several layers of such contexts to be properly defined.
        !           515: 
        !           516: Another drawback of this approach 
        !           517: is that it still performs the
        !           518: copying that is inherent in the naive implementation
        !           519: of signature matching, where a declaration such as
        !           520: .DS
        !           521: .ft CW
        !           522: .vs 13
        !           523: structure S': sig1 = S
        !           524: .vs 17
        !           525: .ft R
        !           526: .DE   
        !           527: may cause a partial copy of \f5S\fP to be constructed and bound to \f5S'\fP.
        !           528: .NH 3
        !           529: Structure-sharing
        !           530: .LP
        !           531: The final approach is inspired, like Harper's prototype implementation,
        !           532: by the semantic model developed by Harper, Milner, and Tofte[15].
        !           533: We return to the straightforward idea of
        !           534: actually performing the static reductions to obtain instantiated
        !           535: copies of the functor body, but a structure-sharing representation is
        !           536: used to minimize the amount of copying.  In this representation, which
        !           537: is similar in principle to the structure-sharing representation of
        !           538: polytypes alluded to above, the statically ``interesting'' components
        !           539: of each structure (i.e. types and substructures) are represented by
        !           540: indices into an environment vector that is associated with the binding
        !           541: table in another, simpler form of closure.  The copying of structures that is
        !           542: entailed by functor application and signature matching can then be
        !           543: reduced to copying the closure objects and their associated
        !           544: environment vectors, leaving the binding tables themselves unaffected.
        !           545: 
        !           546: Each type or structure component also has an identifying ``name'',
        !           547: which is basically just a unique number or time stamp.  These names
        !           548: serve a couple of purposes: representing sharing constraints, and
        !           549: identifying two sorts of ``bound'' components.  To capture sharing
        !           550: specifications in signatures, components that are required to ``share''
        !           551: (i.e. represent views of the same structure)
        !           552: are given the same name.  There are two forms of bound components:
        !           553: those incorporated in functor parameter structures, and those
        !           554: representing the components of a functor body that are created each
        !           555: time the functor is applied.  All other names are free, and represent
        !           556: actual structures.  The first sort of bound names are mapped to
        !           557: components by the process of signature matching between formal and
        !           558: actual functor parameters.  The second sort are replaced by new unique
        !           559: names during the static elaboration of functor applications.
        !           560: 
        !           561: This third approach is currently under development and will eventually
        !           562: replace the implementation based on functor application closures.
        !           563: .NH
        !           564: Translating abstract syntax into lambda calculus
        !           565: .LP
        !           566: The middle end of the compiler translates type-checked abstract
        !           567: syntax trees into lambda-calculus trees.  Because all of the environment
        !           568: and scope manipulation has been done by the front end, and all of the
        !           569: abstract-machine manipulation will be done by the back end, the
        !           570: translator is simple, small, and fast.
        !           571: 
        !           572: Though most optimization is purposely left for the back end, some
        !           573: simplification
        !           574: is done in the middle end.  Formally recursive
        !           575: .B fun
        !           576: definitions are examined to see if they really contain references  to
        !           577: themselves;
        !           578: if not they are replaced by (simpler) nonrecursive definitions.
        !           579: The composition of structure-creation and structure-thinning is evaluated
        !           580: at compile-time rather than run-time.
        !           581: The ML equality predicate is a special
        !           582: function that can be applied to any concrete data type (one built up from
        !           583: primitive types and reference types using record and datatype constructors).
        !           584: The algorithm for testing equality is a recursive tree traversal that varies
        !           585: in its details for each type to which it is applied.
        !           586: For each such occurrence, the translator builds an equality predicate
        !           587: appropriate to the instance.
        !           588: 
        !           589: Recent changes to Standard ML will require the use of an ``equality
        !           590: interpreter'' which traverses arbitrary structures in the runtime system,
        !           591: distinguishing certain kinds of cells (like 
        !           592: .B ref
        !           593: cells) by special descriptor tags.  The present implementation can still
        !           594: be used as an optimized version in those cases where enough is known about
        !           595: the type at compile time.
        !           596: .NH 2
        !           597: Translation of pattern-matching
        !           598: .LP
        !           599: One important and nontrivial job of the middle end
        !           600: is to select optimal comparison
        !           601: sequences for the compilation of pattern-matching.  A \fImatch\fP in ML
        !           602: is a sequence of pattern-expression pairs, called \fIrules\fP.  When
        !           603: a match is applied to an argument, the argument is matched against the
        !           604: patterns, and the first rule with a matching pattern
        !           605: is selected and its expression
        !           606: is evaluated.  A pattern is either a constant, which must match the argument
        !           607: exactly; a variable, which matches any argument (and is bound to it for
        !           608: the purposes of evaluating the expression); a tuple of patterns, which matches
        !           609: a corresponding tuple argument whose components match the components of the
        !           610: pattern-tuple; or a constructor applied to a pattern, which matches
        !           611: an argument built using that constructor if the rest of the pattern matches.
        !           612: 
        !           613: As an example, consider the case statement:
        !           614: .DS
        !           615: .ft CW
        !           616: .vs 13
        !           617: case a
        !           618:  of (false, nil)   =>  nil
        !           619:   | (true, W)    =>   W
        !           620:   | (false, cons(X, nil))  =>  cons(X, cons(X, nil))
        !           621:   | (false, cons(Y, Z))  =>  Z
        !           622: .vs 17
        !           623: .ft R
        !           624: .DE
        !           625: The argument (false, cons(4,nil)) matches the third pattern, while the
        !           626: argument (true, cons(4,nil)) matches the second pattern.
        !           627: 
        !           628: One could imagine a naive compilation of matches just by testing the
        !           629: rules in turn as called for by the semantics.  Our approach is to transform a 
        !           630: sequence of patterns into a decision tree[16].
        !           631: Each internal node of the decision tree corresponds to a matching test and
        !           632: each branch is labeled with one of the possible results of the matching test
        !           633: and with a list of the patterns that remain potential candidates in that
        !           634: case.
        !           635: It is then straightforward to translate the decision tree into code for
        !           636: pattern matching.  During the construction of the decision tree it is also
        !           637: easy to determine whether the pattern set is ``exhaustive,'' meaning that
        !           638: every possible argument value matches at least one pattern; and whether
        !           639: there are any ``redundant'' patterns that only match arguments covered
        !           640: by previous rules.  Nonexhaustive and redundant patterns result in warning
        !           641: messages by the compiler.
        !           642: 
        !           643: Our goal in constructing the decision tree is simply to minimize the total
        !           644: number of test-nodes.  This minimizes the size of the generated code and
        !           645: also generally reduces the number of tests performed on value terms.
        !           646: However, finding the decision tree with the minimum number of nodes is
        !           647: an NP-complete problem[16];
        !           648: so a set of efficient heuristics is used that in practice produces an optimal
        !           649: decision tree in almost all cases.
        !           650: 
        !           651: In the example above, testing the first component of the pair for truth
        !           652: or falsity suffices to distinguish the second rule from the others;
        !           653: then testing the second component to see whether it is \f5cons\fP or
        !           654: \f5nil\fP distinguishes the first rule from the last two; one more test
        !           655: suffices to separate the last two rules.  Thus, in just two or three
        !           656: tests, the appropriate rule can be selected; instead of two or three
        !           657: tests \fIper rule\fP that the naive algorithm would use.
        !           658: 
        !           659: The details of this algorithm were originally worked out by Marianne Baudinet
        !           660: and have been implemented in our compiler by Trevor Jim.
        !           661: .NH 2
        !           662: The Lambda Language
        !           663: .LP
        !           664: The front end of the compiler translates ML source into
        !           665: lambda calculus; the back end translates lambda-calculus into machine
        !           666: code for the VAX or MC68020.
        !           667: A significant advantage of having a very simple lambda-calculus
        !           668: as the intermediate representation is that many compiler
        !           669: optimizations can be cleanly described.  This is the approach
        !           670: successfully taken in the Rabbit[17]
        !           671: and Orbit[18]
        !           672: compilers for Scheme.
        !           673: 
        !           674: The ``lambda language'' is simply an
        !           675: ML datatype, as follows:
        !           676: .DS
        !           677: .ft CW
        !           678: .vs 13
        !           679: datatype lexp
        !           680:   = VAR of lvar
        !           681:   | FN of lvar * lexp
        !           682:   | FIX of lvar list * lexp list * lexp
        !           683:   | APP of lexp * lexp
        !           684:   | CON of con * lexp
        !           685:   | DECON of con * lexp
        !           686:   | SWITCH of lexp * (con*lexp) list * lexp Option
        !           687:   | RECORD of lexp list
        !           688:   | SELECT of int * lexp
        !           689:   | RAISE of lexp
        !           690:   | HANDLE of lexp * lexp
        !           691: .vs 17
        !           692: .ft R
        !           693: .DE
        !           694: The elements of the lambda language are variables (VAR),
        !           695: abstraction (FN), simultaneous recursive definition (FIX),
        !           696: application (APP), constructors (CON, DECON, and SWITCH),
        !           697: tuples (RECORD and SELECT), and exception handling (HANDLE and
        !           698: RAISE).
        !           699: Exception names, and integer, real, and string literals,
        !           700: are represented as constructors.
        !           701: 
        !           702: There are no built-in library functions, at least from the point of
        !           703: view of the compiler;  instead, the library module is lambda-bound
        !           704: at top level to some variable $v$, and functions from the
        !           705: library are just components of the structure $v$.  Certain functions
        !           706: that are compiled in-line are represented as fields of a special structure
        !           707: bound to a distinguished variable $v sub 0$.
        !           708: 
        !           709: Constructor-expressions are explicitly represented in the lambda
        !           710: language using $roman "CON" ( c, e) $ representing the application
        !           711: of the constructor $c$ to the expression $e$, and $roman "DECON"
        !           712: (c,e)$ representing the removal of $c$ from the constructed
        !           713: expression $e$.
        !           714: The lambda language could be simplified by
        !           715: representing constructor-expressions as pairs of (tag,value),
        !           716: and removing the CON and DECON operators.  The advantage of using
        !           717: CON and DECON is that the lambda language can be a typed lambda
        !           718: calculus; the disadvantage is that the code generator is somewhat
        !           719: more complicated.  Since it's not clear why the low-level
        !           720: representation needs to be typable, we may change the representation
        !           721: of constructors in the lambda language.
        !           722: 
        !           723: A structure resembles a record of components;
        !           724: in the lambda language, it is treated exactly as a RECORD.  There
        !           725: are no special forms in the lambda language for structures and functors.
        !           726: Functors are treated as functions from structures (RECORDs) to structures.
        !           727: Actually, a structure definition is not just a record; it is an expression
        !           728: that \fIevaluates\fP to a record.
        !           729: 
        !           730: Structure $A$ may refer to an element of structure $B$; in the lambda language
        !           731: this will mean that $B$ is a free variable of the expression representing
        !           732: $A$.  To keep things simple, it is important to eliminate all free variables
        !           733: from an expression before code is generated for it; so the variable $B$ will
        !           734: be lambda-bound at top level in the definition of $A$, and the runtime system
        !           735: will be instructed to apply the resulting pseudo-functor to the structure
        !           736: $B$ to initialize it.
        !           737: This is a form of automatic,
        !           738: compiler-controlled link-loading.  For a link-loading scheme
        !           739: it is extremely simple; our runtime structure-manager and link-loader
        !           740: is less than a page of ML code.
        !           741: .NH
        !           742: Code generation
        !           743: .LP
        !           744: The back end of the compiler is organized as the composition of
        !           745: functors; these functors are most naturally described starting at the end and working
        !           746: toward the front.  The last phase is the back-patching of jumps and
        !           747: other relative addresses in a machine-language program.  Relative
        !           748: jump instructions on many machines are of different sizes depending on the distance
        !           749: jumped, and several iterations of estimating jump sizes may be
        !           750: required before a fixed point is found[19].
        !           751: This is handled in a machine-independent way by the Backpatch
        !           752: functor:
        !           753: .DS
        !           754: .ft CW
        !           755: .vs 13
        !           756: signature RelativeAddresses =
        !           757:   sig  type JumpKind
        !           758:        val sizeJump : JumpKind -> int
        !           759:        val emitJump : (int -> unit) -> JumpKind -> unit
        !           760:   end
        !           761: .ft R
        !           762: .vs 17
        !           763: .DE
        !           764: 
        !           765: .DS
        !           766: .ft CW
        !           767: .vs 13
        !           768: signature BackPatch =
        !           769:   sig
        !           770:     type Label
        !           771:     val newlabel : unit -> Label
        !           772:     type JumpKind
        !           773:     val emitbyte : int -> unit
        !           774:     val align : unit -> unit
        !           775:     val define : Label -> unit
        !           776:     val jump : JumpKind*Label -> unit
        !           777:      . . .
        !           778:   end
        !           779: 
        !           780: functor Backpatch( Rel : RelativeAddresses ) : BackPatch
        !           781:   = struct
        !           782:         . . .
        !           783:     end
        !           784: .vs 17
        !           785: .ft R
        !           786: .DE
        !           787: For a given machine with a particular kind of relative address, one
        !           788: builds a small structure explaining what size of relative address can
        !           789: reach what distance, and how to emit code for that sort of relative
        !           790: address.  For example, on the Vax, a short conditional branch
        !           791: takes 2 bytes, but a medium-size conditional
        !           792: branch requires 6 bytes (a short conditional branch around a longer
        !           793: unconditional jump), and a longer conditional jump takes 8 bytes.
        !           794: 
        !           795: When the Backpatch functor is applied to a structure of type RelativeAddresses,
        !           796: a new structure is created that understands how to
        !           797: backpatch the code for a particular machine.  A code generator for
        !           798: that machine can make use of the \fIemit\fP and \fIlabel\fP primitives provided
        !           799: by the specialized Backpatch structure.
        !           800: 
        !           801: The signature VaxCode (not shown here)
        !           802: provides an interface to a useful subset of the
        !           803: Vax instruction set.  The various instructions and addressing modes
        !           804: are defined in this signature.  There are two structures that
        !           805: implement the VaxCode signature in different ways:  one that
        !           806: generates machine code when the instructions in the signature are
        !           807: called for, and one that generates assembly code.
        !           808: The VaxMcode structure matches the VaxCode signature, and
        !           809: makes use of the Backpatch functor:
        !           810: .DS
        !           811: .ft CW
        !           812: .vs 13
        !           813: structure VaxMcode : VaxCode =
        !           814: struct
        !           815:    structure R : RelativeAddresses =
        !           816:     struct 
        !           817:        . . . functions to compile relative jumps, etc.
        !           818:     end
        !           819:    structure B = Backpatch(R)
        !           820:    datatype AddressMode = . . .
        !           821:    fun movl (src : AddressMode, dst: AddressMode) = . . .
        !           822: end
        !           823: .vs 17
        !           824: .ft R
        !           825: .DE
        !           826: There is a peephole optimization module that operates on abstract Vax
        !           827: instructions.  It is implemented as a functor taking any VaxCode
        !           828: structure into an ``optimized'' VaxCode structure:
        !           829: .DS
        !           830: .ft CW
        !           831: .vs 13
        !           832: functor Peephole ( V : VaxCode ) : VaxCode = struct . . . end
        !           833: structure OptVaxMcode = Peephole(VaxMcode)
        !           834: .vs 17
        !           835: .ft R
        !           836: .DE
        !           837: If some other structure asks the module OptVaxMcode to produce
        !           838: a particular instruction sequence, it may be that some optimized version of
        !           839: the sequence will be generated instead.
        !           840: 
        !           841: The code generator translates programs from
        !           842: lambda language into machine code by
        !           843: means of an abstract machine intermediate form.  The abstract machine
        !           844: is similar in spirit to Cardelli's[20].
        !           845: The abstract machine interface is written as a signature in ML:
        !           846: .DS
        !           847: .ft CW
        !           848: .vs 13
        !           849: signature machine =
        !           850:   sig type Label
        !           851:       val select : int -> unit
        !           852:       val apply : unit -> unit
        !           853:       val tailrecur : int -> unit
        !           854:       val startrecord : int -> unit
        !           855:       val endrecord : unit -> unit
        !           856:        . . .
        !           857:  end;
        !           858: .vs 17
        !           859: .ft R
        !           860: .DE
        !           861: Each of these functions, when called, generates machine code for the
        !           862: corresponding operation.
        !           863: 
        !           864: There is a functor Vax that transforms a VaxCode structure into
        !           865: a Machine structure:
        !           866: .DS
        !           867: .ft CW
        !           868: .vs 13
        !           869: functor Vax(C : VaxCode) : Machine =
        !           870: struct
        !           871:   fun select j = C.movl(Displace(r0,4*j),Direct(r0))
        !           872:       .
        !           873:       .
        !           874:       .
        !           875: end
        !           876: 
        !           877: structure VaxM = Vax(OptVaxMcode)
        !           878: structure VaxA = Vax(OptVaxAcode)
        !           879: .vs 17
        !           880: .ft R
        !           881: .DE
        !           882: This Vax functor can be applied to the structure OptVaxMcode, which will
        !           883: result in an abstract machine that generates machine code for the
        !           884: Vax; or to the structure OptVaxAcode, which generates assembly
        !           885: language.  We have a similar set of structures and functors that generate
        !           886: code for the Motorola MC68020 architecture.
        !           887: 
        !           888: Finally we can take this implementation of the abstract Machine, and
        !           889: apply the Codegen functor to it:
        !           890: .DS
        !           891: .ft CW
        !           892: .vs 13
        !           893: functor Codegen(M : Machine) =
        !           894: struct
        !           895:    fun codegen(APP(a,b)) = (codegen a; codegen b; M.apply())
        !           896:      | codegen(FN(v,b)) = . . .
        !           897:     \fI(The codegen function is not quite as simple as this, of course.)\fP
        !           898: end
        !           899: 
        !           900: structure VaxMCodegen = Codegen(VaxM)
        !           901: structure M68MCodegen = Codegen(M68M)
        !           902: .vs 17
        !           903: .ft R
        !           904: .DE
        !           905: Now the function VaxMCodegen.codegen, when given a lambda expression,
        !           906: generates optimized, backpatched machine code for the Vax.
        !           907: 
        !           908: This arrangement of functors is quite satisfactory.  However, the
        !           909: details of the Machine signature might 
        !           910: be changed to make
        !           911: it less like a stack machine.  This might improve the generated
        !           912: code, especially for architectures that are not naturally
        !           913: stack-oriented.  We are considering a re-implementation of the
        !           914: code generator using continuation-passing style[17]
        !           915: and Orbit[18].
        !           916: .NH 2
        !           917: Pattern-matching in code generation
        !           918: .LP
        !           919: Many
        !           920: modern code generators are written in the form of tree-pattern matchers[21,\|22].
        !           921: By writing the lambda-calculus 
        !           922: tree data structure as an ML datatype (with constructors),
        !           923: the tree pattern matching can be directly specified as an ML function.
        !           924: There is one pattern for each constructor, e.g.
        !           925: .DS
        !           926: .ft CW
        !           927: .vs 13
        !           928: | APP(f,a) => (gen(f); gen(a); machine.apply())
        !           929: .vs 17
        !           930: .ft R
        !           931: .DE
        !           932: but there are also cases that match certain ``idioms;'' like
        !           933: .B let
        !           934: expressions, which are represented as the application of a lambda-function:
        !           935: .DS
        !           936: .ft CW
        !           937: .vs 13
        !           938: | APP(FN(w,b), a) => \fIgenerate code for \fBlet \fIw\fB=\fIa\fB in \fIb\fP
        !           939: .vs 17
        !           940: .ft R
        !           941: .DE
        !           942: There are about a dozen simple cases that handle small patterns with just one
        !           943: constructor, and a dozen more complicated cases that recognize idioms.
        !           944: 
        !           945: Some of these cases are applicable in only certain situations.  For example,
        !           946: the pattern \f5APP(VAR\ w,\ a)\fP can be compiled without a closure only if
        !           947: \fIw\fP refers to a ``known'' function.  One might like to write the
        !           948: clause for this case as:
        !           949: .DS
        !           950: .ft CW
        !           951: .vs 13
        !           952: | APP(VAR w, a) => if knownfunc(w) then applyknown(w,a) else ???
        !           953: .vs 17
        !           954: .ft R
        !           955: .DE
        !           956: The problem is that if $w$ is not a known function, then this pattern-match
        !           957: is not useful, and the pattern \f5APP(f,a)\fP should be matched.  By
        !           958: the time the else clause is reached, it is impossible to transfer control
        !           959: to the \f5APP(f,a)\fP clause.
        !           960: 
        !           961: One solution to this problem is to attach boolean conditions to patterns,
        !           962: as is done in Miranda[23]:
        !           963: .DS
        !           964: .ft CW
        !           965: .vs 13
        !           966: | APP(VAR w, a) when knownfunc(w) =>
        !           967: .vs 17
        !           968: .ft R
        !           969: .DE
        !           970: Though would add some ``syntactic sludge'' to the language, it is worth
        !           971: considering as a future extension.
        !           972: .NH 2
        !           973: Flat versus linked environments
        !           974: .LP
        !           975: An
        !           976: .I environment
        !           977: is a mapping from some sort of
        !           978: .I identifiers 
        !           979: to some sort of
        !           980: .I bindings,
        !           981: on which the fundamental operations are
        !           982: .EQ
        !           983: update ~~  ":" ~~ roman "Env" ~ times ~ roman "Ide" ~ times ~ roman "Bdg" ~~ 
        !           984: -> ~~roman "Env"
        !           985: .EN
        !           986: .EQ
        !           987: overlay ~~ ":" ~~ roman "Env" ~ times ~ roman "Env" ~~ -> ~~ roman "Env"
        !           988: .EN
        !           989: .EQ
        !           990: access ~~ ":" ~~ roman "Env" ~ times ~ roman "Ide" ~~ -> ~~roman "Bdg"
        !           991: .EN
        !           992: The update function adds an identifier and binding to an
        !           993: environment; the access function looks up an identifier; and the
        !           994: overlay function
        !           995: adds all of the bindings of one environment to
        !           996: another.  The overlay function might be used to implement the ML
        !           997: .B open
        !           998: primitive, for example.
        !           999: 
        !          1000: Environments occur both in the compiler, where the identifiers are
        !          1001: those of the source text, and the bindings are types and other
        !          1002: information; and at runtime, in the form of function closures.
        !          1003: A function closure, in implementations of lambda calculus, is a pair
        !          1004: consisting of a function-code and the values to be associated with
        !          1005: all of its free variables.
        !          1006: Though these two kinds of environments are used in different ways,
        !          1007: they share many of the same problems of implementation.
        !          1008: 
        !          1009: We know of no data structure that implements environments with both a very
        !          1010: fast  \fIaccess\fP function and a very fast \fIoverlay\fP function.
        !          1011: For compile-time environments,
        !          1012: a fast \fIaccess\fP can be achieved by using a hash table; but
        !          1013: then to overlay one hash table onto another takes time proportional
        !          1014: to the size of the smaller table.  A fast \fIoverlay\fP can be
        !          1015: achieved by representing environments as trees of \fIoverlay\fP
        !          1016: operations, but then \fIaccess\fP requires a tree search.
        !          1017: In the compiled code, closures can be represented as flat
        !          1018: or linked structures.  A flat closure[2]
        !          1019: is just a vector with one slot
        !          1020: for each free variable, containing the value to which it is bound.
        !          1021: A linked closure[17]
        !          1022: has the binding of one free variable at the
        !          1023: innermost level, along with a pointer to the closure for the
        !          1024: enclosing scope.
        !          1025: Flat closures have a fast \fIaccess\fP, but to build the vector takes
        !          1026: time proportional to the number of free variables in the function.
        !          1027: Linked closures can be built quickly: since the enclosing linked
        !          1028: closure has already been built, just one \fIcons\fP operation is required.
        !          1029: However, linked closures have a slower \fIaccess\fP function.
        !          1030: 
        !          1031: The question of whether to make \fIaccess\fP or \fIoverlay\fP fast
        !          1032: is a difficult one, since examples can be found to support either
        !          1033: decision.  Consequently, we have adopted a compromise: a mixed
        !          1034: representation will be used, so that the time required for a typical
        !          1035: sequence of \fIaccess\fP and \fIoverlay\fP operations will probably be
        !          1036: smaller than if either the flat (hash-table)  or linked (tree)
        !          1037: representation were used.
        !          1038: 
        !          1039: In the front end, environments are represented as linked lists of
        !          1040: hash tables.  The operation of \fBopen\fPing a structure just adds
        !          1041: the (already-built) hash-table for that structure onto the list, but
        !          1042: a typical \fIupdate\fP operation just adds a binding to an existing
        !          1043: hash table.
        !          1044: In the generated code, closures are represented as trees\(em
        !          1045: a compromise between flat and linked closures.  For each free variable
        !          1046: in a closure, the back end heuristically decides whether to add
        !          1047: another slot to the closure for this variable, or to access this
        !          1048: variable through the (already existing) linked closure.  The former
        !          1049: is more expensive to build, but cheaper to use.  When we have more
        !          1050: experience with this compiler, we hope to measure the relative
        !          1051: efficiencies of flat and linked closures in practice.
        !          1052: .NH 2
        !          1053: Data structures in the run-time system
        !          1054: .LP
        !          1055: The representations of ML structures in the compiled code are designed
        !          1056: to be simple and general.  We took care to avoid arbitrary restrictions
        !          1057: on the size of the address space, or of individual objects.  Except for
        !          1058: .B ref
        !          1059: cells, the graph of runtime objects is acyclic\(em
        !          1060: even mutually recursive closures are represented without cycles.
        !          1061: 
        !          1062: Because of ML's polymorphic type system, all values must be the same size.
        !          1063: We take this size to be one 32-bit word (this discussion of runtime
        !          1064: data structures is specific to implementations on 32-bit computers).
        !          1065: Any value that is naturally represented in more than one word will
        !          1066: be manipulated as a (one-word) pointer to a (multi-word) object;
        !          1067: this is known as a
        !          1068: .I boxed
        !          1069: value.  An
        !          1070: .I unboxed
        !          1071: value is one which fits in one word and is not a pointer.
        !          1072: The garbage collector must be able to tell which words are boxed (pointers)
        !          1073: and which words are not.  We use the low-order bit of the word as the
        !          1074: indicator.  Since all of our pointers (on byte-addressable machines)
        !          1075: point to word boundaries, the low-order bit of any pointer is zero.
        !          1076: Unboxed values are  represented with their low-order bits turned on.
        !          1077: Unboxed values include constructor tags,
        !          1078: one-character strings, and integers (the integer $k$ is
        !          1079: represented as $2k+1$).
        !          1080: 
        !          1081: All records in the runtime system contain an integral number of 32-bit words,
        !          1082: and are prefaced by a descriptor describing their size and type.  There are two
        !          1083: types of records: those that contain pointers (and unboxed values), and
        !          1084: those that contain no pointers.  All ML tuples and closures are of the
        !          1085: former variety (Figure 1);
        !          1086: strings and machine-code fragments are of the latter.
        !          1087: The fact that machine-code fragments contain no pointers is a consequence
        !          1088: of the fact that the back-end code generator is never applied to any
        !          1089: lambda-expression with free variables.
        !          1090: .KF
        !          1091: .PS
        !          1092: boxht = .2; boxwid = .35; a = .4
        !          1093: [down; A: box; B: box; C: box; D: box
        !          1094: "desc" at A.c
        !          1095: spline -> from B.c right a then up a/2 then right a+a
        !          1096: "78" at C.c
        !          1097: spline -> from D.c right a then down a/4 then right a 
        !          1098: line <- from B.w left a
        !          1099: "Figure 1.  A triple of two" "pointers and an integer" \
        !          1100: at 2.2 between A.c and D.c
        !          1101: ]
        !          1102: move right 4*a
        !          1103: [down; A: box; B: box; C: box; D: box; E: box
        !          1104: "desc" at A.c
        !          1105: line <- from B.w left a
        !          1106: line <- from C.w left a
        !          1107: line <- from D.w left a
        !          1108: spline -> from B.c right a then up a/2 then right a+a
        !          1109: spline -> from C.c right a then up a/4 then right a
        !          1110: spline -> from D.c right a then down a/2 then right a+a
        !          1111: "62" at E.c
        !          1112: "Figure 2.  Three mutually recursive" \
        !          1113: "functions with one free variable" \
        !          1114: at 2.2 between A.c and D.c
        !          1115: ]
        !          1116: .PE
        !          1117: .KE
        !          1118: 
        !          1119: There are several different representations used for constructors,
        !          1120: just as in Cardelli's compiler[2].
        !          1121: Constructors that carry values are usually represented as pairs of
        !          1122: (tag, value).  Constant constructors (link \fBnil\fP) are represented
        !          1123: as small unboxed integers.  Value-carrying constructors can be
        !          1124: represented completely transparently if they carry an always-boxed value,
        !          1125: and there are no other value-carrying constructors in the datatype
        !          1126: (like \fBcons\fP), or if there is only one constructor in the datatype.
        !          1127: Finally, \fBref\fP cells look like one-word records.
        !          1128: 
        !          1129: Earlier ML compilers[20]
        !          1130: represented mutually recursive functions as a set of closures that point
        !          1131: to each other in a cyclic data structure.  We wanted to avoid gratuitous
        !          1132: circularity, because many garbage collection algorithms are more efficient
        !          1133: on acyclic data structures[24].
        !          1134: We represent mutually recursive functions with just one closure 
        !          1135: record that has
        !          1136: several function-part entries (Figure 2).  To represent function $i$,
        !          1137: one just uses a pointer to the $i sup roman "th"$ field.  If the
        !          1138: $i sup roman "th"$ function wants to refer to the closure for the $j sup
        !          1139: roman "th"$ function, it just adds $ j - i $ to the pointer value for
        !          1140: its own closure.  This creates no cycles of pointers; although there
        !          1141: is an implicit cycle because each segment of machine code can refer
        !          1142: to the machine's registers, which in turn point to the closure.
        !          1143: 
        !          1144: Several machine-code fragments may be stored in the same record.
        !          1145: Because the record may be moved by the garbage collector, references within
        !          1146: the block must be by relative addresses.  Many computers have PC-relative
        !          1147: addressing modes to facilitate this; for other machines, a pointer to
        !          1148: the currently-executing block would have to be kept in a register.
        !          1149: A typical record contains all of the function and string-literal fragments
        !          1150: for an entire ML structure (module).  In order not to confuse the garbage
        !          1151: collector, just before the start of each fragment within the record
        !          1152: is a (relative) pointer back to the beginning of the record, where
        !          1153: the record's descriptor may be found; and all fragments must begin
        !          1154: on even byte addresses, so that pointers to fragments within the record
        !          1155: won't look like unboxed values.
        !          1156: .NH 2
        !          1157: Fast consing
        !          1158: .LP
        !          1159: The operation of allocating and initializing a record\(em
        !          1160: known in Lisp
        !          1161: as \fIcons\fP\(em
        !          1162: is a fundamental operation in ML.  The creation of a tuple value,
        !          1163: the application of a constructor, and the creation of a function-closure
        !          1164: all rely on this operation.  Our implementation strives to make this operation
        !          1165: as fast as possible.
        !          1166: 
        !          1167: We use a copying (and compacting) garbage collection algorithm.  One
        !          1168: consequence of this is that the free space available for storage
        !          1169: allocation is a contiguous region of memory.  We reserve a register
        !          1170: to point at the lowest address in the free region.  To allocate an
        !          1171: $n$-word initialized record, the $n sup roman "th"$ word is stored
        !          1172: first, followed by the other words and the descriptor.  Then the
        !          1173: free-space register is incremented by $n$ words.
        !          1174: 
        !          1175: At some point the free space is exhausted.  We arrange to have an
        !          1176: inaccessible page of the address space immediately beyond the free
        !          1177: space region.  When there is an attempt to allocate a record that
        !          1178: crosses into this page, a page fault occurs; and since the page is
        !          1179: marked as inaccessible, the operating system transfers control to the
        !          1180: ML runtime system instead of handling the page fault itself.  Because
        !          1181: the last word of a record is stored first, the creation of
        !          1182: any record that crosses
        !          1183: the boundary into the inaccessible region will fault at the beginning
        !          1184: (where it is easy to patch things up and recover) rather than halfway
        !          1185: through its initialization.  (A special case is required for those
        !          1186: records larger than a page; but these are rare and identifiable at
        !          1187: compile-time.)
        !          1188: Thus, the creation of an $n$-word record takes just $n$ stores 
        !          1189: (plus one for the descriptor) and an
        !          1190: add instruction; the $n$ stores are required in any case to do the
        !          1191: initialization.  This is an extremely low-overhead
        !          1192: .I cons
        !          1193: operation.
        !          1194: 
        !          1195: A fast
        !          1196: .I cons
        !          1197: is not helpful if garbage collection is slow.  When physical memory
        !          1198: is large, then the cost of copying garbage collection, amortized over
        !          1199: the
        !          1200: .I cons
        !          1201: operations, can come to less than one instruction per \fIcons\fP $~~$[25].
        !          1202: We are using a variant of generational garbage collection[24,\|26,\|27]
        !          1203: that promises to be very fast even in moderate-size memories.
        !          1204: 
        !          1205: In writing our compiler, we have aimed for clarity and
        !          1206: straightforwardness; we have eschewed the coding tricks that enable
        !          1207: programmers to avoid using dynamically allocated memory.  As a
        !          1208: result, our compiler probably \fIcons\fPes a lot more than other compilers;
        !          1209: it is fortunate that we have made
        !          1210: .I cons
        !          1211: so fast.
        !          1212: .NH 2
        !          1213: Exception Handling
        !          1214: .LP
        !          1215: Exception names in Standard ML behave a bit like constructors, except that
        !          1216: they are generatively (dynamically) defined.  The statements
        !          1217: .DS
        !          1218: .ft CW
        !          1219: .vs 13
        !          1220: exception E : int and R : real
        !          1221: \fIand\fP
        !          1222:  ... handle E with 0 => 10
        !          1223:                  | i => 2*i
        !          1224:          || R with 0.0 => 10
        !          1225:                  | r => 2*floor(r)
        !          1226: .vs 17
        !          1227: .ft R
        !          1228: .DE
        !          1229: are, if one considers $E$ and $R$ to be constructors,
        !          1230: a bit like the (hypothetical) statements
        !          1231: .DS
        !          1232: .ft CW
        !          1233: .vs 13
        !          1234: datatype exception =  . . . E of int | R of real . . .
        !          1235: \fIand\fP
        !          1236:  ... handle E(0) => 10
        !          1237:           | E(i) => 2*i
        !          1238:           | R(0.0) => 10
        !          1239:           | R(r) => 2*floor(r)
        !          1240: .vs 17
        !          1241: .ft R
        !          1242: .DE
        !          1243: Generative data constructors cannot be given small integer tags like ordinary
        !          1244: constructors.  The tag for the constructor \f5E\fP must be constructed
        !          1245: at run time.  Its representation will be that of a
        !          1246: .B ref
        !          1247: cell:  \f5ref("E")\fP.  Any match applied to the exception type can
        !          1248: compare the address of the ref cell against the tag of the exception
        !          1249: object; and if the exception is propagated out of the user program into
        !          1250: the bootstrap system, then the name of the exception can be determined
        !          1251: (for debugging) by dereferencing the tag.
        !          1252: 
        !          1253: The ML statement \f5raise e with v\fP is compiled into the lambda language
        !          1254: as \f5RAISE(CON(e,v))\fP; and the handle statement applies a match
        !          1255: (with exceptions as constructors) to the exception raised.  That is,
        !          1256: \f5HANDLE(e,h)\fP executes expression \f5e\fP; if any exception is raised
        !          1257: in \f5e\fP, then the handler \f5h\fP (which is a 
        !          1258: \fImatch\fP as above) is applied to that exception.
        !          1259: If \f5e\fP has type $t$, then \f5h\fP must have type $"exception" ^ -> ^ t$.
        !          1260: 
        !          1261: The implementation of \f5HANDLE(e,h)\fP in the code generator is
        !          1262: straightforward.  An exception handler is pushed on the stack;
        !          1263: expression \f5e\fP is executed; if no exception is raised, the
        !          1264: exception handler is popped from the stack.  An exception handler
        !          1265: has two components: the address of the machine-code for the match
        !          1266: \f5h\fP, and the address (on the stack) of the enclosing
        !          1267: exception handler.  Thus, entering the scope of a
        !          1268: .B handle
        !          1269: takes about two instructions, as does leaving its scope.
        !          1270: 
        !          1271: To raise an exception \f5E(v)\fP, first the constructor-expression
        !          1272: \f5E(v)\fP is evaluated and put into a register;
        !          1273: then the stack pointer is reset to the position
        !          1274: of the current exception handler; the current handler is removed; and
        !          1275: control passes to the text address found in the current handler.
        !          1276: The match \f5h\fP found at this address is thereby applied to the
        !          1277: exception \f5E(v)\fP previously evaluated.  Raising an exception
        !          1278: therefore takes just three or four instructions, not including the
        !          1279: match that may have to be done upon arrival at the handler to determine
        !          1280: which exception was raised.
        !          1281: 
        !          1282: The constructor view of exception
        !          1283: names is not just a convenience in implementation.
        !          1284: After some discussion in the Standard ML community, exceptions-as-constructors
        !          1285: have been incorporated into the language.
        !          1286: We no longer need 
        !          1287: an awkward notation (using \f5|\fP and \f5||\fP) for exception matches;
        !          1288: and such things as re-raising an arbitrary expression are now possible:
        !          1289: .DS
        !          1290: .ft CW
        !          1291: .vs 13
        !          1292: handle e => (clean up; raise e)    \fI(e is a variable of type ``exception'')\fP
        !          1293: .vs 17
        !          1294: .ft R
        !          1295: .DE
        !          1296: .NH
        !          1297: The runtime system
        !          1298: .LP
        !          1299: The ML standard library is implemented as a single module in the runtime
        !          1300: system.  The library functions are mostly written in ML, with some
        !          1301: references to primitive functions written in assembly language.
        !          1302: All references to library functions are treated as references
        !          1303: to fields of this structure.  Thus, a typical expression might have
        !          1304: only one free variable\(em
        !          1305: the library structure itself.  In order to ensure that the code generator
        !          1306: is given only closed expressions, this variable is lambda-bound; the
        !          1307: machine code resulting from this closed expression can then be treated as
        !          1308: a function whose argument will be the standard library.  Access to user-defined
        !          1309: structures is handled similarly.
        !          1310: 
        !          1311: This general plan is used both for the ``bootstrap'' system (in which
        !          1312: the machine code for each
        !          1313: structure is written to an external file), and
        !          1314: the ``interactive'' system
        !          1315: (in which the code for each structure or expression is
        !          1316: put in memory and executed).  In the interactive system, each expression
        !          1317: when compiled is represented as a function whose argument is the
        !          1318: current top-level environment, and whose result is the new binding.
        !          1319: Thus, the interface between the compiler and the runtime system is
        !          1320: very narrow and clean.
        !          1321: .NH
        !          1322: Conclusion
        !          1323: .LP
        !          1324: Standard ML is a complicated language.  We manage the complexity by using
        !          1325: a sequence of well-defined intermediate representations: tokens, abstract
        !          1326: syntax, lambda-calculus, abstract machine, symbolic machine instructions.
        !          1327: The datatypes and signatures of ML allow these interfaces to be cleanly
        !          1328: specified, which is a great help.
        !          1329: 
        !          1330: Lexical analysis\(em
        !          1331: the translation from source programs to tokens\(em
        !          1332: we make as simple as possible (760 lines).  All complications of recognizing
        !          1333: constructors and infix operators are handled in the parser, which
        !          1334: has better access to the environment mechanisms.
        !          1335: 
        !          1336: The translation from tokens to
        !          1337: fully typed abstract syntax is the most complicated phase of our
        !          1338: compiler (4500 lines).
        !          1339: Because compile-time environments have some effect on parsing
        !          1340: (constructors and infix operators), we merged the compilation
        !          1341: of syntax, scopes, and types.
        !          1342: This is where most of the complexity of the module features
        !          1343: appear.  But since structures and functors impinge on scopes and types
        !          1344: in fundamental ways, we decided to build them into this phase from the
        !          1345: beginning.
        !          1346: 
        !          1347: Since the abstract syntax is so well-digested and complete, the next phase\(em
        !          1348: translation into lambda-calculus\(em
        !          1349: is relatively simple (1480 lines).
        !          1350: The major task of this phase is the compilation of patterns into
        !          1351: decision trees.
        !          1352: 
        !          1353: A simple lambda-calculus representation of programs allows a simple
        !          1354: and consistent code generator that can recognize idioms in lambda calculus
        !          1355: and optimize them (750 lines).
        !          1356: The lambda-calculus can also be useful for other kinds
        !          1357: of analysis, like in-line function expansion.
        !          1358: The back end needs to know very little about front-end data structures;
        !          1359: thus, we found it simplest to have an environment manager for the back
        !          1360: end that is completely separate from the front-end environment manager.
        !          1361: Back-end environments  map applied occurrences of variables
        !          1362: in the lambda-calculus to their bindings, just as the front-end environment
        !          1363: does that job for abstract syntax trees.
        !          1364: 
        !          1365: The last two phases\(em
        !          1366: translation from abstract machine instructions into
        !          1367: symbolic (Vax or MC68020) instructions, and the translation of those
        !          1368: into backpatched byte sequences\(em
        !          1369: are the only machine-dependent phases.
        !          1370: These phases have a largely straightforward structure (950 lines per machine).
        !          1371: In the current
        !          1372: implementation, our code generators do not make adequate use of registers,
        !          1373: and we may redesign the abstract machine interface.
        !          1374: 
        !          1375: Finally, the runtime system is kept very simple (450 lines of C, 
        !          1376: including the garbage collector; 300 lines
        !          1377: of assembly language).
        !          1378: As a statically-typed
        !          1379: language, ML by its nature is oriented to compile-time analysis rather
        !          1380: than complicated runtime systems; and we push this as far as possible.
        !          1381: Our standard library is mostly implemented in ML
        !          1382: (750 lines), and those parts
        !          1383: implemented in assembly language are arranged to match a Standard ML
        !          1384: signature.  We pay particular attention to fast allocation of dynamic
        !          1385: storage, since the features of ML (constructors, function closures) necessitate
        !          1386: an efficient memory allocator and garbage collector.
        !          1387: 
        !          1388: Our compiler is concise, efficient, and (we think) well-designed.
        !          1389: It should prove to be a useful tool for the functional programming community.
        !          1390: .sp 2
        !          1391: .nr PS 10
        !          1392: .nr VS 12
        !          1393: .LP
        !          1394: .]<
        !          1395: .ds [F 1
        !          1396: .]-
        !          1397: .ds [A P. J. Landin
        !          1398: .ds [U landin
        !          1399: .ds [K ISWIM
        !          1400: .ds [T The next 700 programming languages
        !          1401: .ds [J Comm. ACM
        !          1402: .ds [V 9
        !          1403: .ds [N 3
        !          1404: .ds [P 157-166
        !          1405: .nr [P 1
        !          1406: .ds [D 1966
        !          1407: .nr [T 0
        !          1408: .nr [A 0
        !          1409: .nr [O 0
        !          1410: .][ 1 journal-article
        !          1411: .ds [F 2
        !          1412: .]-
        !          1413: .ds [A Luca Cardelli
        !          1414: .ds [U cardelli
        !          1415: .ds [T Compiling a functional language
        !          1416: .ds [I ACM
        !          1417: .ds [J 1984 Symp. on LISP and Functional Programming
        !          1418: .ds [P 208-217
        !          1419: .nr [P 1
        !          1420: .ds [D 1984
        !          1421: .nr [T 0
        !          1422: .nr [A 0
        !          1423: .nr [O 0
        !          1424: .][ 1 journal-article
        !          1425: .ds [F 3
        !          1426: .]-
        !          1427: .ds [K milner84
        !          1428: .ds [K SML
        !          1429: .ds [U milner
        !          1430: .ds [A Robin Milner
        !          1431: .ds [T A proposal for Standard ML
        !          1432: .ds [I ACM
        !          1433: .ds [J ACM Symposium on LISP and Functional Programming
        !          1434: .ds [P 184-197
        !          1435: .nr [P 1
        !          1436: .ds [D 1984
        !          1437: .nr [T 0
        !          1438: .nr [A 0
        !          1439: .nr [O 0
        !          1440: .][ 1 journal-article
        !          1441: .ds [F 4
        !          1442: .]-
        !          1443: .ds [K milner85
        !          1444: .ds [K SML
        !          1445: .ds [u milner
        !          1446: .ds [A Robin Milner
        !          1447: .ds [T The Standard ML Core Language
        !          1448: .ds [J Polymorphism
        !          1449: .ds [V 2
        !          1450: .ds [N 2
        !          1451: .ds [D October 1985
        !          1452: .nr [T 0
        !          1453: .nr [A 0
        !          1454: .nr [O 0
        !          1455: .][ 1 journal-article
        !          1456: .ds [F 5
        !          1457: .]-
        !          1458: .ds [A David MacQueen
        !          1459: .ds [T Modules for Standard ML
        !          1460: .ds [I ACM
        !          1461: .ds [J Proc. 1984 ACM Conf. on LISP and Functional Programming
        !          1462: .ds [C Austin, Texas
        !          1463: .ds [P 198-207
        !          1464: .nr [P 1
        !          1465: .ds [D 1984
        !          1466: .nr [T 0
        !          1467: .nr [A 0
        !          1468: .nr [O 0
        !          1469: .][ 1 journal-article
        !          1470: .ds [F 6
        !          1471: .]-
        !          1472: .ds [A David MacQueen
        !          1473: .ds [T Modules for Standard ML
        !          1474: .ds [J Polymorphism
        !          1475: .ds [V 2
        !          1476: .ds [N 2
        !          1477: .ds [D October 1985
        !          1478: .nr [T 0
        !          1479: .nr [A 0
        !          1480: .nr [O 0
        !          1481: .][ 1 journal-article
        !          1482: .ds [F 7
        !          1483: .]-
        !          1484: .ds [A R. Burstall
        !          1485: .as [A ", D. MacQueen
        !          1486: .as [A ", and D. Sannella
        !          1487: .ds [T Hope: an Experimental Applicative Language
        !          1488: .ds [D 1980
        !          1489: .ds [P 136-143
        !          1490: .nr [P 1
        !          1491: .ds [K burstall80
        !          1492: .ds [C Stanford
        !          1493: .ds [J Proceedings of the 1980 LISP Conference
        !          1494: .nr [T 0
        !          1495: .nr [A 0
        !          1496: .nr [O 0
        !          1497: .][ 1 journal-article
        !          1498: .ds [F 8
        !          1499: .]-
        !          1500: .ds [A David C. J. Matthews
        !          1501: .ds [T The Poly manual
        !          1502: .ds [J SIGPLAN Notices
        !          1503: .ds [D September 1985
        !          1504: .nr [T 0
        !          1505: .nr [A 0
        !          1506: .nr [O 0
        !          1507: .][ 1 journal-article
        !          1508: .ds [F 9
        !          1509: .]-
        !          1510: .ds [A David C. J. Matthews
        !          1511: .ds [T An implementation of Standard ML in Poly
        !          1512: .ds [D May 1986
        !          1513: .nr [T 0
        !          1514: .nr [A 0
        !          1515: .nr [O 0
        !          1516: .][ 0 other
        !          1517: .ds [F 10
        !          1518: .]-
        !          1519: .ds [A G. Cousineau
        !          1520: .as [A ", P. L. Curien
        !          1521: .as [A ", and M. Mauny
        !          1522: .ds [T The Categorical Abstract Machine
        !          1523: .ds [P 50-64
        !          1524: .nr [P 1
        !          1525: .ds [I Springer-Verlag
        !          1526: .ds [B Functional Programming Languages and Computer Architecture, LNCS Vol 201
        !          1527: .ds [E J. P. Jouannaud
        !          1528: .ds [D 1985
        !          1529: .nr [T 0
        !          1530: .nr [A 0
        !          1531: .nr [O 0
        !          1532: .][ 3 article-in-book
        !          1533: .ds [F 11
        !          1534: .]-
        !          1535: .ds [K milner78
        !          1536: .ds [A Robin Milner
        !          1537: .ds [U milner
        !          1538: .ds [T A Theory of Type Polymorphism in Programming
        !          1539: .ds [J J. CSS
        !          1540: .ds [V 17
        !          1541: .ds [P 348-375
        !          1542: .nr [P 1
        !          1543: .ds [D 1978
        !          1544: .nr [T 0
        !          1545: .nr [A 0
        !          1546: .nr [O 0
        !          1547: .][ 1 journal-article
        !          1548: .ds [F 12
        !          1549: .]-
        !          1550: .ds [A Luca Cardelli
        !          1551: .ds [U cardelli
        !          1552: .ds [T Basic polymorphic typechecking
        !          1553: .ds [J Polymorphism
        !          1554: .ds [V 2
        !          1555: .ds [N 1
        !          1556: .ds [D January 1985
        !          1557: .nr [T 0
        !          1558: .nr [A 0
        !          1559: .nr [O 0
        !          1560: .][ 1 journal-article
        !          1561: .ds [F 13
        !          1562: .]-
        !          1563: .ds [A R. S. Boyer
        !          1564: .as [A " and J Moore
        !          1565: .ds [U boyer
        !          1566: .ds [T The sharing of structure in theorem-proving programs
        !          1567: .ds [B Machine Intelligence 7
        !          1568: .ds [E B. Meltzer
        !          1569: .ds [E D. Michie
        !          1570: .ds [I Edinburgh University Press
        !          1571: .ds [D 1972
        !          1572: .nr [T 0
        !          1573: .nr [A 0
        !          1574: .nr [O 0
        !          1575: .][ 3 article-in-book
        !          1576: .ds [F 14
        !          1577: .]-
        !          1578: .ds [A Luis Damas
        !          1579: .ds [T Type Assignment in Programming Languages
        !          1580: .ds [R PhD Thesis
        !          1581: .ds [I Department of Computer Science, University of Edinburgh
        !          1582: .ds [D 1985
        !          1583: .nr [T 0
        !          1584: .nr [A 0
        !          1585: .nr [O 0
        !          1586: .][ 4 tech-report
        !          1587: .ds [F 15
        !          1588: .]-
        !          1589: .ds [A Robert Harper
        !          1590: .as [A ", Robin Milner
        !          1591: .as [A ", and Mads Tofte
        !          1592: .ds [T A type discipline for program modules
        !          1593: .ds [U harper
        !          1594: .ds [I Univ. of Edinburgh
        !          1595: .ds [R ECS-LFCS-87-28
        !          1596: .ds [D 1987
        !          1597: .nr [T 0
        !          1598: .nr [A 0
        !          1599: .nr [O 0
        !          1600: .][ 4 tech-report
        !          1601: .ds [F 16
        !          1602: .]-
        !          1603: .ds [A Marianne Baudinet
        !          1604: .as [A " and David MacQueen
        !          1605: .ds [T Tree Pattern Matching for ML
        !          1606: .ds [D 1986
        !          1607: .nr [T 0
        !          1608: .nr [A 0
        !          1609: .nr [O 0
        !          1610: .][ 0 other
        !          1611: .ds [F 17
        !          1612: .]-
        !          1613: .ds [K steele78
        !          1614: .ds [A Guy L. Steele
        !          1615: .ds [T Rabbit: a compiler for Scheme
        !          1616: .ds [U steele
        !          1617: .ds [I MIT
        !          1618: .ds [R AI-TR-474
        !          1619: .ds [D 1978
        !          1620: .nr [T 0
        !          1621: .nr [A 0
        !          1622: .nr [O 0
        !          1623: .][ 4 tech-report
        !          1624: .ds [F 18
        !          1625: .]-
        !          1626: .ds [A D. Kranz
        !          1627: .as [A ", R. Kelsey
        !          1628: .as [A ", J. Rees
        !          1629: .as [A ", P. Hudak
        !          1630: .as [A ", J. Philbin
        !          1631: .as [A ", and N. Adams
        !          1632: .ds [T ORBIT: An optimizing compiler for Scheme
        !          1633: .ds [P 219-233
        !          1634: .nr [P 1
        !          1635: .ds [D July 1986
        !          1636: .ds [J Proc. Sigplan '86 Symp. on Compiler Construction
        !          1637: .ds [V 21 (Sigplan Notices)
        !          1638: .ds [N 7
        !          1639: .nr [T 0
        !          1640: .nr [A 0
        !          1641: .nr [O 0
        !          1642: .][ 1 journal-article
        !          1643: .ds [F 19
        !          1644: .]-
        !          1645: .ds [A W. Wulf
        !          1646: .as [A ", R. K. Johnsson
        !          1647: .as [A ", C. B. Weinstock
        !          1648: .as [A ", C. B. Hobbs
        !          1649: .as [A ", and C. M. Geschke
        !          1650: .ds [T Design of an Optimizing Compiler
        !          1651: .ds [I Elsevier North-Holland
        !          1652: .ds [C New York
        !          1653: .ds [D 1975
        !          1654: .ds [k wulf75
        !          1655: .nr [T 0
        !          1656: .nr [A 0
        !          1657: .nr [O 0
        !          1658: .][ 2 book
        !          1659: .ds [F 20
        !          1660: .]-
        !          1661: .ds [A Luca Cardelli
        !          1662: .ds [U cardelli
        !          1663: .ds [T The functional abstract machine
        !          1664: .ds [J Polymorphism
        !          1665: .ds [V 1
        !          1666: .ds [N 1
        !          1667: .ds [D January 1983
        !          1668: .nr [T 0
        !          1669: .nr [A 0
        !          1670: .nr [O 0
        !          1671: .][ 1 journal-article
        !          1672: .ds [F 21
        !          1673: .]-
        !          1674: .ds [A R. G. G. Cattell
        !          1675: .ds [T Formalization and automatic derivation of code generators
        !          1676: .ds [I Carnegie-Mellon University
        !          1677: .ds [C Pittsburgh, PA
        !          1678: .ds [D April 1978
        !          1679: .ds [K cattell78
        !          1680: .ds [R Ph.D. Thesis
        !          1681: .nr [T 0
        !          1682: .nr [A 0
        !          1683: .nr [O 0
        !          1684: .][ 4 tech-report
        !          1685: .ds [F 22
        !          1686: .]-
        !          1687: .ds [K agt86
        !          1688: .ds [K twig
        !          1689: .ds [A A. V. Aho
        !          1690: .as [A ", M. Ganapathi
        !          1691: .as [A ", and S. W. K. Tjiang
        !          1692: .ds [U aho
        !          1693: .ds [T Code generation using tree matching and dynamic programming
        !          1694: .ds [D 1986
        !          1695: .nr [T 0
        !          1696: .nr [A 0
        !          1697: .nr [O 0
        !          1698: .][ 0 other
        !          1699: .ds [F 23
        !          1700: .]-
        !          1701: .ds [A D. A. Turner
        !          1702: .ds [T Miranda: a non-strict functional language with polymorphic types
        !          1703: .ds [I Springer-Verlag
        !          1704: .ds [P 1-16
        !          1705: .nr [P 1
        !          1706: .ds [B Functional Programming Languages and Computer Architecture, LNCS Vol 201
        !          1707: .ds [E J. P. Jouannaud
        !          1708: .ds [D 1985
        !          1709: .nr [T 0
        !          1710: .nr [A 0
        !          1711: .nr [O 0
        !          1712: .][ 3 article-in-book
        !          1713: .ds [F 24
        !          1714: .]-
        !          1715: .ds [K lieberman83
        !          1716: .ds [U lieberman
        !          1717: .ds [A Henry Lieberman
        !          1718: .as [A " and Carl Hewitt
        !          1719: .ds [T A real-time garbage collector based on the lifetimes of objects
        !          1720: .ds [J Communications of the ACM
        !          1721: .ds [I ACM
        !          1722: .ds [V 23
        !          1723: .ds [N 6
        !          1724: .ds [P 419-429
        !          1725: .nr [P 1
        !          1726: .ds [D 1983
        !          1727: .nr [T 0
        !          1728: .nr [A 0
        !          1729: .nr [O 0
        !          1730: .][ 1 journal-article
        !          1731: .ds [F 25
        !          1732: .]-
        !          1733: .ds [U appel
        !          1734: .ds [A A. W. Appel
        !          1735: .ds [T Garbage collection can be faster than stack allocation
        !          1736: .ds [J Information Processing Letters
        !          1737: .ds [V (to appear)
        !          1738: .ds [D 1987
        !          1739: .nr [T 0
        !          1740: .nr [A 0
        !          1741: .nr [O 0
        !          1742: .][ 1 journal-article
        !          1743: .ds [F 26
        !          1744: .]-
        !          1745: .ds [U moon
        !          1746: .ds [K moon84
        !          1747: .ds [A David A. Moon
        !          1748: .ds [T Garbage collection in a large LISP system
        !          1749: .ds [D 1984
        !          1750: .ds [P 235-246
        !          1751: .nr [P 1
        !          1752: .ds [I ACM
        !          1753: .ds [J ACM Symposium on LISP and Functional Programming
        !          1754: .nr [T 0
        !          1755: .nr [A 0
        !          1756: .nr [O 0
        !          1757: .][ 1 journal-article
        !          1758: .ds [F 27
        !          1759: .]-
        !          1760: .ds [A David Ungar
        !          1761: .ds [T Generation scavenging: a non-disruptive high performance storage
        !          1762: .as [T " reclamation algorithm
        !          1763: .ds [I ACM
        !          1764: .ds [J SIGPLAN Notices (Proc. ACM SIGSOFT/SIGPLAN Software Eng. Symp. on Practical Software
        !          1765: .as [J " Development Environments)
        !          1766: .ds [V 19
        !          1767: .ds [N 5
        !          1768: .ds [D 1984
        !          1769: .ds [P 157-167
        !          1770: .nr [P 1
        !          1771: .nr [T 0
        !          1772: .nr [A 0
        !          1773: .nr [O 0
        !          1774: .][ 1 journal-article
        !          1775: .]>

unix.superglobalmegacorp.com

This archive runs on limited infrastructure. Preserving old code on modern bandwidth. Automated agents are requested to crawl responsibly.