Annotation of researchv10no/cmd/sml/doc/papers/runtime/run.tex, revision 1.1

1.1     ! root        1: \documentstyle[11pt]{article}
        !             2: \bibliographystyle{plain}  %changed from alpha
        !             3: \documentstyle{openbib}
        !             4: \makeindex
        !             5:     \oddsidemargin   .5in     %   Left margin on odd-numbered pages.
        !             6:     \evensidemargin  .5in     %   Left margin on even-numbered pages.
        !             7:    \topmargin       .5in    %    Nominal distance from top of page to top of
        !             8:    \headheight        0pt    %    Height of box containing running head.
        !             9:    \headsep           0pt    %    Space between running head and text.
        !            10: % Bottom of page:
        !            11:    \footheight       12pt    %    Height of box containing running foot.
        !            12:    \footskip         30pt    %    Distance from baseline of box containing 
        !            13:                              %    foot to baseline of last line of text.
        !            14:    \textwidth         5.5in  %    width of text on page
        !            15: 
        !            16: %\newcommand{\mysection}[1]{\section{\protect\large\bf #1}}
        !            17: %\setcounter{secnumdepth}{0}
        !            18: 
        !            19: \sloppy
        !            20: 
        !            21: \begin{document}
        !            22: \title{A Runtime System \\ (DRAFT)}
        !            23: \author{Andrew W. Appel\thanks{
        !            24: Supported in part by NSF Grants
        !            25: DCR-8603453 and CCR-880-6121}
        !            26:         \\ Princeton University}
        !            27: \date{February 1989}
        !            28: \maketitle
        !            29: \begin{abstract} 
        !            30: The runtime data structures of the Standard ML of New Jersey compiler
        !            31: are simple yet general.  As a result, code generators are easy to implement,
        !            32: programs execute quickly, garbage collectors are easy to implement and
        !            33: work efficiently, and a variety of runtime facilities can be provided with
        !            34: ease.
        !            35: \end{abstract}
        !            36: 
        !            37: \section{Introduction}
        !            38: \label{introduction}
        !            39: 
        !            40: Some languages, like Lisp, Smalltalk, ML, Prolog, etc. rely heavily on a
        !            41: {\em runtime system} to provide essential services.  The most important
        !            42: such service is the management of dynamically-allocated storage
        !            43: (e.g. {\bf garbage collection}), but the runtime system may also provide
        !            44: facilities like
        !            45: 
        !            46: \begin{itemize}
        !            47: \item {\bf Stream input/output:} on operating systems (like Unix) that do not
        !            48: have a buffered input/output facility, the process must provide its own;
        !            49: this might be handled in the runtime system.
        !            50: \item {\bf Structured input/output:} the ability to automatically
        !            51: write a large linked data structure to a file, and read it back in
        !            52: with all links adjusted, is a great convenience that can be implemented
        !            53: efficiently in the runtime system.
        !            54: \item {\bf Process suspension:} a snapshot of an
        !            55: executing process may be ``preserved'' in a file, so that
        !            56: the execution of that file causes a new process to start exactly
        !            57: where the saved one left off.
        !            58: \item {\bf Operating system calls:} operating system services needed by
        !            59: a program may be conveniently packaged by the runtime system.
        !            60: \item {\bf Handling of interrupts and asynchronous events:} if the programming
        !            61: language has a mechanism to handle asynchronous events, it relies on the
        !            62: runtime system for its implementation.
        !            63: \item {\bf Handling of arithmetic exceptions:} the programming language's
        !            64: exception-handling mechanism must be implemented in cooperation with
        !            65: the runtime system.
        !            66: \item {\bf Assembly-language implementation of language primitives:}
        !            67: it may be inconvenient for the compiler to generate code for some features
        !            68: of the programming language; these functions can be implemented as calls
        !            69: to runtime system routines.
        !            70: \item {\bf Foreign-language procedure calls:} calls to subroutines written
        !            71: in other languages may be mediated by the runtime system.
        !            72: \item {\bf Fun with continuations:} Languages with features like {\em call with
        !            73: current continuation}, which allows the explicit manipulation of threads
        !            74: of control, require runtime-system cooperation.
        !            75: \item {\bf Execution profiling:} Automatic measurement of the time spent in
        !            76: different parts of the user program can be accomplished with the help of
        !            77: the runtime system.
        !            78: \item {\bf Debugging:} starting, stopping, and displaying the execution state
        !            79: of user programs can be accomplished only by low-level routines.
        !            80: \end{itemize}
        !            81: 
        !            82: Since this list is rather long, and several of these features may interact,
        !            83: it is evident that runtime systems can become nasty and complicated.
        !            84: The proliferation of data types may make the implementation of 
        !            85: the garbage collector (and other programs that must traverse the data)
        !            86: inefficient.
        !            87: 
        !            88: We particularly wanted a runtime data layout that provided fast
        !            89: allocation of records and fast garbage collection, since ML makes very
        !            90: heavy use of dynamically-allocated storage.  To this end, we eliminated
        !            91: the runtime stack and use a very simple data format.
        !            92: The only runtime data types are {\em integers}, {\em pointers},
        !            93: {\em records}, and {\em strings}.
        !            94: Then we found that this very simple structure allowed the easy implementation
        !            95: of many of the services described above.
        !            96: 
        !            97: Our runtime system has relatively few ML-specific
        !            98: features; it could be used equally well for other languages.  This paper
        !            99: describes the design and implementation of the SML-NJ runtime system.
        !           100: It should be read in conjunction with an earlier paper on the
        !           101: SML-NJ generational garbage collector\cite{appel89:sggc}.
        !           102: 
        !           103: \begin{quotation}{
        !           104: \small Text in smaller type is of interest only to those who might
        !           105: have to read, modify, or maintain the source code to the runtime system.
        !           106: }\end{quotation}
        !           107: 
        !           108: \section{Standard ML of New Jersey}
        !           109: \label{standard}
        !           110: The ML language originated as part of the Edinburgh LCF proof system
        !           111: \cite{gordon78}, and was soon implemented as a stand-alone compiler
        !           112: \cite{cardelli83:ml}\cite{cardelli84}.  The language was
        !           113: ``standardized'' \cite{milner84}\cite{macqueen84}
        !           114: \cite{milner85}\cite{harper88}, and
        !           115: Standard ML has been implemented at Edinburgh, Cambridge, and 
        !           116: New Jersey\cite{appel87:sml}; the New Jersey implementation is
        !           117: a joint effort between researchers at AT\&T Bell Laboratories and
        !           118: Princeton University.
        !           119: 
        !           120: Though ML was first implemented as the meta-language of a theorem-proving
        !           121: system, Standard ML is a general-purpose 
        !           122: programming language with several advantages over more conventional
        !           123: languages.  Its important characteristics are:
        !           124: 
        !           125: \begin{itemize}
        !           126: \item {\bf Automatic garbage collection:} this is a great convenience in
        !           127: writing correct and readable programs.
        !           128: \item {\bf Static, polymorphic types:}  like Pascal, types are checked
        !           129: at compile-time and not at runtime; but like Lisp, there is great flexibility
        !           130: and re-usability of code.
        !           131: \item {\bf Safety:} there are no runtime insecurities (i.e. ``core is never
        !           132: dumped''); this is unlike the C language, where unsafe pointers run
        !           133: rampant, and like Lisp (except when Lisp programmers turn off the runtime
        !           134: type checking for efficiency).  In ML, there is no run-time type checking,
        !           135: but safety is guarateed by compile-time type checking.
        !           136: \item {\bf Higher-order functions:} like Scheme (and lambda-calculus); this
        !           137: can lead to a desirable conciseness of expression.
        !           138: \item {\bf Typechecked Modules:} like Ada and Modula.
        !           139: \item {\bf Exception handling:} a dynamically-scoped exception
        !           140: mechanism allows both hardware- and software-generated exceptions
        !           141: to be caught and handled by user programs.
        !           142: \end{itemize}
        !           143: 
        !           144: One important feature of a systems programming language
        !           145: is not prescribed by the list above, and in fact cannot be specified
        !           146: by a language definition: an efficient and robust implementation.
        !           147: Standard ML of New Jersey\cite{appel87:sml} is meant to be a complete,
        !           148: efficient, robust, and cleanly written compiler for the language.
        !           149: It has an optimizing code generator based on continuation-passing style
        !           150: \cite{appel89:cps}.
        !           151: 
        !           152: A knowledge of ML is not necessary to understand its runtime system.
        !           153: 
        !           154: \section{Tagging schemes}
        !           155: \label{tagging}
        !           156: 
        !           157: Almost all the pieces of the runtime system must deal with the data structures
        !           158: of the executing program.  Therefore, it is helpful to keep the format of
        !           159: this data as simple and straightforward as possible.
        !           160: 
        !           161: ML, like Lisp, allows polymorphic functions: a function that reverses a list
        !           162: of objects (for example) need not know the type of the objects.  In order that
        !           163: the same piece of executable code can operate on objects of arbitrary type,
        !           164: it is necessary that every object be represented in the same amount of space.
        !           165: As in Lisp, this is achieved by making everything be the size of a pointer;
        !           166: if an object's natural representation is larger (as for a record of $n$
        !           167: objects), then it is represented by a pointer to storage on the heap.
        !           168: 
        !           169: The garbage collector must be able to determine the size and layout of each
        !           170: object it traverses.  This can be accomplished in several ways:
        !           171: \begin{itemize}
        !           172: \item {\bf  By encoding a type-tag\index{tag}
        !           173: inside each pointer;} we chose not to do this
        !           174: because it reduces the number of bits left for actual addressing.
        !           175: In these days of 32-bit pointers and 100-megabyte memories, it is easy
        !           176: to imagine the need for every bit of addressability we can muster.
        !           177: \item {\bf By reserving different areas of memory for objects of different
        !           178: types.}
        !           179: BIBOP\index{BIBOP} (Big Bag Of Pages) schemes require that each page hold objects of a single
        !           180: type, and a global table maps pages to types.  This is relatively efficient,
        !           181: and doesn't reduce the number of addressable words.  But it complicates the
        !           182: process of allocating and copying objects, since several free regions
        !           183: are simultaneously required.  
        !           184: \item {\bf Statically-typed languages don't require any runtime-tagging} at all;
        !           185: instead, the compiler can tell the garbage collectector about the type
        !           186: system of the program.  This has worked well in Pascal\cite{britton75}.
        !           187: Even though it is also theoretically possible in ML, in practice the
        !           188: polymorphic type system introduces overhead and complexity that make
        !           189: this method unattractive\cite{appel89:tag}.
        !           190: \item {\bf Each record can have a descriptor at the beginning} that explains
        !           191: which fields are integers and which are pointers.  This method works
        !           192: well in non-polymorphic languages like Modula-2+ and Mesa, where
        !           193: descriptors can be computed at compile-time and just inserted
        !           194: in records as they are created.  However,
        !           195: in polymorphic languages like Lisp and ML
        !           196: this descriptor would have to be laboriously
        !           197: constructed each time a record is created, introducing unacceptable overhead.
        !           198: \item {\bf Each record can have a descriptor\index{descriptor} at the
        !           199: beginning} that tells
        !           200: the length of the record, and {\bf each field can have a tag bit} that tells
        !           201: whether it is a pointer or a non-pointer.
        !           202: This is what we have done.
        !           203: \end{itemize}
        !           204: 
        !           205: To have a tag word on each record, and a tag bit on each
        !           206: field, is not clever at all; but we are not always
        !           207: striving for cleverness, we 
        !           208: want simplicity and efficiency.  If all our records were {\em cons} cells,
        !           209: then one-third of memory would be devoted to tag words; but memories
        !           210: are large and cheap.  And in Standard ML, records are of many different
        !           211: sizes, and two-word records are not particularly predominant.
        !           212: 
        !           213: By having only one free region, we are
        !           214: able to do fast allocation\index{allocation}
        !           215: and also manage the total amount of virtual
        !           216: memory used\cite{appel89:sggc}; this would be much more difficult
        !           217: with a BIBOP scheme.
        !           218: 
        !           219: \section{Data formats in our runtime system}
        !           220: \label{data}
        !           221: Standard ML of New Jersey has
        !           222: records, strings, procedures (machine code),
        !           223: closures, constructors, arrays, byte-arrays, floating-point
        !           224: numbers, references (modifiable one-word records),
        !           225: modules, and integers.  This large set of language primitives and
        !           226: user-defined datatypes are represented
        !           227: by just two runtime data formats: one for objects that contain pointers
        !           228: (records, closures, constructors, arrays, references, modules), and
        !           229: the other for objects containing no pointers (strings, procedures,
        !           230: byte-arrays, floating point).  The garbage collector (and other
        !           231: parts of the runtime system) need to understand only these two formats,
        !           232: not the many kinds of ML objects.
        !           233: 
        !           234: A {\em field\index{field}} is either a 
        !           235: {\em pointer\index{pointer}} or an {\em integer\index{integer}}.  Pointers
        !           236: have a low-order bit of 0; integers have a low-order bit of 1.
        !           237: It is necessary to distinguish pointers from non-pointers in order that
        !           238: the garbage collector\index{garbage collector}
        !           239: will know what structures to traverse.
        !           240: On a byte-addressable machine, all pointers to aligned (4-byte) words
        !           241: are multiples of 4 anyway, so pointers have a low-order bit of 0 in
        !           242: their natural representation.  The high-order 31 bits of an integer
        !           243: field can represent an integer in the range $[-2^{30}, +2^{30}-1]$.
        !           244: 
        !           245: Some Lisp implementations use a similar representation except that
        !           246: pointers have a low-order 1 and integers have a low-order 0. This makes
        !           247: arithmetic on tagged integers somewhat easier, and takes advantage
        !           248: of the fact that pointers are usually used with a displacement addressing mode;
        !           249: the displacement can be adjusted by 1 with no penalty in efficiency.
        !           250: This is probably a better arrangement, but 
        !           251: either version of this scheme will work; and a simple analysis\cite{appel89:tag}
        !           252: shows that the efficiency trade-offs are negligible, and that 
        !           253: doing arithmetic around the low-order integer tag is not very costly.
        !           254: 
        !           255: An {\em object on the heap} may be either a {\em record\index{record}}
        !           256: (containing fields,
        !           257: i.e. pointers and tagged integers) or
        !           258: a {\em string\index{string}}
        !           259: (containing bytes of an arbitrary bit-pattern, but no pointers).
        !           260: 
        !           261: A {\em record} is a sequence of $n>0$ fields numbered $0,1,...,n-1$.
        !           262: Each record has a {\em descriptor} at position $-1$.  The low-order bit
        !           263: of a descriptor\index{descriptor} is 1
        !           264: (making it look like an integer), the next three
        !           265: bits identify the object as a record, and the high-order 28 bits give the
        !           266: number of fields.  Thus, each record is limited to one gigabyte, which
        !           267: should not be a significant limitation (seriously, though, it is important
        !           268: to avoid miserly restrictions on the sizes of objects).
        !           269: 
        !           270: In the implementation of closures, it is useful to be able to point
        !           271: at the interior of records.  However, it is always necessary for the
        !           272: garbage collector, given a pointer, to find the descriptor of an object.
        !           273: If the
        !           274: fields $0$ through $k-1$ of a record are all pointers, then a pointer
        !           275: can point at the $k^{\rm th}$ field of the record; since the
        !           276: descriptor is unboxed (has a low-order bit of 1) and the fields $0$ through
        !           277: $k-1$ will all be boxed, the garbage collector can easily search backward
        !           278: for the descriptor of the record.  So, we allow pointers to the interior
        !           279: of a record, providing all the previous fields are boxed; this is sufficiently
        !           280: flexible for our needs in implementing closures, as will be described.
        !           281: Otherwise,
        !           282: no pointer may address the interior of a record.
        !           283: 
        !           284: A {\em string\index{string}}
        !           285: contains $n>0$ bytes (numbered from $0$ to $n-1$),
        !           286: padded with trailing zero bytes
        !           287: to a multiple of four.  
        !           288: Immediately preceding the $0^{\rm th}$ byte is a one-word
        !           289: descriptor whose low-order bit is 1, whose next three bits describe
        !           290: the type of string (and distinguish strings from records),
        !           291: and whose upper 28 bits give the length in bytes.
        !           292: 
        !           293: Strings are used for a variety of purposes;
        !           294: they can hold printable characters, real numbers, machine code,
        !           295: or any other kind of data that doesn't contain pointers.  In fact,
        !           296: a string object can hold several embedded string objects.
        !           297: A pointer may point at the $0^{\rm th}$ byte of a string, and may point to 
        !           298: an embedded string at the
        !           299: $k^{\rm th}$ byte provided that $k>0$ is a multiple of 4 and that the bytes
        !           300: $k-4, k-3, k-2, k-1$ contain a {\em back-pointer\index{back-pointer}}
        !           301: descriptor that
        !           302: contains the number $k$ (the offset to the beginning of the record).
        !           303: This allows the beginning of a string object
        !           304: to be found quickly, given a pointer into it.
        !           305: Note that a back-pointer couldn't be distinguished from data, except
        !           306: for the fact that it is found just before the pointed-to byte of the
        !           307: string.
        !           308: 
        !           309: The three tag\index{tag} bits of a descriptor\index{descriptor}
        !           310: can denote any of these kinds of records,
        !           311: strings, etc:
        !           312: 
        !           313: \begin{itemize}
        !           314: \item[0] record\index{record}
        !           315: \item[1] forwarding-pointer\index{forwarding pointer}
        !           316: (for the garbage collector\index{garbage collector})
        !           317: \item[2] back-pointer (that precedes an referenceable location in a string)
        !           318: \item[3] embedded string\index{embedded string} length (described below)
        !           319: \item[4] array\index{array} (just like a record).
        !           320: \item[5] byte-array\index{byte-array}
        !           321: (just like a string).
        !           322: \item[6] this tag value is unused.
        !           323: \item[7] string\index{string}
        !           324: \end{itemize}
        !           325: 
        !           326: Because of ML's static type system, it is not necessary to put type information
        !           327: in the descriptors of objects.  Therefore, tags are necessary only
        !           328: for the garbage collector's benefit, since it must distinguish objects of
        !           329: different formats.
        !           330: 
        !           331: The only exception to this rule is that mutable objects (arrays, byte-arrays)
        !           332: must be distinguishable from immutable objects (records, strings)
        !           333: in order to implement the polymorphic
        !           334: equality\index{equality}
        !           335: feature.  For all purposes of the runtime system, arrays
        !           336: are identical to records, and byte-arrays are identical to strings.
        !           337: 
        !           338: A dynamically-typed language like Lisp would require more than
        !           339: three bits to distinguish types of objects; this would not pose a 
        !           340: serious problem.  It would still be possible in Lisp to have just
        !           341: two formats (pointer-containing and pointer-free), as in our runtime system.
        !           342: 
        !           343: \section{Forwarding pointers}
        !           344: \label{forwarding}
        !           345: \index{forwarding}
        !           346: The Standard ML of New Jersey runtime system has a generational garbage
        !           347: collector\index{garbage collector}
        !           348: that takes advantage of object lifetime and referencing
        !           349: patterns\cite{appel89:sggc}.
        !           350: But at the heart of any generational garbage collector is a simple
        !           351: copying garbage collector as originally described by Cheney\cite{cheney70}.
        !           352: Objects are copied from {\em fromspace} to {\em tospace} in a breadth-first
        !           353: order, with the {\em tospace} itself serving as the queue for the breadth-first
        !           354: traversal.  The {\em fromspace} versions of objects are overwritten
        !           355: with {\em forwarding pointers}, so that when other references to them are
        !           356: found, it is easy to find the {\em tospace} copies of them.
        !           357: 
        !           358: The fundamental operation in Cheney's algorithm is to {\em forward} a pointer.
        !           359: This means taking a pointer into {\em fromspace} and making it point to {\em tospace}.
        !           360: If the {\em fromspace} object it points to has already been copied, then its
        !           361: forwarding pointer is taken as the new value; otherwise, the object must
        !           362: be copied to {\em tospace} and a forwarding pointer installed.
        !           363: 
        !           364: Forwarding is relatively easy using our runtime data format.  We have a special
        !           365: kind of descriptor {\bf forwarding-pointer}, that indicates that a {\em fromspace}
        !           366: object has already been copied.  If an object has this descriptor, then
        !           367: the first word (after the descriptor) is to be interpreted as the address
        !           368: of the copy.  The only complications in forwarding are that pointers
        !           369: may point into the middle of records and strings.
        !           370: 
        !           371: Our runtime system (and garbage collector) is implemented in C.
        !           372: In figures~\ref{forward}, \ref{trappc}, \ref{gc} we show the entire
        !           373: core of our copying garbage collector.  We could use pseudo-code
        !           374: and describe the algorithm abstractly, but we want to emphasize that
        !           375: our simple runtime data format does permit an efficient and easy-to-implement
        !           376: garbage collector.
        !           377: 
        !           378: We pretend that all ML values are integers, and make liberal use of
        !           379: casts to maintain this pretense.  Our \verb"forward" function is
        !           380: shown in figure~\ref{forward}; it
        !           381: takes an ML value (by reference) and modifies it (if a pointer into
        !           382: {\em fromspace}) to point into {\em tospace}.
        !           383: \begin{figure}[htbp]
        !           384: \label{forward}
        !           385: \begin{verbatim}
        !           386:  1    forward(int *refloc)
        !           387:  2    {register int *m = *((int**)(refloc)), len;
        !           388:  3     if( (m&1)==0 && (m >= (int*)lowest && m < (int*)highest))
        !           389:  4     { m--;  /* make m point at the descriptor */
        !           390:  5       while(1)
        !           391:  6         {len = (*m)>>4;
        !           392:  7          switch(m&15)
        !           393:  8           {case tag_backptr:
        !           394:  9                    m -= len;
        !           395: 10                    continue;
        !           396: 11            case tag_string:
        !           397: 12            case tag_bytearray:
        !           398: 13                len = (len+3)/4;
        !           399: 
        !           400: 19                /* fall through */
        !           401: 20            case tag_record:
        !           402: 21            case tag_array:
        !           403: 22                {int **i=(int**)m, **j=to_ptr;
        !           404: 23                     while (j+len >= to_lim)
        !           405: 24                       do to_lim=gmore();
        !           406: 25                     while (len-- >= 0)          
        !           407: 26                       do {*j++ = *i++;}
        !           408: 27                     ((int**)m)[1]= 1+(int*)to_ptr;
        !           409: 28                     to_ptr = j;
        !           410: 29                }
        !           411: 30                (*m) = tag_forwarded;
        !           412: 31                /* fall through */
        !           413: 32            case tag_forwarded:
        !           414: 33                *(int*)(refloc) += ((int*)m)[1] - ((int)(m+1));
        !           415: 34                return;
        !           416: 35            case tag_embedded:
        !           417: 36            default:
        !           418: 37                    m--; continue;
        !           419: 38   }}   }  }
        !           420: \end{verbatim}
        !           421: \caption{The {\tt forward} function}
        !           422: \end{figure}
        !           423: Line 2 establishes $m$ as a copy of the original value to be forwarded.
        !           424: Line 3 considers only the case that $m$ is a pointer, and points into
        !           425: {\em fromspace}.  (Any pointers not into {\em fromspace} are treated as constants.)
        !           426: Line 4 adjusts m to point at the descriptor of the {\em fromspace} object.
        !           427: 
        !           428: Lines 5--38 loop until the beginning of the object is found.
        !           429: If [line 8] $m$ points to a back-pointer\index{back-pointer},
        !           430: the appropriate offset is subtracted
        !           431: and we start again; similarly, if [lines 35,36] $m$ points to a pointer
        !           432: (which can happen if we
        !           433: point into the middle of a closure\index{closure} record) or an 
        !           434: embedded\index{embedded string} descriptor,
        !           435: then $m$ is decremented and we try again.
        !           436: 
        !           437: If $m$ is a string\index{string length} or byte array [line 13],
        !           438: then we adjust its $len$ to count in words rather
        !           439: than bytes (i.e. we divide by 4, rounding up).
        !           440: (Lines 14--18 are explained below.)
        !           441: 
        !           442: Now at line 22 we have either a string or record in {\em fromspace}
        !           443: that needs to be copied.  We verify that there's enough space remaining
        !           444: in {\em tospace}; if not, we call the function \verb"gmore" that will either
        !           445: \index{gmore}
        !           446: get more space or die trying.  Then we copy all the words of the
        !           447: object.  Finally [lines 27 and 30] we install a forwarding pointer
        !           448: into the {\em fromspace} object and mark its descriptor as forwarded.
        !           449: 
        !           450: For an already forwarded object (line 33), whether it was forwarded
        !           451: in a previous call or just now, adjust the reference \verb|*refloc|
        !           452: to point at the new object.  Since the reference might have pointed
        !           453: at the middle of the old object, we must take care to make it point
        !           454: to the corresponding location in the new object; this is accomplished
        !           455: by the computation:
        !           456: \[
        !           457: {\rm old\;pointer} + {\rm beginning\;of\;new\;object} - 
        !           458: {\rm beginning\;of\;old\;object.}
        !           459: \]
        !           460: 
        !           461: Then the \verb"forward" function returns.
        !           462: 
        !           463: Almost without exception, all pointers to objects point to places where
        !           464: an immediately preceding descriptor\index{descriptor}
        !           465: explains the format of the object.
        !           466: The exception is an artifact of the 
        !           467: very fast allocation mechanism that can create and initialize
        !           468: a $k$-word object in $k+2$ instructions\cite{appel89:sggc}.
        !           469: This mechanism relies on a page fault\index{fault}
        !           470: trap to invoke the garbage collector
        !           471: when the free space is exhausted; when this trap occurs, the program
        !           472: \index{pc}
        !           473: counter will point into the middle of a string object that might
        !           474: be moved by the garbage collector.  This is the only reference that
        !           475: points to a place in a string that lacks a backpointer\index{back-pointer}.
        !           476: It can be handled
        !           477: relatively simply: as each string object is moved, we check to see
        !           478: if the saved program counter points into the middle of it; if so,
        !           479: we adjust the saved program counter.  Since the number of string objects
        !           480: is typically less than one percent of the number of record objects,
        !           481: very costly overall.  The test occurs at line 14, as shown in
        !           482: figure~\ref{trappc}.
        !           483: \index{trap_pc}
        !           484: \begin{figure}[htbp]
        !           485: \label{trappc}
        !           486: \begin{verbatim}
        !           487: 
        !           488: 14                if (!trap_pc_done 
        !           489: 15                     && m < trap_pc && m+len >= trap_pc)
        !           490: 16                    {trap_pc_done=1;
        !           491: 17                     trap_pc += to_ptr - (int**)m;
        !           492: 18                    }
        !           493: \end{verbatim}
        !           494: \caption{Testing the (saved) program counter}
        !           495: \end{figure}
        !           496: 
        !           497: \section{Garbage collection}
        !           498: \label{garbage}
        !           499: 
        !           500: Once the \verb"forward"ing procedure is written, a simple copying garbage
        !           501: collector is trivial (figure~\ref{gc}).
        !           502: \index{garbage collector}
        !           503: \begin{figure}[htbp]
        !           504: \label{gc}
        !           505: \begin{verbatim}
        !           506:  1    gc(int *from_low, int *from_high,
        !           507:  2       int *to_low, int *to_high,
        !           508:  3       int **roots,
        !           509:  4       int *trap_pcx, int *(*get_more)()
        !           510:  5      )
        !           511:  6    { gmore=get_more; trap_pc = *trap_pcx; to_ptr = to_low;
        !           512:  7      trap_pc_done = !(trap_pc>=(int*)from_low 
        !           513:  8                      && trap_pc<(int*)from_high);
        !           514:  9      lowest=from_low; highest=from_high;
        !           515: 
        !           516: 10      while (*roots) forward(*roots++);
        !           517: 
        !           518: 11      { int *x = to_low;
        !           519: 12      while (x<to_ptr)
        !           520: 13      { int *p = x+1, descr = (*x), len=descr>>4;
        !           521: 14        if (string_or_bytearray(descr)) 
        !           522: 15              x += (len+3)/4 + 1;
        !           523: 16        else {x += len+1;
        !           524: 17              do {forward(p++);} while (p<x);
        !           525: 18      }      }
        !           526: 
        !           527: 19      *trap_pcx = trap_pc;
        !           528: 20    }
        !           529: \end{verbatim}
        !           530: \caption{The {\tt gc} function}
        !           531: \end{figure}
        !           532: The function \verb|gc| is parametrized by the lower and upper bounds
        !           533: of {\em fromspace} and the lower and upper bounds of {\em tospace}.  The \verb"roots"
        !           534: parameter is a vector of pointers to the roots of accessible objects;
        !           535: in the Standard ML system this is little more than the addresses of
        !           536: saved machine registers.  The last two parameters are the address
        !           537: of the saved program counter (necessary as explained in 
        !           538: section~\ref{forwarding})
        !           539: and the address of a function which can be called to expand the {\em tospace}
        !           540: if necessary.
        !           541: 
        !           542: The first step [lines 6--9] is to put various quantities into global
        !           543: variables to make them accessible to the \verb"forward" procedure,
        !           544: since C does not have nested procedures.
        !           545: 
        !           546: The next step is to forward all the root variables (line 10).
        !           547: 
        !           548: Finally [lines 11-18],
        !           549: we forward each word of each record in {\em tospace}.  Since the
        !           550: \verb|forward| procedure may increment the \verb"to_ptr" variable
        !           551: that denotes the end of the filled portion of {\em tospace}, we have to
        !           552: keep iterating until \verb|x| catches up with it.  In effect,
        !           553: the {\em tospace} between \verb|x| and \verb|to_ptr| is the queue of the
        !           554: breadth-first search.  The queue must eventually become empty,
        !           555: since there is a finite amount of accessible data to be copied.
        !           556: In this phase, integers (and strings) are just skipped, since
        !           557: they contain no pointers that need forwarding.
        !           558: 
        !           559: This garbage collector is relatively simple, and therefore it's not difficult
        !           560: to make it fast.  Even the fanciest of generational or concurrent collectors
        !           561: relies on a program like this as its inner loop; the very simple layout
        !           562: of data in the Standard ML runtime system makes efficiency easy.
        !           563: 
        !           564: \section{Fast allocation}
        !           565: \label{fast}
        !           566: 
        !           567: Copying garbage collection, because it takes time proportional only to the
        !           568: live data and not to the amount of garbage, can be arbitrarily 
        !           569: efficient\cite{appel87:gc}.  That is, there is no lower bound on the
        !           570: amortized cost of garbage collection for each cell allocated.
        !           571: Standard ML's generational copying garbage collector\cite{appel89:sggc}
        !           572: expends on the order of one (amortized)
        !           573: machine instruction for every cell allocated; the precise amortized
        !           574: cost is proportional to the ratio of live data to heap size.
        !           575: 
        !           576: Since garbage collection is so fast, it makes sense to make
        !           577: allocation\index{allocation}
        !           578: fast too.  Standard ML of New Jersey allocates a new record every
        !           579: 40 to 80 machine instructions, so we want a very low overhead on the
        !           580: creation of an object.
        !           581: 
        !           582: Since copying garbage collectors compact the live objects into consecutive
        !           583: memory cells, the free region is all contiguous.  This means that to create
        !           584: an $n$-word object, we can just grab the next $n$ words of the free space.
        !           585: Since objects are created so often, it makes sense to make allocation
        !           586: in-line (with no procedure call) and to reserve a register \verb"fr" to point
        !           587: to the beginning of the free region.  Then it becomes very simple
        !           588: to allocate a new object:  \verb"A = CONS(B,C)" is implement as
        !           589: \begin{enumerate}
        !           590: \item mem[fr+2] := C
        !           591: \item mem[fr+1] := B
        !           592: \item mem[fr]   := descriptor(2,tag\_record)
        !           593: \item A := fr+1
        !           594: \item fr := fr+3
        !           595: \end{enumerate}
        !           596: Each of these lines is one machine instruction.  Thus, in five instructions
        !           597: (plus one instruction of amortized garbage collection overhead)
        !           598: we have made a new {\em cons} cell; it takes only twice as long to create
        !           599: a data structure as it does to read it!  
        !           600: This fast allocation encourages a simple and clean programming style;
        !           601: no longer do programmers have to stand on their heads to avoid {\em cons}ing.
        !           602: 
        !           603: Of course, this won't work if the free region is exhausted.  We can insert
        !           604: an explicit test to make sure that \verb"fr" is not near the end of the
        !           605: free region.  But a more clever trick is to make the virtual memory page
        !           606: at the end of the free region inaccessible, so that we will get a page
        !           607: fault\index{fault}\index{page fault}
        !           608: when the free region is exhausted.  That's why we store the
        !           609: last field (\verb"mem[fr+2]") first in the example above; it's simpler
        !           610: for the garbage collector if the fault occurs at the very beginning
        !           611: of creating the new cell.
        !           612: (On the MC68020, the state of the machine at a page fault is complicated,
        !           613: and it's not easy to restart the faulting instruction; so on that machine
        !           614: we use an explicit comparison with a limit register.)
        !           615: 
        !           616: The page fault will be caught by the hardware and handed to the operating
        !           617: system, which can then pass control to the user process.  The user process
        !           618: then has to find all the registers of the faulting procedure; these
        !           619: registers are the roots of the accessible data.  Appel\cite{appel89:sggc}
        !           620: and Cormack\cite{cormack88} both describe schemes for finding these
        !           621: registers; Cormack's is simpler and more reliable.
        !           622: 
        !           623: The ML program and the garbage collector behave like two processes
        !           624: (threads\index{thread})
        !           625: running in the same address space: while one thread executes, the
        !           626: other's registers are saved.  When ML suspends itself (either
        !           627: because of a page fault indicating end of free space, or voluntarily),
        !           628: it saves its registers into a static area that looks just like a process
        !           629: control block, so that the garbage collector (and other parts of the
        !           630: runtime system) can access them.
        !           631: 
        !           632: \begin{quotation}
        !           633: {\small Here's how Cormack's scheme works:  A page fault arrives, causing
        !           634: the operating system to invoke the C function \verb"ghandle".
        !           635: \index{ghandle}
        !           636: This function is passed (as an argument) a structure containing
        !           637: the address of the faulting instruction.  Other saved registers are
        !           638: at undocumented locations on the stack.  \verb"ghandle" saves the
        !           639: \index{saved_pc}\index{pc}
        !           640: faulting pc in \verb"saved_pc", and modifies its argument structure
        !           641: to point to the assembly-language function \verb"saveregs".  Then it
        !           642: returns; the operating system restores the pc from the argument
        !           643: structure, restores other registers (from those undocumented locations),
        !           644: and resumes.  But of course, we have fooled the operating system into
        !           645: \index{saveregs}
        !           646: resuming in \verb"saveregs", which stores all registers into known
        !           647: global variables and returns to the C thread (i.e. in the function
        !           648: \verb"runML").
        !           649: 
        !           650: The C thread (typically) does garbage collection, then calls the
        !           651: assembly-language function \verb"restoreregs", which loads registers
        !           652: (and program counter) from their global variables and resumes execution
        !           653: of the ML thread.
        !           654: 
        !           655: Sometimes it's useful to invoke the C thread without a page fault,
        !           656: e.g. to export the process state into a file or to do a structured
        !           657: write.  In this case, \verb"saveregs" can be called directly from assembly
        !           658: code in the ML thread.
        !           659: }
        !           660: \end{quotation}
        !           661: \section{Heap allocation of procedure call frames}
        !           662: \label{heap}
        !           663: 
        !           664: Since heap allocation is so cheap, we put procedure call frames on the
        !           665: heap instead of on the stack\index{stack}.
        !           666: This has many advantages\cite{appel89:cps},
        !           667: but the one of interest here is that it greatly simplifies the runtime
        !           668: system.  Many of the facilities described in this paper
        !           669: would be much more complicated to implement if runtime stacks had to be
        !           670: dealt with.
        !           671: 
        !           672: Any record on the stack must be initialized as it is created;
        !           673: otherwise it will contain garbage data that could be interpreted
        !           674: by the garbage collector as spurious pointers.  However, typical
        !           675: code generators will allocate a call frame on
        !           676: entry to a procedure and spill registers into it as needed.
        !           677: This must be avoided.  One way to solve
        !           678: this problem is to delay allocation of the frame; values can be accumulated
        !           679: in registers until spilling is necessary, then all the registers can 
        !           680: be spilled at once into a newly-created frame, which is just a record
        !           681: object in runtime data format.  Thus, the frame is completely initialized
        !           682: as it is created.
        !           683: 
        !           684: The Standard ML of New Jersey compiler doesn't use ``procedure call frames.''
        !           685: Since it uses continuation\index{continuation}-passing 
        !           686: style\cite{steele78}\cite{kranz86}\cite{appel89:cps},
        !           687: what an ordinary mortal might call a ``frame'' is really just the
        !           688: closure of a continuation.  It is easy to create closures as ordinary
        !           689: records that are completely initialized when they are created.
        !           690: This simplifies both the code generator
        !           691: and the runtime system, though the trick described in the previous paragraph
        !           692: will work for more conventional code generators.
        !           693: 
        !           694: Since there's no runtime stack, the 
        !           695: {\em call-with-current-continuation}\cite{rees86}
        !           696: \index{call-with-current-continuation}
        !           697: primitive can be implemented very efficiently.  In implementations
        !           698: with a runtime stack, the entire stack must be copied when {\em call/cc}
        !           699: is evaluated (or else there must be a lot of extra complexity in stack
        !           700: management); without a stack, the execution of
        !           701: {\em call/cc}, and the execution of
        !           702: saved continuations, take just a few instructions each.  This makes
        !           703: {\em call/cc} a practical programming tool, just as fast allocation
        !           704: makes {\em cons} practical.
        !           705: 
        !           706: \section{Representing ML structures in records and strings}
        !           707: \label{representing}
        !           708: Section~\ref{data} describes just two kinds of 
        !           709: objects---records\index{record} and
        !           710: strings\index{string}---referenceable at their beginning and (in a limited way)
        !           711: at interior points.  All of the kinds of ML data can be represented
        !           712: in records and strings.
        !           713: 
        !           714: An ML value must be representable in one word.  A larger value can be {\em
        !           715: boxed} by putting it in several words on the heap and keeping a (one-word)
        !           716: pointer to it.  A small value (like a 31-bit integer)
        !           717: can be kept {\em unboxed\index{unboxed values}} by representing it without indirection in
        !           718: a machine word.
        !           719: Boxed\index{boxed values}
        !           720: values (pointers)
        !           721: are distinguished from unboxed values (integers, or data represented as
        !           722: integers)
        !           723: by their low-order bit.
        !           724: 
        !           725: An ML {\bf record\index{record}}
        !           726: is an $n$-tuple of values.  It has a natural representation
        !           727: as a record in our runtime data format.  ML has a {\it pro forma}
        !           728: record of length 0; as our runtime data format does not allow objects of length
        !           729: 0 (since that would leave no room to put a forwarding pointer, as described in
        !           730: a section~\ref{forwarding}),
        !           731: the empty record
        !           732: is represented as the unboxed integer 0.  This does no harm, since
        !           733: no fields can be selected from an empty record anyway.
        !           734: 
        !           735: An ML {\bf array\index{array}} is also an $n$-tuple of values.
        !           736: In the ML language, 
        !           737: record-field offsets are determined at compile-time, whereas
        !           738: arrays may be indexed by runtime values;
        !           739: and arrays may be modified after
        !           740: they are created, whereas records are immutable.
        !           741: But neither of these
        !           742: differences matters to the runtime system.
        !           743: The polymorphic equality function of ML requires distinguishing between
        !           744: mutable and immutable objects at runtime, so arrays and records
        !           745: have different tags, however; this requires that they have different
        !           746: tags.
        !           747: 
        !           748: ML does permit 
        !           749: arrays\index{arrays of length 0} of length 0,
        !           750: but our runtime data format does not.
        !           751: Happily, all arrays of length 0 have
        !           752: the same behaviour, so a special object of length 0 (located outside
        !           753: the garbage-collectible region) serves to represent all the
        !           754: empty arrays.
        !           755: 
        !           756: {\bf References} in ML are mutable objects:  
        !           757: \verb"val a = ref 5" declares a reference\index{reference variables}
        !           758: variable \verb"a" that may be
        !           759: changed by a later assignment statement, unlike most variables which cannot
        !           760: be modified once defined.  Reference variables behave just like single-element
        !           761: arrays, and that's how they are represented in the runtime system.
        !           762: 
        !           763: ML has {\bf datatype\index{datatype}}
        !           764: declarations to allow tagged variant types.  For example,
        !           765: the declaration
        !           766: \begin{verbatim}
        !           767: datatype t = NAME of string | NUMBER of int
        !           768: \end{verbatim}
        !           769: specifies a type \verb"t" that can be either a string or an integer,
        !           770: depending on whether the constructor\index{constructor} \verb"NAME"
        !           771: or \verb"NUMBER" is used to create it.
        !           772: The program can examine any object of type \verb"t" and determine whether
        !           773: it has the \verb"NAME" or \verb"NUMBER" representation---that is, it
        !           774: is a {\em tagged} union.  An object of type \verb"t" is represented
        !           775: as a two-element record: one field contains the tag (a small integer),
        !           776: and the other field contains the value.
        !           777: The details of constructor representation may be of interest
        !           778: only to those knowledgeable in ML, and are described in 
        !           779: section~\ref{constructor}.
        !           780: 
        !           781: ML has {\bf strings\index{string}}
        !           782: of characters.  Strings can be of any non-negative length.
        !           783: Since our runtime representation cannot support objects of length 0
        !           784: (because one word is needed to store the forwarding pointer, as described
        !           785: in a section~\ref{forwarding}),
        !           786: the empty string\index{empty string}\index{null string}
        !           787: \index{string of length 0}
        !           788: must be treated specially.  However, it turns out that
        !           789: it is never necessary to create a new empty string; the only occurrences
        !           790: of empty strings are as literals in the program text.  String
        !           791: literals (and back-pointers) will be discussed in section~\ref{function}.
        !           792: 
        !           793: Strings\index{Strings of length one} of length one are treated specially by the compiler,
        !           794: though this is not necessary either for the ML language or for our runtime
        !           795: data format.  Single-character strings\index{single-character strings}
        !           796: \index{characters}
        !           797: are treated as unboxed integers
        !           798: between 0 and 255, to avoid heap-allocation for this (frequent) special case.
        !           799: Though this technique may cause
        !           800: less allocation, it requires special tests on every
        !           801: string operation.  It's not clear whether this special case saves more than
        !           802: it costs.  However, this special case is transparent to the runtime system
        !           803: anyway; all conversions, etc. are handled explicitly by the compiler;
        !           804: the single-character strings are not a new runtime data format, but
        !           805: look like ordinary unboxed integers.
        !           806: 
        !           807: {\bf Byte-arrays\index{byte-arrays}}
        !           808: in ML are to strings as arrays are to records:  they have the
        !           809: same representation as strings, but their tag is different to facilitate
        !           810: certain language features.
        !           811: 
        !           812: {\bf Floating-point\index{floating-point}} numbers are too large to fit in 
        !           813: one word.  Even if we chose
        !           814: to use single-precision floating point, it would be difficult to store
        !           815: them unboxed\index{unboxed}
        !           816: because there is no bit available for use as a tag\index{tag} bit.
        !           817: Thus, all floating-point numbers are stored boxed, with 8 bytes of data and
        !           818: one word of descriptor.
        !           819: The descriptor\index{descriptor}
        !           820: must be one that indicates that the contents of the
        !           821: object contain no pointers.
        !           822: Since ML is statically typed, the language
        !           823: never needs to distinguish (at runtime) floating-point numbers from
        !           824: strings, so we can just represent a double-precision float as an 8-byte
        !           825: string.  This is not an ASCII string, it is the hardware representation;
        !           826: a ``string'' descriptor does not denote that the contents are printable
        !           827: characters, just that there is an arbitrary bit-pattern not containing
        !           828: pointers.
        !           829: 
        !           830: Similarly, {\bf machine-code procedures}
        !           831: \index{machine code} do not require a separate class
        !           832: of descriptor.  From the garbage collector's point of view, machine
        !           833: code is just like strings: it contains bits that are to be (perhaps)
        !           834: moved from place to place but which never represent pointers.
        !           835: Thus, machine code is just kept in string objects.  The mechanism by
        !           836: which we avoid placing pointers\index{pointers}
        !           837: (or any reference to other objects)
        !           838: in machine code will be described in section~\ref{modules}.
        !           839: 
        !           840: \section{Closures}
        !           841: \label{closures}
        !           842: \index{closure}
        !           843: In ML, as in many lexically scoped languages, functions may have free
        !           844: variables.  That is, the body of a function may reference not only its
        !           845: own formal parameters\index{formal parameters}
        !           846: (the bound variables of the function), but the
        !           847: formal parameters of a statically enclosing function.  Consider the
        !           848: function 
        !           849: \begin{verbatim}
        !           850:     fun add(x) = let fun add1(y) = x+y
        !           851:                   in add1
        !           852:                  end
        !           853: \end{verbatim}
        !           854: The function \verb"add(x)" defines an internal function \verb"add1(y)" that
        !           855: adds \verb"x" and \verb"y", and then \verb"add" returns \verb"add1" as its
        !           856: result.  The variable \verb"x" is bound by \verb"add", and
        !           857: \verb"y" is bound by \verb"add1".  Therefore, if we just consider the
        !           858: function definition \verb"fun add1(y) = x+y", the variable \verb"x"
        !           859: is a free variable\index{free variable} of this internal function.
        !           860: 
        !           861: In order to evaluate \verb"add1" applied to some argument, we must have
        !           862: a context in which the value \verb"x" is defined.  It will not suffice
        !           863: for the \verb"add" function just to return a pointer to the machine code
        !           864: of \verb"add1" as its result.  Instead, it must return a {\em closure}:
        !           865: a combination of a code-pointer an some information about the values of
        !           866: all the free variables (in this case, just \verb"x").
        !           867: 
        !           868: For example, the result of evaluating \verb"add(5)" is (abstractly) the
        !           869: function which will always add 5 to its argument.  
        !           870: Figure~\ref{add5} shows a possible representation
        !           871: as a record object, with a descriptor at  the beginning for the
        !           872: garbage collector.
        !           873: \begin{figure}[htbp]
        !           874: \label{add5}
        !           875: \begin{verbatim}
        !           876:      +------+
        !           877:      | desc |
        !           878:      +------+
        !           879: p--> |  o------------> machine code for add1
        !           880:      +------+
        !           881:      |  5   |
        !           882:      +------+
        !           883: \end{verbatim}
        !           884: \caption{A simple closure.}
        !           885: \end{figure}
        !           886: The pointer \verb"p" is the runtime representation of the particular version
        !           887: of \verb"add1" in which \verb|x| is bound to 5.  This pointer may be passed
        !           888: as an argument to some other function that doesn't know anything about
        !           889: the \verb"add1" function.  All the other function has to know is how
        !           890: to ``call'' a closure.  The calling sequence might be:
        !           891: \begin{enumerate}
        !           892: \item Put \verb|p| in register 0.
        !           893: \item Put the return address in register 1.
        !           894: \item Fetch p[0] into register 2.
        !           895: \item Jump to the address contained in register 2.
        !           896: \end{enumerate}
        !           897: Step 1 is necessary so that when \verb"add1" is entered, it can gain
        !           898: access to the closure record where the binding of \verb"x" is stored.
        !           899: Step 2 is just the normal saving of a return address, which might
        !           900: in many cases be on a stack instead of in a register.  Steps 3 and
        !           901: 4 are just the typical code necessary to jump to a runtime-bound address.
        !           902: 
        !           903: It is important to note that the function that calls a closure \verb"p"
        !           904: doesn't need to access any part of it except the machine-code pointer
        !           905: in \verb|p[0]|.  This means that the free variables (the rest of the
        !           906: closure) may have an arbitrary arrangement, known only to the
        !           907: function that builds the closure (\verb"add", in this case) and the
        !           908: function that executes it (\verb"add1", in this case).  Closures can
        !           909: be structured in many ways\cite{appel88:clo}, and can even point
        !           910: to other closures for access to free variables (i.e., they can
        !           911: have ``static links\index{static link},'' in Algol terminology).
        !           912: 
        !           913: Clearly, closures are represented a lot like records.  A closure has several
        !           914: fields, each of which is either a machine-code pointer or a free variable.
        !           915: The machine-code pointer is just a (boxed) string pointer,
        !           916: and the free variables are
        !           917: just one-word ML objects.  Therefore, we use record\index{record}
        !           918: objects to represent closures.
        !           919: 
        !           920: When several functions share a set of free variables, it is convenient
        !           921: to have a {\bf multiple-function closure.}
        !           922: \index{closure for multiple functions}
        !           923: \index{multiple functions closure}
        !           924: \index{shared closure}
        !           925: Consider the functions
        !           926: \begin{verbatim}
        !           927: fun test(y,z) =
        !           928:     let fun even(i) = if i=0 then y else odd(i-1)
        !           929:         and odd(i)  = if i=0 then z else even(i-1)
        !           930:      in (even,odd)
        !           931:     end
        !           932: \end{verbatim}
        !           933: This function, when applied to arguments \verb"(3,7)", would return
        !           934: two functions (let's call them \verb"even37" and \verb"odd37").
        !           935: from integer to integer.  \verb"Even37" applied to even integers
        !           936: returns 3 and applied to odd integers returns 7, and \verb"odd37"
        !           937: works the other way around.
        !           938: 
        !           939: We could represent these functions by two different closures, as shown
        !           940: in figure~\ref{twoclos}.
        !           941: \begin{figure}[htbp]
        !           942: \label{twoclos}
        !           943: \begin{verbatim}
        !           944: 
        !           945:            +------+
        !           946:            | desc |
        !           947:            +------+
        !           948: even37 --> |  o------------> machine code for even
        !           949:            +------+
        !           950:          y |  3   |
        !           951:            +------+
        !           952:        odd |  o-------> odd37
        !           953:            +------+
        !           954: 
        !           955:            +------+
        !           956:            | desc |
        !           957:            +------+
        !           958:  odd37 --> |  o------------> machine code for odd
        !           959:            +------+
        !           960:          z |  7   |
        !           961:            +------+
        !           962:       even |  o--------> even37
        !           963:            +------+
        !           964: \end{verbatim}
        !           965: \caption{Two mutually recursive closures}
        !           966: \end{figure}
        !           967: This is particularly unfortunate because \verb"even" and \verb"odd"
        !           968: are free variables\index{free variables}
        !           969: of each other, and this requires the construction
        !           970: of a cyclic data structure of closures, which can get complicated.
        !           971: 
        !           972: A more clever trick\cite{appel87:sml}\cite{kranz86} is to let these
        !           973: two functions share a closure, as shown in figure~\ref{oneclos}.
        !           974: \begin{figure}[htbp]
        !           975: \label{oneclos}
        !           976: \begin{verbatim}
        !           977:            +------+
        !           978:            | desc |
        !           979:            +------+
        !           980: even37 --> |  o------------> machine code for even
        !           981:            +------+
        !           982:  odd37 --> |  o------------> machine code for odd
        !           983:            +------+
        !           984:          y |  3   |
        !           985:            +------+
        !           986:          z |  7   |
        !           987:            +------+
        !           988: \end{verbatim}
        !           989: \caption{Two functions sharing a closure}
        !           990: \end{figure}
        !           991: Now, when \verb"even37" is called, its closure-pointer will be loaded
        !           992: in register 0, and it knows that it can access \verb:y: at offset 2 from
        !           993: register 0.  And when \verb"odd37" is called, its closure pointer
        !           994: (which is really a pointer to the second field of the record) will be
        !           995: loaded in register 0, and it can access \verb"z" at offset 2.
        !           996: In some situations, the same free variable may be accessed by different
        !           997: functions in the closure, and the code generator will have to remember
        !           998: that the offsets\index{offset} from register 0 to a particular field
        !           999: depend on which closure-pointer is loaded (i.e. which function we are
        !          1000: generating code for).
        !          1001: 
        !          1002: Finally, if \verb"even" wants to call \verb"odd", it can just generate
        !          1003: the appropriate closure value in register 0 by adding an offset of 1
        !          1004: to its own closure pointer.  In general, all mutually recursive
        !          1005: functions can be handled this way, and cyclic\index{cyclic closure}
        !          1006: closure structures
        !          1007: are never required.  But even non-mutually-recursive sets of functions
        !          1008: can save storage by sharing closures.
        !          1009: 
        !          1010: It should now be clear why (in section~\ref{data}) we arranged for
        !          1011: pointers into the middle of record objects.  Given the pointer
        !          1012: \verb"odd37", the garbage collector can easily search backward until
        !          1013: it finds the descriptor\index{descriptor}
        !          1014: of the record; the descriptor is unboxed,
        !          1015: while all of the previous machine-code pointers are boxed fields.
        !          1016: 
        !          1017: \section{Function entry points}
        !          1018: \label{function}
        !          1019: 
        !          1020: A typical compilation\index{compilation unit} unit
        !          1021: will contain many functions, and it would be
        !          1022: unwieldy to make a different string object for each function:
        !          1023: since the functions in a compilation unit often call each other,
        !          1024: these calls will go much faster as pc-relative jumps than if
        !          1025: we had to fetch addresses from closures.  But to achieve pc-relative
        !          1026: jumps, the relative distance between two functions cannot change;
        !          1027: and the garbage collector moves objects around.
        !          1028: So to achieve
        !          1029: this constant distance, we must put several functions in the
        !          1030: same string object, and have several
        !          1031: entry points.  
        !          1032: 
        !          1033: For example, the \verb"even", \verb"odd", and \verb"test"
        !          1034: functions of section~\ref{closures} could all be placed a single string,
        !          1035: \index{embedded string}
        !          1036: with appropriate back-pointer descriptors (figure~\ref{embedded1}).
        !          1037: \begin{figure}[htbp]
        !          1038: \label{embedded1}
        !          1039: \begin{verbatim}
        !          1040:            +------+
        !          1041:            | desc |
        !          1042:            +------+
        !          1043:   test --> |  t   |
        !          1044:            +  e   +
        !          1045:            |  s   |
        !          1046:            +  t   +
        !          1047:            |      |
        !          1048:            +------+
        !          1049:            | [5]  |
        !          1050:            +------+
        !          1051:   even --> |  e   |
        !          1052:            +  v   +
        !          1053:            |  e   |
        !          1054:            +  n   +
        !          1055:            |      |
        !          1056:            +------+
        !          1057:            | [10] |
        !          1058:            +------+
        !          1059:    odd --> |  o   |
        !          1060:            +  d   +
        !          1061:            |  d   |
        !          1062:            +------+
        !          1063: \end{verbatim}
        !          1064: \caption{Three code strings embedded in a string}
        !          1065: \end{figure}
        !          1066: The numbers \verb"[5]" and \verb"[10]" are back-pointers, formatted
        !          1067: as described in section~\ref{data}.  They tell the garbage collector
        !          1068: the offset to the beginning of the object, so it can find the true descriptor.
        !          1069: 
        !          1070: The advantage to putting several functions in the same string is that
        !          1071: they can refer to each other directly, without having to access each
        !          1072: other through closures.  Of course, it will be necessary to adjust
        !          1073: the closure\index{closure}
        !          1074: pointer in register 0.  Here, \verb"even" could call \verb"odd"
        !          1075: by this (simpler) calling sequence:
        !          1076: \begin{enumerate}
        !          1077: \item Add 1 to register 0  (to adjust p).
        !          1078: \item Put the return address in register 1.
        !          1079: \item Jump to \verb"odd".
        !          1080: \end{enumerate}
        !          1081: 
        !          1082: The garbage collector may move a code string from one place to another
        !          1083: in memory, so it is necessary that the jump in step 3 be a 
        !          1084: pc\index{pc}-relative
        !          1085: jump.
        !          1086: 
        !          1087: ML programs may contain literal\index{literal}
        !          1088: \index{string literal}
        !          1089: strings and floating\index{floating point constant} point constants.
        !          1090: These are embedded within string objects amongst the machine-code
        !          1091: functions.  For example,
        !          1092: \begin{verbatim}
        !          1093: fun show(b) = if b then "true" else "false"
        !          1094: \end{verbatim}
        !          1095: could almost have the representation shown in figure~\ref{embedded2}.
        !          1096: \begin{figure}[htbp]
        !          1097: \label{embedded2}.
        !          1098: \begin{verbatim}
        !          1099:            +------+
        !          1100:            | desc |
        !          1101:            +------+
        !          1102:   show --> |  s   |
        !          1103:            +  h   +
        !          1104:            |  o   |
        !          1105:            +  w   +
        !          1106:            |      |
        !          1107:            +------+
        !          1108:            | [5]  |
        !          1109:            +------+
        !          1110:   true --> | true |
        !          1111:            +------+
        !          1112:            | [7]  |
        !          1113:            +------+
        !          1114:  false --> | fals |
        !          1115:            +      +
        !          1116:            | e000 |
        !          1117:            +------+
        !          1118: \end{verbatim}
        !          1119: \caption{String literals, oversimplified}
        !          1120: \end{figure}
        !          1121: The function \verb"show" returns a pointer to \verb"true" or \verb"false";
        !          1122: this pointer points into the middle of the large string object at one
        !          1123: of the embedded strings.  Because the embedded string is preceded by a
        !          1124: back-pointer descriptor, the garbage collector won't be confused.
        !          1125: 
        !          1126: There's a slight problem with this.  Although the garbage collector doesn't
        !          1127: need to know the length of the string \verb|"true"|, the ML program might
        !          1128: apply the \verb"length" function to it.  
        !          1129: String\index{string length} lengths are kept in the
        !          1130: descriptor word, and are accessed by fetching field $-1$ and shifting
        !          1131: off the tag bits.  The problem here is that the descriptor for \verb"true"
        !          1132: is a back-pointer\index{back-pointer},
        !          1133: not a string descriptor.  The solution---simple and
        !          1134: inelegant---is to introduce a new kind of descriptor, an {\em embedded
        !          1135: string}, which contains the length of the string and always is preceded
        !          1136: by a backpointer.  Thus, the actual representation of the \verb"show"
        !          1137: function is more like figure~\ref{embedded3}.
        !          1138: \begin{figure}[htbp]
        !          1139: \label{embedded3}
        !          1140: \begin{verbatim}
        !          1141:            +------+
        !          1142:            | desc |
        !          1143:            +------+
        !          1144:   show --> |  s   |
        !          1145:            +  h   +
        !          1146:            |  o   |
        !          1147:            +  w   +
        !          1148:            |      |
        !          1149:            +------+
        !          1150:            | [6]  |
        !          1151:            +------+
        !          1152:            | 4emb |
        !          1153:            +------+
        !          1154:   true --> | true |
        !          1155:            +------+
        !          1156:            | [9]  |
        !          1157:            +------+
        !          1158:            | 5emb |
        !          1159:            +------+
        !          1160:  false --> | fals |
        !          1161:            +      +
        !          1162:            | e000 |
        !          1163:            +------+
        !          1164: \end{verbatim}
        !          1165: \caption{String literals embedded in a code string}
        !          1166: \end{figure}
        !          1167: 
        !          1168: The treatment of embedded \index{floating point constant}
        !          1169: floating point constants is just like that of
        !          1170: embedded strings.  However, the ML program will never need to take the
        !          1171: length of a floating point constant, so the embedded-string descriptor
        !          1172: can be omitted and a simple backpointer\index{backpointer} will suffice.
        !          1173: Note that embedded strings of length 0 are permitted, since there
        !          1174: is no need for a word to install a forwarding pointer (see 
        !          1175: section~\ref{forwarding},
        !          1176: and note that forwarding pointers go only at the beginning of the
        !          1177: entire string object, not at embedded strings).
        !          1178: 
        !          1179: The address of string and floating constants may appear in machine code,
        !          1180: but only in a pc\index{pc}-relative way, since code objects may be moved in memory
        !          1181: by the garbage collector\index{garbage collector}.
        !          1182: 
        !          1183: \section{Modules and compilation units}
        !          1184: \label{modules}
        !          1185: 
        !          1186: Standard ML has a powerful module\index{module}
        !          1187: system, allowing nested modules,
        !          1188: modules as parameters to other modules, thinning of a richer module
        !          1189: to form a smaller module, etc.  The module system can be represented
        !          1190: entirely as records and functions in the runtime system\cite{appel87:sml};
        !          1191: no new machinery is needed.
        !          1192: 
        !          1193: A compilation unit\index{compilation unit}
        !          1194: is just a sequence of function and value declarations that
        !          1195: will be compiled together into one string object.  But one compilation
        !          1196: unit may refer to values defined in another, and a linkage\index{linkage}
        !          1197: mechanism is required.
        !          1198: 
        !          1199: We can use the power of higher-order functions to implement the linkage
        !          1200: in a way that is transparent to the runtime system.  If unit \verb"B"
        !          1201: refers to a value in unit \verb"A", then the compiler will treat
        !          1202: \verb"A" as an implicit parameter of \verb"B".  For example, suppose we have
        !          1203: two compilation units
        !          1204: \begin{verbatim}
        !          1205: fun f(x,y) = x+y
        !          1206: \end{verbatim}
        !          1207: and
        !          1208: \begin{verbatim}
        !          1209: fun g(z) = f(z,z)
        !          1210: \end{verbatim}
        !          1211: clearly \verb"f" is a free variable\index{free variable}
        !          1212: of \verb"g".  But the compiler can
        !          1213: parametrize the second unit, as if it were written
        !          1214: \begin{verbatim}
        !          1215: fun g0(f) = let fun g(z) = f(z,z)
        !          1216:             in g
        !          1217:            end
        !          1218: \end{verbatim}
        !          1219: This compilation unit has no free variables; the ML system will simply
        !          1220: apply the second unit to the first one, yielding the desired function
        !          1221: \verb"g".  The compiler must keep track of inter-module references
        !          1222: in order to do this, but at least the runtime mechanism for linkage is simple.
        !          1223: 
        !          1224: So, each compilation unit is a closed function with no free variables.
        !          1225: No special linkage mechanism is built into the runtime system; the
        !          1226: closure\index{closure}
        !          1227: mechanism handles linkage very elegantly.
        !          1228: 
        !          1229: Eliminating free variables from compilation units also simplifies the
        !          1230: code generator\index{code generator}.
        !          1231: Whenever the code generator must analyze local free variables,
        !          1232: generate pc\index{pc}-relative references, etc.,
        !          1233: its job is much simpler because there
        !          1234: are no global references to other objects.  The interface between the
        !          1235: front end of the compiler and the code generator are much cleaner as a result.
        !          1236: 
        !          1237: \section{Linkage to assembly language}
        !          1238: \label{linkage}
        !          1239: 
        !          1240: Some primitives of a programming language are best implemented in a different
        !          1241: language (typically assembly language\index{assembly language}).
        !          1242: Standard ML of New Jersey
        !          1243: makes use of 8 functions written in assembly language, and 8 functions written
        !          1244: in C\index{C}.
        !          1245: (For comparison, more than 100 standard-library\index{standard library}
        !          1246: functions are written
        !          1247: in ML.)  The assembly-language functions are:
        !          1248: \begin{enumerate}
        !          1249: \item {\bf array\index{array}(n,x)} creates an array of length $n$, each element initialized
        !          1250: to $x$.
        !          1251: \item {\bf callc\index{callc}(f,a)} calls a C-language function $f$ with argument $a$.
        !          1252: \item {\bf create\_b(n)} creates\index{create\_b} a byte-array of length $n$.
        !          1253: \item {\bf create\_s(n)} creates\index{create\_s} a string of length $n$.
        !          1254: \item {\bf floor\index{floor}(x)} returns the smallest integer less than
        !          1255: or equal to $x$.
        !          1256: \item {\bf logb\index{logb}(x)} returns the exponent part of (the floating-point) $x$.
        !          1257: \item {\bf scalb\index{scalb}(x)} inserts a new exponent into (floating-point) $x$.
        !          1258: \item {\bf syscall\index{syscall}(i,args,k)} does operating system
        !          1259: \index{operating system} kernel-call $i$ with
        !          1260: $k$ arguments.
        !          1261: \end{enumerate}
        !          1262: The assembly-language functions are located at (constant) addresses within
        !          1263: the runtime system.  But it's a good idea to follow the rules about
        !          1264: free variables\index{free variables}
        !          1265: in module-linkages\index{linkage}: even the references to assembly-language
        !          1266: primitives should not be ``hard-wired'' addresses in code objects.
        !          1267: If this were done, then all ML code would have to be re-compiled whenever
        !          1268: the runtime system was re-compiled.
        !          1269: 
        !          1270: Instead, we just treat the assembly-language functions as a special
        !          1271: record object containing nine elements.  Other modules are parametrized
        !          1272: by this \verb"Assembly" module just as if it were an ordinary one.
        !          1273: All the assembly-language functions follow the ML calling conventions.
        !          1274: 
        !          1275: Most of the C functions are to provide access to system calls with
        !          1276: hard-to-manage interfaces, like \verb"fork", etc.
        !          1277: There is an added complication that the C and ML calling conventions
        !          1278: don't match.
        !          1279: The assembly-language function \verb"callc" arranges the arguments
        !          1280: for an ML function to call a C function.  The details are uninteresting,
        !          1281: but the point is that absolute references are again avoided.
        !          1282: 
        !          1283: \begin{quotation}
        !          1284: {\small The file \verb"boot/assembly.sig" contains the signature
        !          1285: \verb"Assembly", approximately as in figure~\ref{assembly}.
        !          1286: \begin{figure}
        !          1287: \label{assembly}
        !          1288: \begin{verbatim}
        !          1289: signature ASSEMBLY =
        !          1290:   sig
        !          1291:       datatype datalist = DATANIL | DATACONS of (string * string * datalist)
        !          1292:       type func
        !          1293:       datatype funclist = FUNC of (func * string * funclist)
        !          1294:       type object
        !          1295:       structure A : sig
        !          1296:         val array : int * 'a -> 'a array
        !          1297:         val callc : 'b (* func*)  * 'a -> 'c
        !          1298:         val create_b : int -> string
        !          1299:         val create_s : int -> string
        !          1300:         val floor : real -> int
        !          1301:         val logb : real -> int
        !          1302:         val scalb : real * int -> real
        !          1303:         val syscall : int * string list * int -> int
        !          1304:       end
        !          1305:       exception Div
        !          1306:       exception Float of string
        !          1307:       exception Interrupt
        !          1308:       exception Overflow
        !          1309:       exception SystemCall of string
        !          1310:       exception UnboundTable
        !          1311:       val array0 : 'a array
        !          1312:       val bytearray0 : string
        !          1313:       val collected : int ref
        !          1314:       val collectedfrom : int ref
        !          1315:       val current : string ref
        !          1316:       val datalist : datalist
        !          1317:       val external : funclist
        !          1318:       val gcmessages : int ref
        !          1319:       val gcprof : string ref
        !          1320:       val majorcollections : int ref
        !          1321:       val minorcollections : int ref
        !          1322:       val opsys : int   (* 1 = vax bsd ultrix, 4.2, 4.3
        !          1323:                            2 = sunos 3.0, 4.0 
        !          1324:                            3 = vax v9 (bell labs) *)
        !          1325:       val pstruct : object ref
        !          1326:       val ratio : int ref
        !          1327: end
        !          1328: \end{verbatim}
        !          1329: \caption{The {\tt ASSEMBLY} signature}
        !          1330: \end{figure}
        !          1331: \index{Assembly structure}
        !          1332: The substructure \verb"A" is the machine-dependent part (implemented
        !          1333: in assembly language), and the other components (exceptions, constants,
        !          1334: etc.) are just data structures that can be described machine-independently
        !          1335: in C.  
        !          1336: \verb"A" is arranged as an ML record 
        !          1337: at the label \verb"runvec"\index{runvec}
        !          1338: in the file \verb"VAX.prim.s", \verb"M68.prim.s",
        !          1339: etc.  The machine-independent part
        !          1340: is in \verb"cstruct.c"\index{cstruct.c}.
        !          1341: 
        !          1342: The exceptions in this structure are just those that can be raised
        !          1343: from C or assembly language.  Exceptions as elaborated by the compiler
        !          1344: look like references to strings, and in \verb"cstruct.c" are implemented
        !          1345: as initialized structures to mimic this
        !          1346: with all the appropriate descriptors.
        !          1347: 
        !          1348: The values \verb"array0" and \verb"bytearray0" are just the array and
        !          1349: \index{array0}\index{array of length 0}
        !          1350: \index{bytearray0}
        !          1351: byte-array of length 0; this is necessary because objects of length 0
        !          1352: are not permitted in the garbage-collectible region.
        !          1353: 
        !          1354: \index{collected}\index{collectedfrom}
        !          1355: \index{majorcollections}\index{minorcollections}\index{ratio}
        !          1356: The references \verb"collected", \verb"collectedfrom",
        !          1357: \verb"majorcollections", and \verb"minorcollections" are updated by
        !          1358: the garbage collector with performance information.  The \verb"ratio"
        !          1359: variable can be set from ML to tell the garbage collector what
        !          1360: ratio to maintain between heap size and amount of live data.
        !          1361: 
        !          1362: \index{current}\index{profiling}
        !          1363: The variable \verb"current" is used in execution profiling\cite{appel88:prof};
        !          1364: it is a pointer
        !          1365: to  an array of length 2 of ML integers.  When profiling is enabled,
        !          1366: a timer interrupt will periodically
        !          1367: cause current[1] to be incremented.
        !          1368: 
        !          1369: \index{pstruct}
        !          1370: \index{loader}
        !          1371: \index{Initial}
        !          1372: The \verb"pstruct" is a pointer to the \verb"Initial" (pervasive)
        !          1373: structure.  The ML loader (\verb"boot/loader.sml") needs this
        !          1374: because other modules may reference the structure \verb"Initial".
        !          1375: 
        !          1376: \index{datalist}
        !          1377: The \verb"datalist" is used for linkage to sharable compiled ML code.
        !          1378: In vanilla Unix, such code must be link-loaded into the text segment
        !          1379: of the sml executable file, and the \verb"datalist" provides the ML
        !          1380: loader a way to get to it.  Each element of the data list contains
        !          1381: two strings (the name of the module, and the executable code), along
        !          1382: with a link to the next element.
        !          1383: 
        !          1384: \index{funclist}
        !          1385: The \verb"funclist" is similarly used for linkage to functions implemented
        !          1386: in C.  These functions could be described in the Assembly signature
        !          1387: but this requires recompilation whenever a new function is added.
        !          1388: The funclist is terminated by a function whose name is "xxxx".  All functions
        !          1389: in the list must have a name of exactly 4 characters.
        !          1390: All C-functions must take exactly one argument in ML format and return
        !          1391: a result in ML format.  The functions implemented in C at this writing are:
        !          1392: \begin{itemize}
        !          1393: \item{\bf fion\index{fion}} tells how many characters may be read from an open
        !          1394: file-descriptor without blocking.
        !          1395: \item{\bf fork\index{fork}} does a Unix process fork.
        !          1396: \item{\bf prof\index{prof}} is obsolete.
        !          1397: \item{\bf syst\index{syst}} executes a shell command as a sub-process.
        !          1398: \item{\bf time\index{time}} gets execution-time information from the operating system
        !          1399: and the garbage collector.
        !          1400: \item{\bf argv\index{argv}} gets the command-line arguments of the executing program
        !          1401: (as a list of strings).
        !          1402: \item{\bf envi\index{envi}} gets the Unix environment\index{environment}
        !          1403: string (as a list of strings).
        !          1404: \item{\bf blas\index{blas}} does a structured\index{structured write} write (as described section~\ref{structured}).
        !          1405: \item{\bf salb\index{salb}} does a structured read.
        !          1406: \end{itemize}
        !          1407: 
        !          1408: These functions should be careful not to allocate memory using \verb"malloc".
        !          1409: \index{malloc}
        !          1410: 
        !          1411: All of these functions are called from the ML ``thread'' (see 
        !          1412: section~\ref{fast}).
        !          1413: Some C functions may require the (temporary) suspension of the ML thread,
        !          1414: e.g. to do a garbage collection for a structured write.
        !          1415: By setting the global variable \verb"cause"\index{cause} to a nonzero
        !          1416: value and then returning, a function may cause control to be grabbed
        !          1417: immediately thereafter in the thread\index{thread} controller.
        !          1418: }
        !          1419: \end{quotation}
        !          1420: 
        !          1421: \section{Access to operating system services}
        !          1422: \label{access}
        !          1423: 
        !          1424: A typical operating system\index{operating system}
        !          1425: provides many services, each with one or
        !          1426: more operations (``system calls'').  A very early version of Standard ML of
        !          1427: New Jersey made little use of these services, so it sufficed to write
        !          1428: an assembly-language interface to each desired system call, and then
        !          1429: call the assembly-language functions from ML programs.  As the ML environment
        !          1430: grew more sophisticated, it needed more operating system services, so the
        !          1431: number of assembly-language interface functions grew.  This eventually
        !          1432: became intolerable.
        !          1433: 
        !          1434: Now there is just one assembly-language interface function, \verb"syscall",
        !          1435: \index{syscall}
        !          1436: that takes a list of arguments and a system-call number.  This function
        !          1437: pushes the arguments onto the stack and makes the system call.  The ML standard
        !          1438: library hides the \verb"syscall" function behind a variety of typechecked
        !          1439: functions.  Implementing these functions in ML rather than assembly language
        !          1440: is better for two reasons: ML is safer and easier to program in, and (more
        !          1441: important) the ML code doesn't need to be rewritten for each target
        !          1442: architecture.
        !          1443: 
        !          1444: \begin{quotation}
        !          1445: {\small All Unix system calls return integer values.  These are
        !          1446: converted into ML integers (by shifting and incrementing) and returned
        !          1447: as the result of \verb"syscall".  Many system calls take parameters
        !          1448: by reference that they stuff results into.  These results are (of course)
        !          1449: not appropriately tagged for ML.  A byte-array (of the
        !          1450: appropriate length) is used 
        !          1451: as the actual parameter for such an argument, and then
        !          1452: the results can be extracted from the byte-array after the system call.
        !          1453: 
        !          1454: Integer arguments to system calls are shifted right by \verb"syscall"
        !          1455: to convert them into their (untagged) representation.
        !          1456: 
        !          1457: String and byte-array arguments are left alone, yielding their natural
        !          1458: representation.  However, strings\index{string}
        !          1459: in C (and Unix) are terminated
        !          1460: by a null character, whereas strings in ML have a length prefix.
        !          1461: The prefix is ignored by the kernel, since it occurs at offset -4 (bytes).
        !          1462: To achieve null termination of strings, you may take the actual ML argument
        !          1463: and concatenate the string \verb|"\000\000"| of two zero characters (two
        !          1464: characters are required because of the unboxed representation of
        !          1465: single-character strings, and the possibility that the actual argument
        !          1466: is an empty string).
        !          1467: }
        !          1468: \end{quotation}
        !          1469: \section{Constructor representations}
        !          1470: \label{constructor}
        !          1471: Those unfamiliar with ML might wish to skip this section.
        !          1472: 
        !          1473: Some of the constructors\index{constructor}
        !          1474: in an ML datatype\index{datatype} may not carry values.
        !          1475: For example, in the type
        !          1476: \begin{verbatim}
        !          1477: datatype 'a option = NONE | SOME of 'a
        !          1478: \end{verbatim}
        !          1479: the \verb"SOME" constructor carries value of type \verb"'a"
        !          1480: and the
        !          1481: \verb"NONE" constructor carries no value.  Constructors that carry no value
        !          1482: are represented not as two-element records, but as (unboxed) small integers.
        !          1483: In this case, \verb"NONE"
        !          1484: is represented by the integer 0 (with a low-order 1 tag bit),
        !          1485: and \verb"SOME(x)" is represented by a two-element record containing the value
        !          1486: \verb"x" and a small integer representing the \verb"SOME" constructor.
        !          1487: The ML program can distinguish which constructor has been used by
        !          1488: testing for ``boxity:''\index{boxity}
        !          1489:  \verb"NONE" is an unboxed value\index{unboxed value}, \verb"SOME(x)"
        !          1490: is a boxed value\index{boxed value},
        !          1491: and the low-order bit will distinguish them.
        !          1492: 
        !          1493: Finally, in the case that there is only one value-carrying constructor,
        !          1494: and the value carried always has a boxed representation, the extra
        !          1495: indirection record can be eliminated.  Thus, for the {\bf list} datatype
        !          1496: \begin{verbatim}
        !          1497: datatype 'a list = NIL | CONS of ('a * 'a list)
        !          1498: \end{verbatim}
        !          1499: the constructor \verb"NIL" can be represented as the unboxed integer 0, the
        !          1500: constructor \verb"CONS(a,b)" can be represented as a two-element record
        !          1501: containing \verb"a" and \verb"b".  Again, the low-order ``boxity'' bit
        !          1502: distinguishes the constructors.  If there had been several value-carrying
        !          1503: constructors, then the boxity bit alone could not distinguish them,
        !          1504: and an indirection record (containing the carried value and a small
        !          1505: integer denoting the constructor) would be required.
        !          1506: 
        !          1507: It is tempting to play more elaborate tricks with the representation of
        !          1508: constructors.  For example, the {\em option} datatype described above
        !          1509: might not seem to need an extra indirection; surely the boxity bit
        !          1510: should be enough to distinguish between \verb"NONE" and \verb"SOME(x)"?
        !          1511: The problem is that the value \verb"x" might itself be either boxed or
        !          1512: unboxed; the indirection guarantees that \verb"SOME(x)" will indeed be boxed.
        !          1513: 
        !          1514: Similarly, it is tempting to use a clever representation for the datatype
        !          1515: \begin{verbatim}
        !          1516: type t = a * b
        !          1517: datatype atom = NUMBER of int | RECORD of t
        !          1518: \end{verbatim}
        !          1519: Here, the values carried by the NUMBER constructor are always unboxed, and
        !          1520: the values carried by the RECORD constructor are always boxed; the
        !          1521: boxity can serve to distinguish them.
        !          1522: 
        !          1523: The problem with these schemes is caused by the abstraction provided by
        !          1524: ML {\em functors\index{functor}}:
        !          1525: a functor might take the atom datatype as a parameter
        !          1526: with some of its structure hidden:
        !          1527: \begin{verbatim}
        !          1528: functor F(sig type t
        !          1529:               datatype atom = NUMBER of int | RECORD of t
        !          1530:           end)
        !          1531: \end{verbatim}
        !          1532: and here it is not at all clear that \verb"t" is always boxed.
        !          1533: (Unfortunately, this problem applies to the ``clever'' representation of the {\em list}
        !          1534: datatype as well, but we have chosen to ignore it there---we can
        !          1535: detect the (rare) problems with partially-abstracted lists
        !          1536: as functor parameters and give error messages, but we didn't want
        !          1537: to use an extra indirection in the representation of lists.)
        !          1538: The problem of functor/datatype interaction 
        !          1539: could be considered a defect in the language design.
        !          1540: 
        !          1541: \begin{quotation}
        !          1542: {\small Note that the constructor tag is element 1 of the record and
        !          1543: the value is element 0.  There's no particular reason for this; it's
        !          1544: probably a bad idea.  For exception constructors, the constructor
        !          1545: tag is not a small integer, it is a \verb"string ref".  A ref cell is
        !          1546: used instead of just a string so that (generative)
        !          1547: exceptions may be compared
        !          1548: for equality in pattern-matching exception\index{exception} constructors.
        !          1549: }
        !          1550: \end{quotation}
        !          1551: \section{Suspending a process}
        !          1552: \label{suspending}
        !          1553: \index{exportML}
        !          1554: Having designed a simple layout of runtime objects, we can now implement
        !          1555: easily a variety of runtime services.
        !          1556: 
        !          1557: A convenient feature of a programming environment is the ability to save
        !          1558: the current state of the computation in a file, so that by executing
        !          1559: that file on a later date the computation can be resumed.  In an interactive
        !          1560: system, one might wish to compile and load some programs, and then 
        !          1561: ``save the world'' so that in the next session these programs don't
        !          1562: have to be re-loaded.  This feature could also be useful for program
        !          1563: checkpoints.
        !          1564: 
        !          1565: It's not too hard to suspend a process in this way.  It's even possible
        !          1566: to make the resulting
        !          1567: file an ordinary executable file.  An executable file
        !          1568: contains a header, a text segment, and a data segment.  The (read-only)
        !          1569: text segment
        !          1570: of the saved file will be identical to the text segment of the currently
        !          1571: executing process, so a single \verb"write" system call will serve to
        !          1572: write it to the file.  The data segment of the file will consist of the
        !          1573: original data segment of the process plus any newly allocated heap memory;
        !          1574: again, one or two \verb"write"s will put it neatly into the file.
        !          1575: (It is useful to do a garbage collection immediately before saving,
        !          1576: to minimize the size of the file.)
        !          1577: Finally, the header can be synthesized appropriately.
        !          1578: 
        !          1579: An early version of Standard ML of New Jersey used a runtime 
        !          1580: stack\index{stack}, and
        !          1581: we had no end of trouble getting the stack saved correctly.  A Unix executable
        !          1582: file doesn't have a stack segment, so the stack had to be copied into the
        !          1583: heap before saving.  And after restoring and copying back to the stack
        !          1584: segment, we found that we had creamed the (new process's) command-line
        !          1585: arguments, etc.  When we eliminated the runtime stack, these annoyances
        !          1586: ceased.
        !          1587: 
        !          1588: \begin{quotation}
        !          1589: {\small 
        !          1590: \index{magic number}
        !          1591: The resulting executable file has a ZMAGIC magic number on
        !          1592: Berkeley Unix, and NMAGIC on SunOs 4.0.  This is because ZMAGIC
        !          1593: implies dynamic loading of shared libraries on SunOs 4.0, which complicates
        !          1594: the address map.  Creating executable files is annoyingly difficult
        !          1595: in operating systems like SunOs and Mach, whose notion of a process
        !          1596: address map is not as simple as in Berkeley Unix.
        !          1597: }
        !          1598: \end{quotation}
        !          1599: \section{Structured writes and reads}
        !          1600: \label{structured}
        !          1601: 
        !          1602: \index{structured write}
        !          1603: It's often necessary to take a data structure (with many records, pointers,
        !          1604: strings, etc.) and write it to a file in binary form so it can be read
        !          1605: back in quickly.  It's always possible to do this in an {\em ad hoc} way
        !          1606: for each different kind of data structure that has to be written, but
        !          1607: it's sometimes difficult to get the pointer-sharing relations right.
        !          1608: Since the runtime data format has enough information for the garbage-collector
        !          1609: to traverse the structure, then a writer/reader must be able to traverse
        !          1610: it as well.
        !          1611: 
        !          1612: In fact, we can view the saving of such a structure
        !          1613: as just a garbage collection.
        !          1614: The argument to the \verb"structured-write" function is just a root pointer,
        !          1615: and \verb"structured-write" must traverse all the data accessible from this
        !          1616: root.  In doing so, it must copy the data to a file, being careful
        !          1617: to make only one copy of each record.
        !          1618: 
        !          1619: \index{garbage collection}
        !          1620: We can, in fact, just invoke the \verb"gc" procedure, giving as the
        !          1621: \verb"roots" argument just a singleton vector---the pointer that
        !          1622: was handed to \verb"structured-write".  Then, the resulting {\em tospace}
        !          1623: will be exactly the desired contents of the file, which can be written
        !          1624: out in a single operation.
        !          1625: 
        !          1626: At this point, however, memory is in a bit of a mess; some objects have
        !          1627: been forwarded and others have not.  One solution is just to finish the
        !          1628: garbage collection: each of the ``normal'' roots of the data is
        !          1629: forwarded, and then (again) all the remaining unforwarded pointers in
        !          1630: {\em tospace} are forwarded.  Then memory is again consistent, and execution
        !          1631: may continue.
        !          1632: 
        !          1633: This is unattractive because the time required to do a structured write
        !          1634: is now proportional to the amount of all live data, not proportional
        !          1635: to the amount written.  So a different solution is better:  keep
        !          1636: a separate list of
        !          1637: repair\index{repair}
        !          1638: information about the {\em fromspace} objects that were overwritten with
        !          1639: forwarding pointers.  Whenever a {\em fromspace} object is forwarded,
        !          1640: two words of information must be kept about it:  the location of the
        !          1641: object itself, and
        !          1642: the previous value
        !          1643: of its first word (that was overwritten).  Then, repair is easy;
        !          1644: the list of repair information can be traversed, and each {\em fromspace}
        !          1645: object is restored to its original state.
        !          1646: The descriptor can be recovered by copying the descriptor of the
        !          1647: {\em tospace} copy (accessed via the forwarding pointer), and the first
        !          1648: word (which had been overwritten by the forwarding pointer) can then
        !          1649: be recovered from the repair information.
        !          1650: Saving the repair information is fast, and performing the repairs is also
        !          1651: fast.  The time to do a structured write is now proportional (with a small
        !          1652: constant of proportionality!) to the amount written.
        !          1653: 
        !          1654: There's one last problem.  Where do we keep the repair information, and
        !          1655: what if it grows too large?  The repair information can be kept at the
        !          1656: far end of {\em tospace}.  Then, as long as the amount written is fairly small,
        !          1657: we'll never run into it.  And if we do run into it, that means that
        !          1658: the amount written must be at least half the size of {\em tospace} (where {\em tospace}
        !          1659: is always at least the size of the total accessible data).  What we can do
        !          1660: in this case is just to toss away the repair information, and go ahead
        !          1661: with a garbage collection!
        !          1662: 
        !          1663: \begin{quotation}
        !          1664: {\small In a multi-generation system (as in SML-NJ's runtime system),
        !          1665: we do a (cheap) minor collection before the structured write; the collection
        !          1666: referred to in the paragraphs above is a major collection.
        !          1667: }
        !          1668: \end{quotation}
        !          1669: \section{Buffered input/output}
        !          1670: \label{buffered}
        !          1671: 
        !          1672: Some operating systems (like Unix) don't provide buffered\index{buffered I/O}
        !          1673: input and output
        !          1674: as a primitive, and even in operating systems that do, the overhead
        !          1675: of one system call per character may be too high.  What the runtime
        !          1676: system can do is provide this service for user programs.
        !          1677: 
        !          1678: The Unix/C\index{C} standard library\cite{kernighan78} demonstrates that
        !          1679: \index{standard library}
        !          1680: the runtime system need not provide this service; instead, library
        !          1681: routines can implement buffered I/O.  We have taken the same approach
        !          1682: in our system; the buffered I/O functions are implemented in ML as library
        !          1683: functions, and the runtime system is completely ignorant of buffered I/O.
        !          1684: 
        !          1685: 
        !          1686: \section{Execution profiling}
        !          1687: \label{execution}
        !          1688: 
        !          1689: \index{profiling}
        !          1690: In large software systems it can become difficult to tell where
        !          1691: the inefficiencies are.  Execution profilers can help with this process.
        !          1692: A typical profiler might provided information about the amount of time
        !          1693: spent in each procedure and the number of times it was called.
        !          1694: 
        !          1695: Standard ML of New Jersey has an execution profiler\cite{appel88:prof},
        !          1696: which will only briefly be described here.  Call counts are handled
        !          1697: completely independently of the runtime system; the compiler inserts
        !          1698: program variables that are incremented on entry to procedures, and the
        !          1699: profiler examines these variables to generate call-count information.
        !          1700: 
        !          1701: Execution-time estimates are more difficult.  The Unix \verb"prof" system
        !          1702: call starts a timer interrupt in the operating system that periodically
        !          1703: samples the program's program counter.  An affine function is applied
        !          1704: to the PC and the corresponding element of a ``histogram'' array is 
        !          1705: incremented.  Then the histogram can be compared against the object
        !          1706: file to see how many interrupts occurred in each procedure, which gives
        !          1707: a good estimate of actual time spend in each procedure.
        !          1708: 
        !          1709: With a garbage collector that moves procedures around in memory, this
        !          1710: method becomes unwieldy.  Instead, the runtime system has a variable
        !          1711: \verb"current" that points at a sample-count cell for the currently
        !          1712: executing procedure.  Whenever (profiled) code begins executing a procedure
        !          1713: (or returns from a call), it assigns the address of the new procedure's 
        !          1714: \index{current}
        !          1715: sample-count variable into \verb"current" (the compiler inserts code to do
        !          1716: this).  The timer interrupt, instead
        !          1717: of examining the PC, just increments the cell pointed to by \verb"current".
        !          1718: 
        !          1719: This very simple mechanism does not greatly complicate the runtime system,
        !          1720: but provides profiling information with a relatively low overhead.
        !          1721: 
        !          1722: \section{Debugging}
        !          1723: \label{debugging}
        !          1724: 
        !          1725: We are currently developing a high-level
        !          1726: debugger\index{debugger} based on the same principles as our
        !          1727: profiler---namely, minimal interaction with the runtime system.
        !          1728: The work is still in its early stages, but we believe that
        !          1729: it is possible to make a debugger this way, and that the runtime system
        !          1730: won't have to grow much in size to support it.  A detailed description
        !          1731: of our plans is beyond the scope of this paper.
        !          1732: 
        !          1733: \section{Handling of interrupts and exceptions}
        !          1734: \label{handling}
        !          1735: 
        !          1736: The ML language has an exception\index{exception}-handling
        !          1737:  mechanism with dynamically-nested
        !          1738: handlers.  Some exceptions are raised by the program itself, some are related
        !          1739: to synchronous hardware exceptions (like floating point errors), and
        !          1740: some are generated asynchronously (like the \verb"Interrupt" exception
        !          1741: raised when the user presses the interrupt key).
        !          1742: 
        !          1743: The ML program has a ``current-exception-handler'' register that just
        !          1744: contains a pointer to a continuation\index{continuation}
        !          1745: (i.e. a closure).
        !          1746: Raising an exception just
        !          1747: corresponds to invoking this continuation with the exception-object
        !          1748: as an argument.  Dynamic nesting of handlers is handled completely by the
        !          1749: compiler: it generates code that saves the previous handler (in the
        !          1750: new handler's closure) and restores it upon exit from the scope of the
        !          1751: new handler.
        !          1752: 
        !          1753: This makes it particularly easy for the programming language's exception
        !          1754: mechanism to be attached to the hardware's notion of fault or interrupt.
        !          1755: When a fault or interrupt\index{interrupt}
        !          1756: occurs, the operating system calls a special
        !          1757: (assembly-language)
        !          1758: function in the runtime system.  This function then determines what
        !          1759: caused the fault, clears it if necessary, then {\em does not} return to the
        !          1760: operating system but just invokes the current-exception-handler continuation
        !          1761: on the appropriate exception object.
        !          1762: 
        !          1763: \section{Fancier garbage collectors and object boundaries}
        !          1764: \label{fancier}
        !          1765: 
        !          1766: Our current system uses a good, simple generational garbage
        !          1767: collector\cite{appel89:sggc}.  Generation garbage
        !          1768: collection\cite{lieberman83}\cite{ungar84} is based on the observations
        !          1769: that newer records tend to become garbage more quickly, so that the
        !          1770: collector should concentrate its effort on the newer records; and that
        !          1771: newer records tend to point to older records, but not vice versa, so that
        !          1772: the older records need not be searched for roots into the newer area.
        !          1773: 
        !          1774: Sometimes, however, older records are modified (by a \verb"rplaca" operation
        !          1775: in Lisp or an assignment operation in ML)
        !          1776: to point to newer ones.
        !          1777: Generational garbage collectors require
        !          1778: the maintenance of a list of all modified records.
        !          1779: To ensure portability,
        !          1780: we use a very simple scheme to maintain this list: the compiler
        !          1781: inserts instructions after each \verb"update" operation to put the
        !          1782: address of the updated cell on a special list for the garbage collector
        !          1783: to use.
        !          1784: 
        !          1785: \index{page fault}\index{virtual memory}\index{garbage collector}
        !          1786: A clever trick is to use the virtual memory system in maintaining this list.
        !          1787: It is only updates to records in older generations that must be remembered;
        !          1788: we can just make all the memory occupied  by older generations read-only,
        !          1789: and an update will cause a page fault.  The fault-handler can then put the
        !          1790: page on a list of updated pages, and mark the page as writeable.
        !          1791: During garbage collection, the entire page can be searched for
        !          1792: roots into the newer region.
        !          1793: 
        !          1794: A different virtual-memory trick can be used to achieve concurrent garbage
        !          1795: collection\cite{appel88:gc}.  By making the pages of the {\em tospace} inaccessible
        !          1796: until they have been scanned and forwarded, we can allow the ML program
        !          1797: to continue executing even while the garbage collector is running.
        !          1798: An access to an unscanned page of {\em tospace} causes a page fault, and the
        !          1799: fault handler scans that page and makes it accessible.
        !          1800: Then the latency is bounded by the product of the page size and the
        !          1801: record size (in practice, to just a few milliseconds).
        !          1802: This is very desirable in a real-time or interactive system.
        !          1803: 
        !          1804: Both of these algorithms require that pages be scanned (and forwarded)
        !          1805: out of order.  When objects cross page boundaries, it is difficult
        !          1806: to know where the first object on a page begins, so that its descriptor
        !          1807: may be found.  There are solutions to this problem that involve keeping
        !          1808: track of objects that cross page boundaries\cite{appel88:gc}, but the
        !          1809: SML-NJ runtime system has a runtime data format that allows a very
        !          1810: simple solution.
        !          1811: 
        !          1812: Suppose we need to scan a particular page (to forward all its pointers).
        !          1813: When we start at the beginning of the page, we don't know whether
        !          1814: we are in the middle of a record or of a string; and we don't know
        !          1815: where the descriptor for the object is.
        !          1816: But if we knew that the page contained only records, then we wouldn't need
        !          1817: to find the descriptor;
        !          1818: the important thing about records is that their boundaries can be ignored.
        !          1819: If a page consists of the last part of one record, followed by several
        !          1820: complete records, followed by the first part of the last record, the
        !          1821: pointers in all of those records can be scanned and forwarded
        !          1822: without regard to the boundaries.  Since the descriptors are unboxed
        !          1823: (they look like integers), they won't be touched; and all the other
        !          1824: fields are tagged pointers or integers which will be forwarded (or not)
        !          1825: appropriately.  (See figure~\ref{boundaries} for an illustration.)
        !          1826: \begin{figure}[htbp]
        !          1827: \label{boundaries}
        !          1828: \begin{verbatim}
        !          1829:      +------+--object boundary
        !          1830:      | desc |
        !          1831:      +------+
        !          1832:      |  o------------>
        !          1833: -----+------+-----page boundary
        !          1834:      |  5   |
        !          1835:      +------+--object boundary
        !          1836:      | desc |
        !          1837:      +------+
        !          1838:      |  o------------>
        !          1839:      +------+
        !          1840:      |  o------------>
        !          1841:      +------+--object boundary
        !          1842:      | desc |
        !          1843:      +------+
        !          1844:      |  7   |
        !          1845:      +------+
        !          1846:      |  o------------>
        !          1847: -----+------+-----page boundary
        !          1848:      |  6   |
        !          1849:      +------+
        !          1850:      |  o------------>
        !          1851:      +------+--object boundary
        !          1852: \end{verbatim}
        !          1853: \caption{Object boundaries may be ignored by the forwarding algorithm}
        !          1854: \end{figure}
        !          1855: 
        !          1856: And if we knew that the page contained only strings, then it wouldn't
        !          1857: need to be scanned at all, since strings contain no pointers.
        !          1858: 
        !          1859: Thus, a simple arrangement that's useful when pages need to be scanned
        !          1860: in arbitrary order is
        !          1861: to segregate pointer-containing objects (records and arrays) from
        !          1862: non-pointer-containing objects (strings and byte-arrays).  This isn't necessary
        !          1863: for the newly-allocated objects; the compiler can continue to maintain
        !          1864: just one register that points at the beginning of the free space.  But
        !          1865: the copying garbage collector (the functions \verb"gc" and \verb"forward")
        !          1866: must copy records and strings into different parts of memory.
        !          1867: This greatly simplifies the implementation of page-based garbage collection
        !          1868: algorithms.
        !          1869: 
        !          1870: \section{Size of the SML-NJ runtime system}
        !          1871: \label{size}
        !          1872: 
        !          1873: Our runtime system is implemented in 1548 lines of C and 232 lines of
        !          1874: assembly language (the assembly language must be duplicated for each
        !          1875: target architecture).  This is small indeed, considering the functionality
        !          1876: of our Standard ML system.  We have made it small by moving as much as possible
        !          1877: out of the runtime system, and by having a simple format for ML data.
        !          1878: 
        !          1879: This 1780 lines of code can be divided approximately as follows:
        !          1880: \begin{enumerate}
        !          1881: \item[291] lines of C to initialize and link the Standard ML library
        !          1882: and loader, which then loads the rest of the system.
        !          1883: {\tiny (run.c)}
        !          1884: \item[299] lines to implement the copying garbage collector with repairs.
        !          1885: {\tiny (gc.c)}
        !          1886: \item[55] lines of C to implement the simulated process-switching required
        !          1887: for garbage-collector access to ML registers.
        !          1888: {\tiny (callgc.c)}
        !          1889: \item[400] lines for the management of generational garbage collection,
        !          1890: and for deciding when to ask the operating system for more memory.
        !          1891: {\tiny (callgc.c)}
        !          1892: \item[304] lines to implement C-language functions and data structures
        !          1893: used as primitives by the ML program.
        !          1894: {\tiny (cstruct.c)}
        !          1895: \item[117] lines of C to implement the suspending of process states into
        !          1896: Unix executable files.
        !          1897: {\tiny (export.c)}
        !          1898: \item[72] lines of utility functions for C functions that manipulate ML
        !          1899: objects.
        !          1900: {\tiny (objects.c)}
        !          1901: \item[38] lines of assembly language to handle simulated process switching.
        !          1902: {\tiny (*.prim.s)}
        !          1903: \item[195] lines of assembly language to implement primitive functions
        !          1904: callable by ML programs.
        !          1905: {\tiny (*.prim.s)}
        !          1906: \end{enumerate}
        !          1907: For comparison, the Icon runtime system\cite{griswold86} (excluding
        !          1908: its interpreter) is 18,000 lines of C; the T3\cite{kranz86} runtime system is
        !          1909: 1,900(?) lines of C and assembly language; the FranzLisp runtime system
        !          1910: (including interpreter)
        !          1911: is 19,500 lines of C; the Poly/ML runtime system is ??? lines of ???.
        !          1912: 
        !          1913: \section{Conclusion}
        !          1914: \label{conclusion}
        !          1915: 
        !          1916: It's easy to implement a hi-tech runtime system if you do it cleanly.
        !          1917: 
        !          1918: \section{Acknowlegements}
        !          1919: 
        !          1920: David MacQueen, Trevor Jim, and Bruce Duba have all worked on the runtime
        !          1921: system, and their assistance has been valuable.  Norman Ramsey provided
        !          1922: many useful suggestions about the organization of this paper.
        !          1923: 
        !          1924: %\cleardoublepage
        !          1925: \bibliographystyle{plain}
        !          1926: \bibliography{appel}
        !          1927: %\cleardoublepage
        !          1928: %\printindex
        !          1929: \end{document}

unix.superglobalmegacorp.com

This archive runs on limited infrastructure. Preserving old code on modern bandwidth. Automated agents are requested to crawl responsibly.