43BSDTahoe/ucb/lisp/doc/franz.n - annotate

Return to franz.n CVS log
Up to [CSRG BSD Unix] / 43BSDTahoe / ucb / lisp / doc
Annotation of 43BSDTahoe/ucb/lisp/doc/franz.n, revision 1.1

1.1     ! root        1: .\" Copyright (c) 1980 Regents of the University of California.
        !             2: .\" All rights reserved.  The Berkeley software License Agreement
        !             3: .\" specifies the terms and conditions for redistribution.
        !             4: .\"
        !             5: .\"    @(#)franz.n     6.1 (Berkeley) 4/29/86
        !             6: .\"
        !             7: ."  franz.n                            -[Thu Jun 17 11:01:27 1982 by jkf]-
        !             8: ." this file will contain a description of the franz lisp system.
        !             9: ." sort of a systems manual.
        !            10: .de Fr
        !            11: F\s-2RANZ\s0 L\s-2ISP\s0
        !            12: ..
        !            13: .tp
        !            14: .(l C
        !            15: .sz +2
        !            16: \fBThe Franz Lisp System\fP
        !            17: .sz -2
        !            18: .sp 2v
        !            19: \fIby
        !            20: \fIJohn K. Foderaro\fP
        !            21: .)l
        !            22: .sp 3v
        !            23: .tl ''Abstract''
        !            24: .br
        !            25: This document describes the 
        !            26: .Fr
        !            27: system written at The University of California 
        !            28: at Berkeley.
        !            29: Included are descriptions of the memory management, interpreter
        !            30: and compiler, the conventions used within the C coded kernel
        !            31: and those within the compiled code.
        !            32: This does not duplicate the information found in The Franz Lisp Manual.
        !            33: .++ C '\s-2Draft\s+2'\fBThe\ Franz\ Lisp\ System\fP'%'
        !            34: .+c Characteristics\ of\ \F\s-2RANZ\ \s0L\s-2ISP\s0
        !            35: .ls 1
        !            36: .pp
        !            37: There is no formal standard for lisp systems, thus each lisp
        !            38: system is almost guaranteed to be different from any other lisp system.
        !            39: In this section we will examine the design decisions which most often
        !            40: characterize a lisp system.
        !            41: This
        !            42: focus does  not imply that all other parts of the language are generally
        !            43: agreed upon.
        !            44: In fact, there is nothing sacred to lisp system designers.
        !            45: For example, one usually identifies the symbol 
        !            46: .i nil 
        !            47: with the empty list,
        !            48: and one usually thinks of lisp as a dynamically scoped language.
        !            49: Both of these conventions are are not true in the lisp dialect 
        !            50: .i NIL
        !            51: currently being written at MIT.
        !            52: As another example, one imagines that a lisp system must 
        !            53: use  garbage collection to reclaim storage.
        !            54: The lisp dialect
        !            55: .i ZetaLisp
        !            56: that is running on Lisp Machines doesn't
        !            57: normally use a garbage collector
        !            58: because of the way in which it allocates its address space.
        !            59: It is faster to reboot the machine than to do a garbage collection.
        !            60: In most lisp dialects the lisp expressions are written in 
        !            61: parenthesised form.  In 
        !            62: .i Portable
        !            63: .i Standard
        !            64: .i Lisp
        !            65: (PSL) at the University of Utah, programs are written 
        !            66: primarily 
        !            67: in an Algol-like
        !            68: syntax.
        !            69: .pp
        !            70: Thus some of the discussion to follow indicates not so much 
        !            71: .i unique
        !            72: charateristics of 
        !            73: .Fr
        !            74: but instead, how decisions were made.
        !            75: .sh 2 Scoping\ and\ Binding
        !            76: .pp
        !            77: The 
        !            78: .Fr
        !            79: interpreter
        !            80: uses 
        !            81: .i dynamic 
        !            82: .i scoping , 
        !            83: that is, after A calls B, B can access all of the
        !            84: variables that  A allocated
        !            85: as well
        !            86: as those that A can access, provided B doesn't allocate variables of the same
        !            87: names.
        !            88: There are two popular ways of implementing dynamic scoping: 
        !            89: .i deep
        !            90: .i binding
        !            91: and 
        !            92: .i shallow
        !            93: .i binding .
        !            94: Note that we will use the terms variable allocation and lambda
        !            95: binding interchangeably in this report.
        !            96: The former term is less language-specific than the latter.
        !            97: .sh +1 Deep\ Binding
        !            98: .pp
        !            99: In a deep binding implementation, when a variable is  allocated,
        !           100: the name of the variable and a space for its value are pushed on the
        !           101: top of a stack.
        !           102: When a program asks for the value of a variable, 
        !           103: the lisp system must search down the stack for the first occurrence of
        !           104: the variable.
        !           105: This system offers great flexibility for the 
        !           106: programmer since  the state of his program 
        !           107: can be described by the contents of the stack.
        !           108: It is thus possible to save the context of a program by just saving
        !           109: the stack or in some cases just a pointer to
        !           110: a place in  the stack.
        !           111: The problem with deep binding is that it is time-consuming to search
        !           112: down the stack for the value of a variable, and, 
        !           113: as a result, most systems 
        !           114: modify the deep binding model by giving variables a global value
        !           115: cell and allowing programs to set and access the global cell.
        !           116: Some implementations of Interlisp use deep binding.
        !           117: .sh +0 Shallow\ Binding
        !           118: .pp
        !           119: In a shallow binding implementation, each variable has a global value cell
        !           120: which contains the current value of the variable.
        !           121: When a variable is allocated inside a program, its name 
        !           122: and old value are pushed on a stack called a
        !           123: .i bindstack .
        !           124: The variable is then given a new value. 
        !           125: Throughout the lifetime of the allocation, the current value of the
        !           126: variable can be found in the global value cell associated with the variable.
        !           127: When leaving the context of the variable's allocation, the old value
        !           128: is restored from the 
        !           129: .i bindstack .
        !           130: A shallow binding scheme makes it much cheaper to access 
        !           131: the values of variables, however it is much more difficult to 
        !           132: save the state and to restore it. 
        !           133: .pp
        !           134: .Fr
        !           135: and most other lisps which are not derived from Interlisp
        !           136: use shallow binding.
        !           137: Some versions of Interlisp use shallow binding.
        !           138: .sh -1 Lisp\ Compiler
        !           139: .pp
        !           140: Dynamic scoping often is not necessary and it is never cheap, even
        !           141: in a shallow binding implementation.
        !           142: Thus the 
        !           143: .Fr  
        !           144: compiler prefers to do lexical scoping rather than dynamic
        !           145: scoping.
        !           146: If the user does not specify otherwise,  when a 
        !           147: function is compiled, all variables 
        !           148: allocated will be 
        !           149: put on a local stack and will 
        !           150: .i not 
        !           151: be accessible by any
        !           152: function that this function calls.
        !           153: This
        !           154: convention results in faster code being generated by
        !           155: the compiler, but if the
        !           156: user is not careful, it can result in compiled and interpreted code
        !           157: working differently.
        !           158: In some of the new lisp designs, such as 
        !           159: .i NIL 
        !           160: and
        !           161: .i Common
        !           162: .i Lisp
        !           163: the semantics of compiled and interpreted code have been
        !           164: unified by
        !           165: transferring the compiler semantics (lexical scoping) to the interpreter.
        !           166: This is a rather large step
        !           167: if dynamic scoping is no
        !           168: longer an option, and it is not clear whether the resulting
        !           169: language should be called 'lisp'.
        !           170: .sh +0 Data\ Types
        !           171: .pp
        !           172: A complete list of data types in 
        !           173: .Fr 
        !           174: can be found in the first chapter of the 
        !           175: .Fr
        !           176: Manual.
        !           177: The most important ones are described below.
        !           178: .pp
        !           179: Lisp is a list processing language and the basic data type is the list
        !           180: cell. 
        !           181: A list cell also goes by the name of cons cell, pair, or dotted pair (dtpr).
        !           182: It contains pointers to two lisp objects, and these pointers can
        !           183: be accessed via the 
        !           184: .i car
        !           185: and
        !           186: .i cdr 
        !           187: functions.
        !           188: Each pointer requires four (8-bit) bytes and thus the list cell is 
        !           189: eight bytes large.
        !           190: The cdr pointer is stored first since this makes it easier 
        !           191: to 'cdr down a list' using the addressing modes of the VAX.
        !           192: .pp
        !           193: The symbol cell is used to implement variables.  It contains  a pointer
        !           194: to a string that is the print name, a cell to store a value, a cell to
        !           195: store the function definition, a cell to store a pointer to  the property
        !           196: list and one pointer that the list system uses if it stores
        !           197: the symbol in the oblist.
        !           198: .sh +0 Memory\ Management
        !           199: .pp
        !           200: A lisp system must be able to determine the type of an
        !           201: object at run time.
        !           202: The method used to determine the type influences the
        !           203: way storage is managed and garbage collection is done.
        !           204: Next, we will look at three possible places
        !           205: to store the type information.
        !           206: .sh +1 Typed\ data
        !           207: .pp
        !           208: The type of the data object could be placed in the object itself, much
        !           209: as Pascal stores the type of a variant record in the record itself.
        !           210: This could result is a large amount of storage being used to store
        !           211: type information.
        !           212: No lisp system that we know of uses this method exclusively.
        !           213: .sh +0 Typed\ pointers
        !           214: .pp
        !           215: Every reference to a lisp object could contain the type of the object
        !           216: referenced.
        !           217: This is a good idea in a machine like an IBM 370 where only
        !           218: part of  machine word is used by the addressing hardware.
        !           219: Lisps that use typed pointers generally use a
        !           220: .i heap
        !           221: memory management scheme and a 
        !           222: .i copying 
        !           223: garbage collector.
        !           224: In a heap scheme, 
        !           225: all memory is divided by a pointer (called the
        !           226: heap pointer) separating lisp
        !           227: data and free space.  
        !           228: When a new lisp object is needed, it is formed at the
        !           229: dividing line and then  
        !           230: the heap pointer is incremented
        !           231: past the new object.
        !           232: Garbage collection is done by maintaining two separate spaces for lisp
        !           233: data, only one of which is used at any one time.
        !           234: When one fills up, all active lisp objects are copied to the 
        !           235: other space, leaving the first space totally free. 
        !           236: Subsequent allocations are done from the second space, until it fills up,
        !           237: at which point 
        !           238: all active data in the second space is copied to the first space.
        !           239: The advantage of the copying garbage collector is that the data space
        !           240: is compacted, which will improve virtual memory behavior.
        !           241: The disadvantage is that
        !           242: half the data space is always unused.
        !           243: .pp
        !           244: PSL on a PDP-10 uses this scheme, as does Lisp 370 on 
        !           245: an IBM 370.
        !           246: PSL  and NIL on the VAX will also use this scheme.  
        !           247: Since the VAX hardware 
        !           248: uses the entire word for address 
        !           249: calculation, 
        !           250: PSL and NIL 
        !           251: must mask off the type bits
        !           252: of a pointer before using it to reference memory.
        !           253: .sh +0 Typed\ pages
        !           254: .pp
        !           255: The final alternative is to allocate data objects by pages rather than
        !           256: individually.
        !           257: Each page contains only one type of object and there is a table that 
        !           258: keeps track of the type of data object in each page.
        !           259: The address of an object tells which page the object is on and 
        !           260: the page number tells which type the object is.
        !           261: Since a whole page of objects is allocated when only one 
        !           262: object is necessary,
        !           263: the rest of the objects in the page are put on a free list.
        !           264: This method of allocation is sometimes referred to as
        !           265: .i bibop
        !           266: for BIg Bag Of Pages.
        !           267: The garbage collector for bibop is usually
        !           268: a traditional
        !           269: .i mark
        !           270: and
        !           271: .i sweep
        !           272: algorithm.
        !           273: All active data objects are marked  and
        !           274: then each page is swept linearly with
        !           275: the free cells being put on the free list and the used cells 
        !           276: left alone.
        !           277: The advantage of mark and sweep over a copying garbage collector is that
        !           278: the mark phase is cheaper because data objects do not have to be copied.
        !           279: Also, all of memory can be used since there is no requirement for two spaces.
        !           280: The disadvantage is that a sweep phase is required in a mark and sweep
        !           281: but one is not required in a copying garbage collector.
        !           282: The sweep phase is expensive because it has to alter most data pages 
        !           283: while building a free list.
        !           284: This operation 
        !           285: can be expensive in a virtual memory environment.  
        !           286: Alternatives to the sweep phase have been proposed in
        !           287: [Foderaro+Fateman Characterization of Vax Macsyma].
        !           288: .pp
        !           289: .Fr
        !           290: uses a bibop memory allocator and a mark and sweep garbage collector,
        !           291: as does Maclisp (on a PDP-10).
        !           292: The reason that
        !           293: .Fr 
        !           294: uses bibop is primarily due to the VAX architecture,
        !           295: which makes typed pointers
        !           296: expensive to implement.
        !           297: Also, typed pointers would make it difficult to pass lisp values
        !           298: to  programs written in other languages, some of which may not have
        !           299: the ability to extract the address of a lisp object from a typed pointer.
        !           300: .sh -1 Construction
        !           301: .pp
        !           302: The
        !           303: .Fr
        !           304: system is built by adding lisp code to a C-coded kernel.
        !           305: The C-coded kernel is a working lisp interpreter, although it has few of
        !           306: the functions a lisp system needs.
        !           307: The lisp code provides most of the functions, the top level and error
        !           308: handlers.
        !           309: The lisp compiler is also written in lisp, 
        !           310: but is usually 
        !           311: run as a separate process.
        !           312: .sh +0 Unique\ features
        !           313: .pp
        !           314: .Fr
        !           315: can dynamically load and execute functions defined in the other languages
        !           316: which run on the VAX.
        !           317: It uses two different dynamic loaders.
        !           318: One is part of the lisp system and is used for lisp only, it is called 
        !           319: .i fasl
        !           320: (for fast loader).
        !           321: The other is the Unix loader, ld, which is used for loading in 
        !           322: C, Fortran or Pascal code.
        !           323: Loading code with fasl
        !           324: is much faster than with ld
        !           325: since fasl benefits from 
        !           326: the simplicity of a lisp object file.
        !           327: .+c Data\ Structures
        !           328: .sh 0 _
        !           329: .nr $1 \n(ch
        !           330: .pp
        !           331: In this chapter we shall examine the data structures which are most 
        !           332: important to the 
        !           333: .Fr
        !           334: system.
        !           335: First, though we will see how the lisp system uses the address space.
        !           336: .sh 2 Physical\ layout
        !           337: .pp
        !           338: As a Unix process, lisp has three address regions: text, data and stack.
        !           339: Text begins at location 0 and continues up to 0x3b73c (0x means hex number).
        !           340: The data segment begins on the next page boundary and continues
        !           341: up to a limit set by the configuration parameters (currently 0x2fd000).
        !           342: Lisp does not begin running with such a large data segment,  rather it grows
        !           343: when necessary.
        !           344: The stack segment begins at address 0x7fffffff and grows downward to a 
        !           345: maximum size of one-half megabyte.
        !           346: .pp
        !           347: The text segment stores the instructions for the C kernel as well as read-only
        !           348: data.
        !           349: The read-only data for lisp are 
        !           350: the symbol nil, the i/o ports, and the small integer
        !           351: table.
        !           352: The symbol nil is stored at location 0 which makes it very easy to test
        !           353: whether a value is nil.
        !           354: The problem with storing nil in read-only space is that a special case
        !           355: must be made for nil's property list, which is the only thing in the
        !           356: nil symbol that the user is permitted to alter.
        !           357: .pp
        !           358: The fixnums -1024 through 1023 are stored sequentially, with 0 being
        !           359: stored at 0x1400.
        !           360: .Fr
        !           361: doesn't have any 'immediate' lisp data, that is data whose value is 
        !           362: stored as part of the reference to the data. 
        !           363: But, by storing some of the fixnums in a known place, we can achieve
        !           364: some of the benefits of immediate data: 
        !           365: A program  can use 
        !           366: .i eq
        !           367: as a test for a fixnum in the range -1024 to 1023.
        !           368: In the majority of cases, when asked
        !           369: to allocate a fixnum, the system can  return a pointer into this table
        !           370: and bypass the memory allocator.
        !           371: .sh +0 Typetable
        !           372: .pp
        !           373: .Fr
        !           374: uses the typed pages (or bibop)
        !           375: method of type determination.
        !           376: The 
        !           377: .i typetable
        !           378: is an array of bytes (
        !           379: .i chars 
        !           380: in  C lingo).
        !           381: This table describes the type of all pages, from page 0 where nil is stored,
        !           382: up to the end of data space.
        !           383: Thus there are many entries that describe the types of the pages which
        !           384: make up the C kernel.
        !           385: In order to reduce the wasted space in the typetable,
        !           386: we could have made the typetable begin typing pages at the start of data
        !           387: space and  make a special case of the pages that contain nil and
        !           388: the small integer table.
        !           389: However, the
        !           390: effect of this change would be that type checks 
        !           391: (which are done in-line in compiled code) would
        !           392: be longer and slower.  Since type checking happens quite
        !           393: frequently, we felt it was better to waste the space in the
        !           394: typetable in order to save execution time  and instruction space.
        !           395: .pp
        !           396: Each page on a VAX is 512 bytes, and thus to find the type of an 
        !           397: object given the address of it, 
        !           398: it is only necessary to shift the address right 9 bits
        !           399: and index the typetable array offset by one.
        !           400: The offset by one is required because the value -4, which is an illegal
        !           401: address, is used as a sentinel value to indicate an illegal value.
        !           402: Thus when we take the type of -4 we will be indexing the table by -1
        !           403: and we want to retrieve the first byte in the table (which contains
        !           404: the value UNBOUND).
        !           405: The C macro which retrieves the type is this (from file h/global.h):
        !           406: .(b I
        !           407: 
        !           408: #define        TYPE(a1)        ((typetable+1)[(int)(a1) >> 9])
        !           409: 
        !           410: .)b
        !           411: This is code generated by the lisp compiler to check if the type
        !           412: code of an argument (stored at 0(r10)) is a symbol (which is type
        !           413: code 1):
        !           414: .(b I
        !           415: 
        !           416: ashl   $-9,0(r10),r0
        !           417: cmpb   _typetable+1[r0],$1
        !           418: 
        !           419: .)b
        !           420: .pp
        !           421: The type codes which are stored in the typetable are these:
        !           422: .ts 2i 4i 6i
        !           423: .(b I
        !           424: UNBO     -1\tSTRNG     0\tATOM     1\tINT       2
        !           425: DTPR      3\tDOUB      4\tBCD      5\tPORT      6
        !           426: ARRAY     7\tOTHER     8\tSDOT     9\tVALUE    10
        !           427: HUNK2    11\tHUNK4    12\tHUNK8   13\tHUNK16   14
        !           428: HUNK32   15\tHUNK64   16\tHUNK128 17
        !           429: .)b
        !           430: The names given above are the C kernel internal names.
        !           431: ATOM is symbol, INT is fixnum, DTPR is list cell, 
        !           432: DOUB is flonum, BCD is binary,
        !           433: SDOT is bignum and all the HUNKn types are just known as hunk to
        !           434: the user.
        !           435: .sh +0 Stacks
        !           436: .pp
        !           437: .Fr
        !           438: uses three stacks: the 
        !           439: .i C 
        !           440: .i runtime 
        !           441: stack, the 
        !           442: .i namestack
        !           443: and the
        !           444: .i bindstack .
        !           445: The C runtime stack is the stack part of the address space 
        !           446: and the other two stacks are stored in the data space.
        !           447: .sh +1 C\ runtime\ stack
        !           448: The C-coded kernel uses this
        !           449: stack in the same way as a typical  C program
        !           450: use the stack:
        !           451: storing return addresses
        !           452: and non-lisp-object arguments to  subroutines,
        !           453: saving registers,
        !           454: and storing  local variables within a function.
        !           455: The lisp system stores 
        !           456: .i catch
        !           457: .i frames
        !           458: on this stack as well (to be described later).
        !           459: .pp
        !           460: The lisp system assumes that there are no lisp data on the stack and
        !           461: thus  the use of this stack 
        !           462: by a program is unrestricted.
        !           463: As will be discussed later on, the 
        !           464: .b calls
        !           465: and 
        !           466: .b ret
        !           467: instructions are used for most subroutine calls and returns.
        !           468: These instructions expect the stack to look a certain way.
        !           469: .sh +0 Namestack
        !           470: .pp
        !           471: The namestack contains only lisp data.
        !           472: It is used to pass arguments to functions and to  hold
        !           473: local (lexically scoped) data within lisp functions.
        !           474: It is also used as a temporary storage spot for lisp data
        !           475: which must be protected from garbage collection.
        !           476: .pp
        !           477: A slight digression on the garbage collector:
        !           478: The person who writes code for the lisp system must always be aware of
        !           479: his greatest enemy: the garbage collector.  
        !           480: Whenever a function is called that could possibly allocate more lisp
        !           481: data, one must assume that it 
        !           482: when it attempts to allocate space,  the garbage collector
        !           483: will be invoked and that it will take away everything that isn't protected
        !           484: from garbage collection.
        !           485: The objects that are protected from garbage collection are: the
        !           486: namestack, the bindstack, the oblist (table of interned symbols),
        !           487: and the compiler literals.  Objects that are only 
        !           488: referred to by values in registers or
        !           489: the C runtime stack will not be seen by the mark phase of the
        !           490: garbage collector and will be swept up during the sweep phase.
        !           491: .pp
        !           492: Back to the namestack:
        !           493: The first free element on the namestack is pointed to by the
        !           494: C variable 
        !           495: .i np .
        !           496: This variable is always stored in the VAX register r6.
        !           497: Another pointer into the namestack is the C variable
        !           498: .i lbot .
        !           499: It is always stored in VAX register r7.
        !           500: Its use will be described in the section on calling conventions.
        !           501: .sh +0 Bindstack
        !           502: .pp
        !           503: The bindstack is a stack of lisp object pairs: symbol and
        !           504: value.
        !           505: It is used to implement shallow binding.
        !           506: When a symbol is lambda-bound, the symbol and its current value are put
        !           507: on this stack.
        !           508: Then the symbol can be given a new value.
        !           509: When the context of the lambda binding is over, the symbol and value pair
        !           510: are popped from the stack and the symbol is given its old value.
        !           511: The C variable
        !           512: .i bnp
        !           513: points to the first free entry on the bindstack.
        !           514: In the C code, the following macros lambda-bind a symbol to a value
        !           515: and restore the old value:
        !           516: .(b
        !           517: #define PUSHDOWN(atom,value)\e
        !           518:        {bnp->atm=(atom);\e
        !           519:          bnp++->val=(atom)->a.clb;\e
        !           520:         (atom)->a.clb=value;\e
        !           521:        if(bnp>bnplim) binderr();}
        !           522: 
        !           523: #define POP\e
        !           524:        {--bnp;  bnp->atm->a.clb=bnp->val;}
        !           525: .)b
        !           526: .sh -1 Bitmap
        !           527: .pp
        !           528: The bitmap is used in garbage collection to hold
        !           529: the mark bits for the mark and sweep garbage collector.
        !           530: As its names implies, it is viewed as a collection of bits.
        !           531: The minimum size of a lisp data object is 4 bytes, thus
        !           532: there are 128 of those on a VAX page of 512 bytes.
        !           533: Since there are 8 bits in a byte, it takes 16 bytes to obtain 128 bits.
        !           534: Therefore the size of the bitmap in bytes is 16 
        !           535: times the maximum number of
        !           536: pages.
        !           537: Like the typetable, the bitmap  keeps track of marks from the
        !           538: bottom of memory, not the bottom of data space.
        !           539: The bitmap and the typetable are static structures.
        !           540: It is their size, which is determined when the C kernel is built,
        !           541: which determines the size to which the data segment can grow.
        !           542: .sh +0 Transfer\ Tables
        !           543: .pp
        !           544: Transfer tables are used by compiled lisp code as a transfer vector to
        !           545: other functions.
        !           546: A transfer table consists of pairs of entries: symbol and pointer to
        !           547: function.
        !           548: When a compiled lisp function  calls another (non-local) function, it
        !           549: calls indirectly through an entry in the transfer table.
        !           550: Depending on the setting of certain status variables, the call may
        !           551: bring control into a function linkage routine or directly to
        !           552: the function desired (if that function is a compiled lisp or C function).
        !           553: .pp
        !           554: Transfer tables serve a number of useful purposes.
        !           555: .np
        !           556: They allow compiled code to call interpreted code
        !           557: or compiled code using the same calling sequence.
        !           558: .np
        !           559: They allow the selection of which function to call to be determined
        !           560: at runtime based on the name of the function.
        !           561: In most other languages, this selection process is done at either
        !           562: compile or load time.
        !           563: .np
        !           564: They allow the user to run in a debugging mode where all calls
        !           565: from compiled code go through the interpreter.
        !           566: Once control reaches the interpreter, it is easier to debug.
        !           567: .np
        !           568: They allow the user to run in a production mode, where all calls
        !           569: from compiled to other compiled code are done without function
        !           570: lookup overhead.
        !           571: .np
        !           572: They allow the user to switch between debugging and production modes
        !           573: at runtime with one function call.
        !           574: .pp
        !           575: Transfer tables will be described further  in the section on 
        !           576: the lisp compiler.
        !           577: .sh +0 Catch\ frames
        !           578: .pp
        !           579: Lisp has a number of forms of non-local transfers.
        !           580: Among them are 
        !           581: .i throw ,
        !           582: .i error ,
        !           583: .i return 
        !           584: and
        !           585: .i go .
        !           586: If a program is willing to accept a non-local transfer, it puts a 
        !           587: .i catch
        !           588: .i frame
        !           589: on the stack which describes which type of transfer it 
        !           590: accepts.
        !           591: The catch frame describes the current state of the registers,
        !           592: the variables np, lbot, and bnp, and also contains entries that describe
        !           593: what kinds of non-local transfers the function will accept.
        !           594: After creating a catch frame, the program  goes on to evaluate forms.
        !           595: Should the evaluation of one of those forms result in a non-local
        !           596: transfer to the catch frame that this program has set up, the 
        !           597: system will restore the namestack and bindstack to the way
        !           598: they were when the program put the catch frame on the stack (by using
        !           599: np and bnp) and will return control to the program (setting 
        !           600: the variable retval to describe why it returned).
        !           601: This is described further in the 
        !           602: section on interpreter conventions.
        !           603: .pp
        !           604: The C variable 
        !           605: .i errp
        !           606: points to the most recent catch frame, and then each catch frame points
        !           607: to the previous catch frame.
        !           608: .sh +0 oblist
        !           609: .pp
        !           610: Normally when symbols are created they are
        !           611: .i interned ,
        !           612: that is they are put in a hash table called an oblist (or obarray).
        !           613: The purpose of the oblist is to insure that if you type a
        !           614: symbol's name to the reader, you will always get  the  same symbol.
        !           615: Also it protects the symbol from garbage collection.
        !           616: The oblist is simply a hash table with buckets, with a hash link being
        !           617: part of the symbol data structure.
        !           618: Currently the hash table is not accessible to a lisp program, but with
        !           619: a little work it could be.
        !           620: .+c Interpreter
        !           621: .sh 0 _
        !           622: .nr $1 \n(ch
        !           623: .pp
        !           624: The interpreter is composed of these parts:
        !           625: .ip \fIcore:\fP
        !           626: The functions eval, apply and funcall.
        !           627: .ip \fIstack\ management:\fP
        !           628: The code to create catch frames and handle non-local transfers.
        !           629: .ip \fImemory\ management:\fP
        !           630: The code to allocate and garbage collect memory.
        !           631: .ip \fIthe\ functions:\fP
        !           632: The user callable functions that expect lisp arguments and return
        !           633: lisp values.
        !           634: .pp
        !           635: In the above list, the first three are written mainly in C (with a little
        !           636: assembler) and the last is written mainly in Lisp with a little 
        !           637: bit in C
        !           638: and a very small amount in assembler.
        !           639: .pp
        !           640: The core functions are the center of activity  in the interpreter.
        !           641: The 
        !           642: .i eval 
        !           643: function given a lisp expression  to evaluate.  
        !           644: It must determine if it is a simple expression such as a symbol or number
        !           645: whose value eval can determine immediately, or if it is an function
        !           646: calling expression.
        !           647: If the form is a function call, eval must determine if the arguments should
        !           648: be evaluated and if so eval must recursively call itself to evaluate the
        !           649: arguments and then stack them.
        !           650: If the function called is to be interpreted, eval must lambda-bind the
        !           651: formal variables of the function to the arguments just stacked.
        !           652: If the function being called is compiled, eval just calls the function and
        !           653: lets the function do the lambda binding if it wants to.
        !           654: .pp
        !           655: The 
        !           656: .i apply 
        !           657: function is passed two lisp objects: 
        !           658: a function to call (either a symbol or a functional object)
        !           659: and a list of arguments already evaluated.
        !           660: It will do lambda binding if the function to call is to
        !           661: be interpreted.
        !           662: .pp
        !           663: The 
        !           664: .i funcall
        !           665: function is passed any number of lisp objects,
        !           666: the first of which is the function to call, and the rest are the
        !           667: arguments to the function 
        !           668: which have already been evaluated and stacked.
        !           669: Funcall will  do lambda binding if the function to call is
        !           670: to be interpreted.
        !           671: .pp
        !           672: When compiled lisp code calls a function 
        !           673: which must be interpreted, it enters the interpreter through the
        !           674: funcall function.
        !           675: The interpreter may go to compiled code through either eval, apply or
        !           676: funcall, though most often it goes through eval.
        !           677: .sh 2 Conventions
        !           678: .pp
        !           679: These are the conventions used in interpreted and compiled code.
        !           680: .sh +1 C\ conventions
        !           681: .pp
        !           682: Our conventions are extensions of the C calling conventions, so we
        !           683: describe here the conventions for C.
        !           684: The VAX has 16 general
        !           685: purpose registers.
        !           686: Registers r12 through r15 are reserved
        !           687: for use by the hardware because we use the
        !           688: .i calls
        !           689: subroutine call.
        !           690: Registers r0-r5 can be used by a program without saving them first.
        !           691: The result of a function call is returned in r0.
        !           692: Registers r11-r6 are allocated  (from high to low)
        !           693: by the C compiler when a 
        !           694: .i register
        !           695: declaration is made in the C code.
        !           696: Registers r11-r6 must be
        !           697: saved upon entry and restored upon exit if they are used within a function.
        !           698: On the VAX it is very easy to preserve registers
        !           699: since it is done automatically
        !           700: by the hardware if the
        !           701: .i calls
        !           702: (or 
        !           703: .i callg ) 
        !           704: instruction is used.
        !           705: The first short word (two bytes) of a subroutine is a 
        !           706: register save mask which
        !           707: tells which registers should be saved (on the C runtime stack) upon
        !           708: entry and restored when a 
        !           709: .i ret
        !           710: instruction is done.
        !           711: .sh +0 np\ and\ lbot
        !           712: .pp
        !           713: Earlier we  mentioned that the C variables np and lbot are 
        !           714: always stored in
        !           715: registers r6 and r7.
        !           716: It is very difficult to globally assign a variable to a register
        !           717: in the C language (no external register declarations
        !           718: are permitted)
        !           719: and thus we have to filter
        !           720: the output of the  C compiler and convert all occurrences of 'np' to 'r6'
        !           721: and all occurrences of 'lbot' to 'r7'.
        !           722: This is only half the job, though.
        !           723: We also must modify  the register save masks for those routines which 
        !           724: alter the value of np or lbot to insure that those registers get 
        !           725: saved and restored upon function entry and exit.
        !           726: This is done in the C code by putting 
        !           727: .(b C
        !           728: Savestack(n)
        !           729: .)b
        !           730: at the beginning of a routine which will alter np or lbot and which
        !           731: also allocates n register variables.
        !           732: Also in that routine, before the function returns, we put 
        !           733: .(b C
        !           734: Restorestack()
        !           735: .)b
        !           736: This is not really necessary in the VAX, but it is there for other 
        !           737: machine implementations which don't have a 
        !           738: .i ret
        !           739: function which restores registers.
        !           740: The calls to Savestack are recognized by a filter which processes
        !           741: the output
        !           742: of the C compiler and fixes up 
        !           743: the save masks.
        !           744: .sh +0 Function\ calling
        !           745: .pp
        !           746: The following text describes what the conventions are for 
        !           747: calling
        !           748: compiled lisp functions, whether they were written in lisp or C.
        !           749: We look at it from the viewpoint of the called function.
        !           750: .pp
        !           751: Upon entry
        !           752: to the compiled functon,
        !           753: lbot points to the first argument and np points to the
        !           754: first free spot on the namestack.
        !           755: If np equals lbot then there are no arguments.
        !           756: Recall that np will be in r6 and lbot in r7.
        !           757: The function is free to alter registers r0 through r5 and should return
        !           758: the result in r0.
        !           759: The function may alter registers r6 through r11 as long as their 
        !           760: original values
        !           761: are restored when the function returns.
        !           762: The value of np should always 
        !           763: point to the first free entry in the namestack.
        !           764: This is all that is required.
        !           765: The extra conventions followed by 
        !           766: the lisp compiler 
        !           767: in the code it generates are described in the next chapter.
        !           768: .sh +0 Catch\ frames
        !           769: .pp
        !           770: A catch frame saves the state of execution that a program
        !           771: might  want to return to at some later time.
        !           772: A catch frame  is stored on the C runtime stack, with the most recent 
        !           773: frame pointed to by the C global variable errp.
        !           774: The information saved is 
        !           775: .ip \fIargs\fP
        !           776: - one or two optional arguments. The lisp compiler always 
        !           777: stacks two 
        !           778: arguments since it must know exactly how large the frame is.
        !           779: .ip \fIclass\fP
        !           780: - the class describes what type of frame this is (described below).
        !           781: .ip \fIretaddr\fP
        !           782: - address to return to if returning from this frame
        !           783: .ip \fIolderrp\fP
        !           784: - pointer to next older catch frame on the stack
        !           785: .ip \fIbnp\fP
        !           786: - value of bnp (bindstack pointer) when the frame was created
        !           787: .ip \fInp\fP
        !           788: - value of np when the frame was created
        !           789: .ip \fIlbot\fP
        !           790: - value of lbot when the frame was created. 
        !           791: .ip \fIsystem\ dependent\fP
        !           792: - the rest of the information stacked depends on the particular machine.
        !           793: In the case of the VAX, registers r13 through r8 are stacked. (r14 and
        !           794: r15 are the stack pointer and program counter; they are not saved since
        !           795: they can be calculated from the other information).
        !           796: .pp
        !           797: The information in a catch frame is put on the C runtime stack
        !           798: in the precise order given above, and the variable errp points not
        !           799: at the beginning of the entire frame, but to the lbot entry.
        !           800: Thus errp\ +\ 4 points to np.
        !           801: The classes of frames are described below.
        !           802: Each class is defined as a constant in the C code (h/frame.h) whose value
        !           803: is a small integer. 
        !           804: .ip F_PROG\ [1]
        !           805: - takes no arguments.  This frame marks the beginning of a piece of
        !           806: code which can accept a 
        !           807: .i return
        !           808: or a 
        !           809: .i go .
        !           810: The interpreter uses this to implement
        !           811: .i prog
        !           812: and 
        !           813: .i do
        !           814: and for its primative break loop.
        !           815: The lisp compiler does not use catch frames for progs since it only
        !           816: permits 
        !           817: .i returns
        !           818: and
        !           819: .i gos
        !           820: to occur within
        !           821: .i progs 
        !           822: or 
        !           823: .i dos
        !           824: and thus it can determine how to do the non-local transfer
        !           825: at compile time.
        !           826: .ip F_CATCH\ [2]
        !           827: - this takes one or two arguments and is used to implement both
        !           828: .i catch
        !           829: and 
        !           830: .i errset .
        !           831: In both cases 
        !           832: the first argument is the tag which is being caught,
        !           833: which in the case of an 
        !           834: .i errset
        !           835: is symbol ER%all.
        !           836: An 
        !           837: .i errset
        !           838: also
        !           839: supplies a second argument which is non nil if the error message
        !           840: should be printed.
        !           841: .ip F_RESET\ [3]
        !           842: - in the C kernel, the reset function 
        !           843: is implemented as a non local transfer to a F_RESET catchframe.
        !           844: When the lisp system is built, the reset function is redefined to
        !           845: do a 
        !           846: .i throw.
        !           847: Thus this type of catch frame is rarely used.
        !           848: .ip F_EVAL\ [4]
        !           849: - this has one argument, the form being evaluated. 
        !           850: Since stacking this
        !           851: on every eval would be expensive,
        !           852: this type of catch frame is only stacked if a \fI(*rset\ t)\fP
        !           853: has been done and if the value of the symbol 
        !           854: .i *rset
        !           855: is non nil.
        !           856: The form being evaluated is stacked  so that 
        !           857: the necessary information for the
        !           858: .i evalframe
        !           859: function is available and so that 
        !           860: .i freturn
        !           861: can return from the context of any pending evaluation.
        !           862: .ip F_FUNCALL\ [5]
        !           863: - this is much like F_EVAL, 
        !           864: except the one argument it takes is the
        !           865: name of the function to call.  
        !           866: It has the same restrictions on when it is stacked as  F_EVAL
        !           867: and for the same reasons.
        !           868: .pp
        !           869: In C, a frame is pushed on the stack with Pushframe, with a call
        !           870: of one of these forms:
        !           871: .(b
        !           872: errp = Pushframe(class);
        !           873: errp = Pushframe(class,arg1);
        !           874: errp = Pushframe(class,arg1,arg2);
        !           875: .)b
        !           876: After the call the C variable 
        !           877: .i retval
        !           878: contains the value which describes why this function returned.
        !           879: You must remember that it is possible for this one function call to
        !           880: return more than once! 
        !           881: The reasons it can return are these (from h/frame.h):
        !           882: .ip C_INITIAL\ [0]
        !           883: This is the initial call to set up the frame.
        !           884: .ip C_GO\ [1]
        !           885: This will only happen for F_PROG frames. 
        !           886: In this case, 
        !           887: the C variable lispretval contains a lisp symbol which is the tag
        !           888: to which control should be transferred.
        !           889: If the tag cannot be found in this prog or do body, the
        !           890: tag should be again thrown up the stack to a next higher prog or do.
        !           891: .ip C_RET\ [2]
        !           892: This will only happen for F_PROG frames.
        !           893: In this case, lispretval contains the value to return from this prog.
        !           894: .ip C_THROW\ [3]
        !           895: This will only happen for F_CATCH frames.
        !           896: In this case lispretval contains the value to return from this catch.
        !           897: .ip C_RESET\ [4]
        !           898: This will only happen for F_RESET frames.
        !           899: It simply means that a reset is being done.
        !           900: .ip C_FRETURN\ [5]
        !           901: This will only happen for F_EVAL and F_FRETURN frames.
        !           902: The variable lispretval contains the value to return from this
        !           903: call to 
        !           904: .i eval
        !           905: or
        !           906: .i funcall .
        !           907: .pp
        !           908: The call to Pushframe is turned into a simple subroutine call (using
        !           909: the 
        !           910: .i jsb
        !           911: instruction instead of the
        !           912: .i calls )
        !           913: by the filters which alter the code coming out of the C compiler.
        !           914: Thus it may be useful to see what stacking a catch frame really looks
        !           915: like.  
        !           916: Here is what the lisp compiler generates 
        !           917: to stack the frame for \fI(*catch\ 'tag\ x)\fP
        !           918: .(b
        !           919: movl   0(r8),r0        #move 'tag to r0
        !           920: pushl  $0              # dummy argument
        !           921: pushl  r0              # put tag argument on C runtime stack
        !           922: pushl  $2              # push F_CATCH
        !           923: jsb    _qpushframe     # call Pushframe
        !           924: movl   r0,_errp        # move result into errp
        !           925: .)b
        !           926: .pp
        !           927: Every function which does a Pushframe() must also do a Popframe()
        !           928: before it returns to its caller.
        !           929: This simply removes the frame from the stack.
        !           930: In C it is written:
        !           931: .br
        !           932: .tl ''errp = Popframe()''
        !           933: in the code generated by the lisp compiler it looks like:
        !           934: .(b
        !           935: movl   _errp,r1        # put current errp in r1
        !           936: movl   12(r1),_errp    # put previous errp in errp
        !           937: addl3  $32,r1,sp       # pop frame from stack
        !           938: .)b
        !           939: .pp
        !           940: Non-local transfers are done with the Inonlocalgo
        !           941: function. This function always takes three arguments. 
        !           942: The
        !           943: first is the return type (one of the types mentioned
        !           944: above which begin with 'C_'). It will be assigned to retval.
        !           945: The second argument is the value to be assigned to lispretval,
        !           946: except in the case of a C_THROW, where the second argument
        !           947: is the tag to throw to and the third argument is the value
        !           948: to assign to lispretval.
        !           949: This function never returns. 
        !           950: If it doesn't find a catch frame
        !           951: which matches what it is looking for, it signals an error.
        !           952: The function is called with 
        !           953: .i calls
        !           954: and the arguments are stacked on the C runtime stack, not the 
        !           955: namestack.
        !           956: .+c Liszt:\ The\ Lisp\ Compiler
        !           957: .sh 0 _
        !           958: .nr $1 \n(ch
        !           959: .pp
        !           960: The purpose of compiling a lisp function is to create a representation of 
        !           961: the function which can be evaluated in less time and perhaps take
        !           962: up less space.
        !           963: There are two approaches to lisp compilers.
        !           964: One is to convert the functions to a compact form, often called 
        !           965: .i bytecodes
        !           966: which can be rapidly interpreted.
        !           967: Each bytecode represents a primitive operation in the lisp system.
        !           968: This approach is simple to implement but there is an time penalty
        !           969: in using an  interpreter.
        !           970: The other approach is to compiled to  machine code.
        !           971: In general, this is not as portable as the bytecode  approach but the result
        !           972: generally runs faster.
        !           973: There are two ways of compiling to  machine code. 
        !           974: One is to place arguments to functions in registers. 
        !           975: If there are more arguments than registers, the arguments are put on
        !           976: a stack and a special type of call is made.
        !           977: This method is used in Maclisp.
        !           978: The other method is to assume a stack model, in which
        !           979: arguments to a function are placed on a stack.
        !           980: This is what the 
        !           981: .Fr
        !           982: compiler Liszt does.
        !           983: The stack model made it much easier to write parts of the lisp
        !           984: system in the C langauge.
        !           985: It also simplifies the garbage collector since the mark phase doesn't
        !           986: have to locate and peruse all saved registers looking for lisp data.
        !           987: .sh 2 File\ Compiler
        !           988: .pp
        !           989: When a file of lisp expressions is loaded,
        !           990: the 
        !           991: .i load 
        !           992: function repeatedly reads and evaluates the forms in the
        !           993: file.
        !           994: Some of these evaluations may result in functions being defined, 
        !           995: and others may use the functions just defined or previously defined.
        !           996: The job of the lisp compiler is to create an object file, which,
        !           997: when read in 
        !           998: by 
        !           999: .i fasl,
        !          1000: acts just like it was 
        !          1001: .i load ed
        !          1002: in, except when a function is defined it is defined in machine
        !          1003: code, not as a lisp expression.
        !          1004: This is quite a bit different from what compilers do in other languages
        !          1005: and it is done this way to make it easier to 
        !          1006: switch between compiled and
        !          1007: interpreted code.
        !          1008: .sh +0 Differences\ between\ compiled\ and\ interpreted
        !          1009: .pp
        !          1010: There are some differences, though, between compiled and interpreted code.
        !          1011: Local variables in compiled code are lexically scoped (put on the
        !          1012: namestack and inaccessible to other functions) unless the variable
        !          1013: has been declared 
        !          1014: .i special.
        !          1015: The declaration:
        !          1016: .(b
        !          1017: \fI(declare (special x y))\fP
        !          1018: .)b
        !          1019: declares both x and y to be special from this point in the file on.
        !          1020: The declaration 
        !          1021: .(b
        !          1022: \fI(declare (specials t))\fP
        !          1023: .)b
        !          1024: declares all variables to be special.
        !          1025: .pp
        !          1026: Lisp has a very powerful macro definition system.
        !          1027: The compiler will macro expand all it can, whereas the interpreter
        !          1028: expands macros when necessary but never replaces a macro call with
        !          1029: its expansion.
        !          1030: Thus if you redefine a macro, the compiled code that uses it will
        !          1031: still work the same way, but the interpreted code will use the 
        !          1032: new definition. 
        !          1033: Also, when compiling a file, macro definitions must be
        !          1034: done before any call on the macro  is encountered during compiling.
        !          1035: In the interpreter, 
        !          1036: macros can be defined or redefined anytime up until
        !          1037: they are used.
        !          1038: Thus the interpreter is far freer about macro definitions than 
        !          1039: the compiler.
        !          1040: This could cause programs which work interpreted to
        !          1041: fail compiled.
        !          1042: .sh +0 Top\ level\ algorithm
        !          1043: .pp
        !          1044: The top level algorithm of the compiler is simply this: 
        !          1045: .np 
        !          1046: read a lisp expression from the input file
        !          1047: .np 
        !          1048: macro expand the top level of the form as much as possible
        !          1049: .np 
        !          1050: if the form is a function definition (a list beginning with
        !          1051: the symbol 'def')
        !          1052: then compile it.
        !          1053: .np 
        !          1054: if the form is not a function definition then put it on a list of 
        !          1055: forms that will be evaluated when the file is 
        !          1056: .i fasl ed
        !          1057: in.
        !          1058: .np
        !          1059: if not at end of file, go to step 1.
        !          1060: .pp
        !          1061: In reality, step 3 is a bit more complex.
        !          1062: If the definition is of a macro, then the form will be evaluated
        !          1063: immediately, thus adding the macro definition to the compiler's 
        !          1064: environment.
        !          1065: If the lisp variable 
        !          1066: .i macros 
        !          1067: is t then the macro will also be compiled.
        !          1068: There are also some forms like \fI(progn\ 'compile\ ...)\fP
        !          1069: and \fI(eval-when\ ()\ )\fP which determine when the 
        !          1070: forms get compiled and evaluated.  
        !          1071: See the lisp manual for details.
        !          1072: .sh +0 Expression\ Compilation
        !          1073: .pp
        !          1074: Just as the interpreter is centered around the 
        !          1075: .i eval
        !          1076: function, the lisp compiler is centered around the function
        !          1077: .i d-exp
        !          1078: whose job it is to compile the expression passed to it.
        !          1079: The lisp compiler is one pass.
        !          1080: Before a call to d-exp returns,
        !          1081: all the compiled code for a form has been computed and written
        !          1082: to a file.
        !          1083: One reason that the lisp compiler can be a single pass compiler
        !          1084: is that the assembler which reads the compiler's output is 
        !          1085: a two pass assembler.
        !          1086: .sh +1 global\ state\ variables
        !          1087: .pp
        !          1088: There are a number of variables that maintain the state of
        !          1089: the compilation process.
        !          1090: These variables are declared special and are thus dynamically scoped
        !          1091: and visible to any function within the compiler.
        !          1092: When d-exp 
        !          1093: is called their meanings are:
        !          1094: .ip \fIv-form\fP
        !          1095: - contains the expression to be compiled.
        !          1096: .ip \fIg-loc\fP
        !          1097: - tells where the result of evaluating this expression should be put.
        !          1098: If g-loc is nil, then the value returned is unimportant and shouldn't
        !          1099: be put anywhere.
        !          1100: .ip \fIg-cc\fP
        !          1101: - controls conditional branches.
        !          1102: If g-cc is non nil, then it is a list cell whose value has
        !          1103: either a non-null car or non-null cdr but not both. 
        !          1104: If the car is non-nil then
        !          1105: if  the  value  of  the expression held in 
        !          1106: v-form  is  non-nil, a branch should be done to the symbol
        !          1107: stored  in  \fI(car\ g-cc)\fP. 
        !          1108: If the cdr is non-nil  then if the value of v-form
        !          1109: is  nil,  a branch should be done to the symbol stored in 
        !          1110: \fI(cdr\ g-cc)\fP. Since at
        !          1111: every  conditional  
        !          1112: branch  control should  either jump or continue, the car and cdr will
        !          1113: never both be specified.
        !          1114: .ip   \fIg-ret\fP
        !          1115: - is non nil if the result of evaluating v-form will be returned as the
        !          1116: value of the function we are evaluating. 
        !          1117: This is used to allow liszt to detect
        !          1118: when tail recursion removal is possible.
        !          1119: .ip  \fIg-locs\fP
        !          1120: - maintains current information about the state of the stacks:
        !          1121: the bindstack (for specials), the namestack (for locals) and the
        !          1122: C runtime stack (for catch frames)
        !          1123: The form of g-locs is a stack of tokens stored as a list.
        !          1124: The meaning of each token is as follows:
        !          1125: .br
        !          1126: \fInil\fP - this represents an unnamed object on the namestack.  This happens
        !          1127: when calling functions, when the arguments are stacked prior
        !          1128: to a function call.
        !          1129: .br
        !          1130: \fI<symbol>\fP - the given symbol is the name of a local variable on the namestack.
        !          1131: .br
        !          1132: \fI(prog . <n>)\fP - prog statement which binds <n> special variables
        !          1133: .br
        !          1134: \fI(do . <n>)\fP - head of a do statement which binds <n> special variables
        !          1135: .br
        !          1136: \fI(catcherrset . 0)\fP - catch or errset frame on stack
        !          1137: .br
        !          1138: \fI(lambda . <n>)\fP - lambda expression which binds <n> special variables
        !          1139: .ip \fIg-labs\fP
        !          1140: - this is a stack of labels. 
        !          1141: There is one entry in g-labs for every entry which is a list
        !          1142: in g-locs.
        !          1143: the forms of the entries are:
        !          1144: .br
        !          1145: \fInil\fP - no labels in this form
        !          1146: .br
        !          1147: \fI(endlab (real . gen) (real2 . gen2) ...)\fP - endlab is the label to go to
        !          1148: to get out of this body.  After this label will be code
        !          1149: to unbind specials and pop off locals.
        !          1150: The 'real' labels are those found in the code. the gen
        !          1151: labels are those generated and put in the assembler code.
        !          1152: .sh +0 Function\ compilation
        !          1153: .pp
        !          1154: The action d-exp takes when compiling an expression depends on the
        !          1155: type of expression.
        !          1156: For atoms (symbols and numbers) the compilation is very simple, it
        !          1157: is just a matter of putting the value where g-loc specifies and 
        !          1158: then jumping if specified by g-cc.
        !          1159: When the expression is a list, d-exp first macro
        !          1160: expands the form and then looks at the first element of 
        !          1161: the list (if the list has not macro expanded to an atom).
        !          1162: If the first element is a symbol  then the list is
        !          1163: is a function call.
        !          1164: If the function is one of the functions which liszt knows how to open compile
        !          1165: then liszt will call the particular routine designated to open
        !          1166: compile this function.
        !          1167: There are two classes of functions within liszt that
        !          1168: do open compiling.
        !          1169: The first class, the fl-expr class, are distinguished by names which
        !          1170: begin with c-.
        !          1171: These functions always generate
        !          1172: code which returns the result in r0.
        !          1173: They are not equipped to obey the contents of g-loc and g-cc.
        !          1174: Thus d-exp, after calling one of these functions, must do what
        !          1175: is necessary to obey g-loc and g-cc.
        !          1176: The other class of open compiling functions, the fl-exprcc class (whose
        !          1177: names begin with cc-),
        !          1178: handle g-loc and g-cc.
        !          1179: As a result these are usually much more complex and generate better code.
        !          1180: .sh -1 Register\ Conventions
        !          1181: .pp
        !          1182: The register conventions used by liszt are an extension of those
        !          1183: used by the C code.
        !          1184: .ip \fIr0\fP
        !          1185: - return value placed here
        !          1186: .ip \fIr1,r2,r3,r4\fP
        !          1187: - scratch registers.
        !          1188: When long strings of 
        !          1189: .i car's
        !          1190: or
        !          1191: .i cdr's
        !          1192: are done (such as
        !          1193: .i caddadr )
        !          1194: these registers are used in a least-recently-used fashion to access down
        !          1195: the list. 
        !          1196: The compiler keeps track of the values in these registers but must
        !          1197: assume that they are garbage after a function is called.
        !          1198: .ip \fIr5\fP
        !          1199: - fixnum accumulator.
        !          1200: There a number of functions which work on fixnum's only and these
        !          1201: usually put their result in r5.
        !          1202: The assembler code function 
        !          1203: .i qnewint
        !          1204: which returns a pointer to a cell containing a fixnum value (after
        !          1205: checking if it is in the fixnum table), expects its argument to be
        !          1206: in r5.
        !          1207: .ip \fIr6\fP
        !          1208: np
        !          1209: .ip \fIr7\fP
        !          1210: lbot.
        !          1211: When calling a function, this register is set just before the function
        !          1212: call, and after the function call its value is used to reset the value
        !          1213: of np in order to pop the arguments off the namestack.
        !          1214: .ip \fIr8\fP
        !          1215: the literal table pointer.
        !          1216: The compiler generates a table of all the literal lisp data
        !          1217: which the compiled code might access.
        !          1218: Upon function entry a pointer to the base of this table is put in r8.
        !          1219: For example, if we compile \fI(setq\ x\ 'foo)\fP then we will
        !          1220: need a pointer to the lisp symbol
        !          1221: .i foo
        !          1222: and if the symbol
        !          1223: .i x
        !          1224: as been declared special we will also need a pointer to 
        !          1225: .i x .
        !          1226: .ip \fIr10\fP
        !          1227: - upon function entry the value of lbot (r7) is put in r10.  
        !          1228: This allows us to reference the arguments to our function while
        !          1229: using lbot to call other function.
        !          1230: .sh +0 Addresses
        !          1231: There are four types of addresses used internally in Franz: 
        !          1232: symbolic, intermediate addresses (iadr), extended intermediate (eiadr)
        !          1233: and vax assembler format.
        !          1234: .sh +1 Symbolic
        !          1235: .pp
        !          1236: These are the names of lisp objects, such as `a', `foo', 34,
        !          1237: or 3.234234.
        !          1238: A name is either a local variable, a special variable or a
        !          1239: number.  A number is either a small fixnum, large fixnum,
        !          1240: bignum or floating point number.
        !          1241: .sh +0 Intermediate\ address\ (IADR)
        !          1242: .pp
        !          1243: This type of address is generated from a symbolic address by
        !          1244: .i d-loc, 
        !          1245: .i d-loclit 
        !          1246: and the routines 
        !          1247: .i d-simple
        !          1248: and 
        !          1249: .i d-rsimple 
        !          1250: which
        !          1251: call them. The forms are
        !          1252: .ip \fINil\fP
        !          1253: - the location or value of nil.
        !          1254: .ip \fIT\fP   
        !          1255: - the location or value of t.
        !          1256: .ip \fIreg\fP
        !          1257: - register r0, which is where the result of function calls
        !          1258: are stored.
        !          1259: .ip \fIstack\fP
        !          1260: - the address pointed to by np with np incremented after
        !          1261: the value is stored.  (i.e  (r6)+)
        !          1262: .ip \fIunstack\fP
        !          1263: - the address one word below np (np is decremented before
        !          1264: accessing. (i.e. -(r6))
        !          1265: .ip \fI<symbol>\fP
        !          1266: - this is just <symbol>.  This allows strange forms to be
        !          1267: represented directly.
        !          1268: .ip \fI(stack\ n)\fP
        !          1269: - The n'th value on the namestack for this function.
        !          1270: The first value 0(r10) is (stack 1).
        !          1271: .ip \fI(vstack\ n)\fP
        !          1272: - The cdr of the n'th value on the namestack.  
        !          1273: That is, (stack 1) is *0(r10)
        !          1274: .ip \fI(bind\ n)\fP
        !          1275: - The value of the n'th value in 
        !          1276: the literal table.  If 
        !          1277: this refers to a symbol, then this is the value
        !          1278: of the symbol.
        !          1279: If this refers to a list then this
        !          1280: this is the cdr of the list (although this is a rare
        !          1281: use). (bind 1) corresponds to *0(r8).
        !          1282: .ip \fI(lbind\ n)\fP
        !          1283: - The n'th value in the literal table.  If 
        !          1284: this refers to 
        !          1285: object foo then this is the address of foo
        !          1286: in memory.
        !          1287: .ip \fI(fixnum\ n)\fP
        !          1288: - This is the address of small fixnum n in memory.
        !          1289: A small fixnum is in the range -1024 to 1023.
        !          1290: .ip \fI(immed\ n)\fP
        !          1291: - The is the immediate constant n. 
        !          1292: .sh +0 extended\ intermediate\ address\ (EIADR)
        !          1293: .pp
        !          1294: This address is generated from an IADR by e-cvt.  It 
        !          1295: symbolically represents a vax address.  
        !          1296: .ip \fI<atom>\fP
        !          1297: - This corresponds to <atom>.
        !          1298: .ip \fI(n\ <atom>)\fP
        !          1299: - This corresponds to n(<atom>)
        !          1300: (stack n+1) and (lbind n+1) are converted to these forms.
        !          1301: .ip \fI(n\ <atom1>\ <atom2>)\fP
        !          1302: - This corresponds to n(<atom1>)[<atom2>]
        !          1303: There is currently no IADR which generates this.
        !          1304: .ip \fI(*\ n\ <atom>)\fP
        !          1305: -This corresponds to *n(<atom>)
        !          1306: (vstack n+1) and (bind n+1) are converted to these forms.
        !          1307: .ip \fI(+\ <atom>)\fP
        !          1308: - This corresponds to (<atom>)+.
        !          1309: stack is converted to this form.
        !          1310: .ip \fI(-\ <atom>)\fP
        !          1311: - This corresponds to -(<atom>)
        !          1312: unstack is converted to this form.
        !          1313: .ip \fI($\ <numb>)\fP
        !          1314: - This corresponds to $<atom>
        !          1315: (immed <numb>) is converted to this form.
        !          1316: .ip \fI(#\ <numb>)\fP
        !          1317: - This corresponds to $<loc> where <loc> equals 
        !          1318: the base of the fixnums (0x1400) plus 4 * <numb>
        !          1319: (fixnum <numb>) is converted to this form
        !          1320: .ip \fI(*\ #\ <numb>)\fP
        !          1321: - This corresponds to $<numb>.  It is generated
        !          1322: by d-rsimple occasionally when you ask for the
        !          1323: cdr of a number (which you do sometimes when you
        !          1324: are compiling fixnum operators).
        !          1325: .sh +0 Vax\ Unix\ assembler\ addresses
        !          1326: .pp
        !          1327: The addresses are printed from a EIADR by e-cvtas.  The conversions
        !          1328: are shown in the above section.
        !          1329: .sh -1 Function\ calling\ convention
        !          1330: .sh +1 Function\ linkages
        !          1331: .pp
        !          1332: The function associated with a symbol is stored in the function 
        !          1333: definition slot of the symbol.  If the function slot contains a
        !          1334: list then the function is to be interpreted and its 'car' will 
        !          1335: be lambda, nlambda, lexpr or macro.  If the function is compiled then 
        !          1336: the function definition slot will contain a binary object. 
        !          1337: A binary object 
        !          1338: is composed of two independent parts: the discipline and the entry.
        !          1339: The discipline is one of:
        !          1340: .ip \fIlambda\fP
        !          1341: - a function whose arguments should be evaluated.
        !          1342: .ip \fInlambda\fP
        !          1343: - a function whose arguments should not be evaluated but
        !          1344: which should be passed as a list
        !          1345: .ip  \fImacro\fP
        !          1346: - a function which should be passed the unevaluated form
        !          1347: being evaluated and whose result should be evaluated.
        !          1348: .ip \fI\"subroutine\"\fP
        !          1349: - a foreign function subroutine
        !          1350: .ip  \fI\"integer-function\"\fP
        !          1351: - a foreign function returning an integer
        !          1352: .ip \fI\"real-function"\fP
        !          1353: - a foreign function returning a flonum.
        !          1354: .pp   
        !          1355: A lexpr is not in the list as there is no difference to the caller
        !          1356: between compiled lambda's and compiled lexprs.
        !          1357: .sh +0 Transfer\ tables
        !          1358: Most calls from compiled code to other lisp functions go through
        !          1359: transfer tables.  A transfer table entry is a pair: symbol, address
        !          1360: of routine. 
        !          1361: When another lisp function is called it uses the 
        !          1362: .i calls 
        !          1363: instruction which will
        !          1364: indirectly jump through the address in the transfer table.  This
        !          1365: address may point to the desired function or it may point to the
        !          1366: interface routine.  If a call ends up in the interface routine, 
        !          1367: then that routine will trace back through the call stack and eventually
        !          1368: reach the tranfer table entry that it was 'called through'. Since the
        !          1369: transfer table entry contains a symbol which names the function
        !          1370: to be called, the interface routine can determine which routine
        !          1371: was to have been called.  If that routine is compiled then the 
        !          1372: interface routine can modify the transfer table so that a call
        !          1373: through that table entry will go directly to the desired function.
        !          1374: If the routine to be called is interpreted, or if the user has
        !          1375: requested that transfer linkages should be disabled, then the
        !          1376: interface routine will go through the 'funcall' function 
        !          1377: in the interpreter to continue the call.  
        !          1378: .sh +0 calling\ sequence\ in\ the\ compiler:
        !          1379: .pp
        !          1380: When transfer tables are used
        !          1381: .(b
        !          1382:        \fBforeach\fP arg
        !          1383:         \fBcompute\fP arg and stack result using (r6)+
        !          1384:         for example: movl r0,(r6)+
        !          1385:        movab -n(r6),r7    where n = 4*number of args to fcn
        !          1386:        calls  $0,*trantb+m      where m is the index of the function
        !          1387:                                 in the transfer table.
        !          1388:        movl r7,r6              restore r6
        !          1389: .)b
        !          1390: .pp
        !          1391: The compiler supports local functions, which are function accessible
        !          1392: only within one file.
        !          1393: Because they are not accessible from C code, we can use a very fast
        !          1394: call and return sequence when calling them.
        !          1395: To call a local function
        !          1396: .(b
        !          1397:        \fBforeach\fP arg
        !          1398:         \fBcompute\fP and stack using (r6)+
        !          1399:        jsb     LOCALNAME       go directly to the function, it will 
        !          1400:                                restore r6 before it returns.
        !          1401: .)b
        !          1402: .pp
        !          1403: The beginning of each function looks as follows. 
        !          1404: First for a non-lexpr function called in the standard way
        !          1405: (topsim is the label jumped to when tail merging, it will be unique
        !          1406: for each function; the brackets indicate the optional code which
        !          1407: exists if the -p switch is given to liszt);
        !          1408: .(b
        !          1409:        .word   0x5c0           # save r6, r7, r8, r10
        !          1410:     [   movab  mcounts,r0      # if profiling, mcounts replaced by fasl
        !          1411:        jsb     mcount  ]
        !          1412:        movab   linker,r8       # set up r8
        !          1413:        movl    r7,r10          # set up oldlbot
        !          1414:        movab   n(r10),r6       # n = 4*Number of args expected.
        !          1415: topsim:
        !          1416: .)b
        !          1417: .pp
        !          1418: Now for lexprs:
        !          1419: .(b
        !          1420:        .word   0x5c0
        !          1421:     [   movab  mcounts,r0      # if profiling. [mcounts replaced by fasl]
        !          1422:        jsb     mcount  ]
        !          1423:        movab   linker,r8       # set up r8
        !          1424:        subl3   $4,r7,-(sp)     # address one word below bottom of args
        !          1425: topsim:
        !          1426:        movl    r6,r10          # first free addr to arg base
        !          1427:        subl3   r7,r6,r0        # number of args * 4 into r0
        !          1428:        movab   0x1400(r0),(r6)+ # stack boxed number of args
        !          1429:        movl    0(r10),-(sp)    # also store on stack so user can't clobber
        !          1430: .)b
        !          1431: .pp
        !          1432: And finally for local functions:
        !          1433: .(b
        !          1434:        movl    r10,-(sp)       # save r10
        !          1435:        movab   -n(r6),r10      # set up arg base using arg top
        !          1436: topsim:
        !          1437: .)b
        !          1438: .sh -1 Assembler\ file\ format
        !          1439: .pp
        !          1440: The liszt compiler generates a file which is in Unix assembler format.
        !          1441: The Unix assembler converts that file into an object file which fasl
        !          1442: then reads.
        !          1443: Although the object file generated is a standard Unix object
        !          1444: file (as defined in /usr/include/a.out.h), it is not of a 
        !          1445: format that  the Unix ld loader can understand.
        !          1446: This is because the requirements  of a lisp object file are different
        !          1447: from an object file of other languages.
        !          1448: The 
        !          1449: run time semantics of lisp and the fact that lisp data must be protected
        !          1450: from garbage collection are the principal differences.
        !          1451: The unconventional object file created by the unix assembler 
        !          1452: is a result of the
        !          1453: unconventional assembler input file.
        !          1454: Next we will look at what must be put in the assembler file and how
        !          1455: it is put there.
        !          1456: .pp
        !          1457: The assembler file must contain
        !          1458: .ip \fIinstructions\fP
        !          1459: - vax assembler code for the compiled functions. 
        !          1460: If there aren't any functions compiled, this can be empty.
        !          1461: .ip \fIliterals\fP
        !          1462: - lisp data which is referred to by compiled code
        !          1463: .ip \fItransfer\ table\fP
        !          1464: - the names of the functions which correspond to the calls through
        !          1465: the transfer table.
        !          1466: The instructions simply say 'call indirect through the nth transfer
        !          1467: table entry'.
        !          1468: .ip \fIfunction\ names\fP
        !          1469: - the names of the functions which are being 
        !          1470: defined.
        !          1471: .ip \fIload\ time\ forms\fP
        !          1472: - in order to mimic the 
        !          1473: .i load
        !          1474: function, fasl has to be able to evaluate lisp expressions at fasl 
        !          1475: time, so we must be able to store lisp expressions in the assembler
        !          1476: file and indicate when they should be evaluated.
        !          1477: .pp
        !          1478: Based on the requirements above, the form of the assembler file
        !          1479: is as described below.
        !          1480: The assembler  builds two segments: text and data.
        !          1481: We put all of our information in the text segment.
        !          1482: The compiler places some annotation strings in the data segment
        !          1483: so that the object file can be identified, however the data segment
        !          1484: is ignored by fasl.
        !          1485: The format is
        !          1486: .ip \fIcompiled\ instructions\fP
        !          1487: The instructions for each compiled  (non-local) function begins with
        !          1488: .(b
        !          1489:     .globl F00003
        !          1490:     F00003:
        !          1491:    .word 0x5c0
        !          1492: .)b
        !          1493: The globl declaration and the fact that the symbol name begins
        !          1494: with a F will tell fasl that this is the beginning of a lisp
        !          1495: function.
        !          1496: The symbols beginning with F must be unique within a file
        !          1497: but may be duplicated in other files.
        !          1498: The lisp  name of the function will appear later.
        !          1499: Next the instructions for the function are given.
        !          1500: Only a small fixed set of external symbols may be referenced.
        !          1501: The list is found in the code for nfasl.c and the common
        !          1502: ones will be described below (soon).
        !          1503: Labels should be given a name beginning with L and should
        !          1504: be unique within the file.
        !          1505: .ip \fItable\ sizes\fP
        !          1506: somewhere in the file there should be a pair of 'set' assembler
        !          1507: pseudo ops which describe the sizes of the literal table and
        !          1508: transfer table.
        !          1509: The form is this
        !          1510: .(b
        !          1511:     .set linker_size  4
        !          1512:     .set trans_size 3
        !          1513: .)b
        !          1514: where linker_size is the number of entries in the literal table
        !          1515: which will be required and trans_size is the number of entries
        !          1516: in the transfer table which will be required.
        !          1517: Those tables will be built by  fasl.
        !          1518: .ip \fIbinder\ table\fP
        !          1519: - this table describes  the order that the functions should be
        !          1520: defined and forms evaluated.
        !          1521: It is a table of longwords  beginning at the symbol bind_org
        !          1522: and ending when a -1 entry is seen.
        !          1523: The meaning of the entries will be described below.
        !          1524: .ip \fIlisp\ expression\ table\fP
        !          1525: - this is a table of null terminated strings beginning at the 
        !          1526: symbol lit_org and ending at the symbol lit_end.
        !          1527: Each string is read by the lisp read function (using the
        !          1528: raw readtable).
        !          1529: The first linker_size expressions are put in the literal table.
        !          1530: The next trans_size expressions are the names  of the
        !          1531: functions to put in the transfer table.
        !          1532: The remaining expressions correspond to the entries in the
        !          1533: binder table.
        !          1534: The binder entries are processed, one by one. 
        !          1535: Provided that the binder entry is not -1,  an expression
        !          1536: is read from the lisp expression table.
        !          1537: Then if the binder table entry is 0, that expression is the name of
        !          1538: a lambda type lisp function.
        !          1539: A binary object is created, the discipline is set to lambda and
        !          1540: the function address is set to the lexically next function 
        !          1541: defined in the
        !          1542: file.
        !          1543: If the binder entry is 1 then this is an nlambda function, and
        !          1544: if the entry is 2 then this is a macro function.
        !          1545: If the entry is 99 then the expression just read should be evaluated
        !          1546: by the lisp function eval.
        !          1547: .pp
        !          1548: The lisp compiler normally puts the assembler format output file in /tmp
        !          1549: and removes it when it is done. 
        !          1550: The -S switch will tell liszt to write the 
        !          1551: assembler file in the same place as
        !          1552: the source file, and with the same root name but a .s extension.
        !          1553: The -C file will tell the lisp compiler to comment the file as it
        !          1554: generates it, making it easier to understand what is going on.
        !          1555: Assume that the following is file foo.l:
        !          1556: .(b
        !          1557: (defun foo (x) (bar y) (baz k))
        !          1558: (print (foo 3))
        !          1559: (def test (nlambda (l) (print 'hithere) (foo 3)))
        !          1560: .)b
        !          1561: we now compile it with -SC
        !          1562: .(b
        !          1563:     .globl     F00007  #(fcn lambda foo)
        !          1564:     F00007:
        !          1565:     .word      0x5c0
        !          1566:     movab      linker,r8
        !          1567:     movl       r7,r10
        !          1568:     movab      4(r10),r6
        !          1569:     L00008:
        !          1570:     movl       *0(r8),(r6)+    #(calling bar)
        !          1571:            #(from y to stack)
        !          1572:     movab      -4(r6),r7
        !          1573:     calls      $0,*trantb+0
        !          1574:     movl       r7,r6
        !          1575:     movl       *4(r8),(r6)+    #(calling baz)
        !          1576:            #(from k to stack)
        !          1577:     movab      -4(r6),r7
        !          1578:     calls      $0,*trantb+8
        !          1579:     movl       r7,r6
        !          1580:     ret
        !          1581:     .globl     F00009  #(fcn nlambda test)
        !          1582:     F00009:
        !          1583:     .word      0x5c0
        !          1584:     movab      linker,r8
        !          1585:     movl       r7,r10
        !          1586:     movab      4(r10),r6
        !          1587:     L00010:
        !          1588:     movl       8(r8),(r6)+     #(calling print)
        !          1589:            #(from 'hithere to stack)
        !          1590:     movab      -4(r6),r7
        !          1591:     calls      $0,*trantb+16
        !          1592:     movl       r7,r6
        !          1593:     movl       $5132,(r6)+     #(calling foo)
        !          1594:            #(from (fixnum 3) to stack)
        !          1595:     movab      -4(r6),r7
        !          1596:     calls      $0,*trantb+24
        !          1597:     movl       r7,r6
        !          1598:     ret
        !          1599:     bind_org:
        !          1600:     .set linker_size,  3
        !          1601:     .set trans_size,   4
        !          1602:     .long      0
        !          1603:     .long      99
        !          1604:     .long      1
        !          1605:     .long -1
        !          1606:     lit_org:
        !          1607:     .asciz "y"
        !          1608:     .asciz "k"
        !          1609:     .asciz "hithere"
        !          1610:     .asciz "bar"
        !          1611:     .asciz "baz"
        !          1612:     .asciz "print"
        !          1613:     .asciz "foo"
        !          1614:     .asciz "foo"
        !          1615:     .asciz "(print (foo 3))"
        !          1616:     .asciz "test"
        !          1617:     lit_end:
        !          1618:     .data  # this is just for documentation 
        !          1619:     .asciz "@(#)Compiled by Lisp Compiler 7.1 on Sun Feb 21 17:51:54 1982"
        !          1620:     .asciz "@(#)decl.l 1.5     2/10/82"
        !          1621:     .asciz "@(#)array.l        1.1     9/25/81"
        !          1622:     .asciz "@(#)datab.l        1.1     9/25/81"
        !          1623:     .asciz "@(#)expr.l 1.1     9/25/81"
        !          1624:     .asciz "@(#)io.l   1.1     9/25/81"
        !          1625:     .asciz "@(#)funa.l 1.3     2/10/82"
        !          1626:     .asciz "@(#)funb.l 1.2     2/10/82"
        !          1627:     .asciz "@(#)func.l 1.2     2/10/82"
        !          1628:     .asciz "@(#)tlev.l 1.4     10/24/81"
        !          1629:     .asciz "@(#)fixnum.l       1.6     10/21/81"
        !          1630:     .asciz "@(#)util.l 1.2     10/7/81"
        !          1631: .)b
        !          1632: .sh +0 functions\ callable\ from\ compiled\ lisp\ code
        !          1633: .sh +0 Object\ File\ Format
        !          1634: .sh +0 Fasl
        !          1635: 
        !          1636: 
        !          1637: 
        !          1638: 
        !          1639:
unix.superglobalmegacorp.com
This archive runs on limited infrastructure. Preserving old code on modern bandwidth. Automated agents are requested to crawl responsibly.