|
|
1.1 ! root 1: .\" Copyright (c) 1980 Regents of the University of California. ! 2: .\" All rights reserved. The Berkeley software License Agreement ! 3: .\" specifies the terms and conditions for redistribution. ! 4: .\" ! 5: .\" @(#)franz.n 6.1 (Berkeley) 4/29/86 ! 6: .\" ! 7: ." franz.n -[Thu Jun 17 11:01:27 1982 by jkf]- ! 8: ." this file will contain a description of the franz lisp system. ! 9: ." sort of a systems manual. ! 10: .de Fr ! 11: F\s-2RANZ\s0 L\s-2ISP\s0 ! 12: .. ! 13: .tp ! 14: .(l C ! 15: .sz +2 ! 16: \fBThe Franz Lisp System\fP ! 17: .sz -2 ! 18: .sp 2v ! 19: \fIby ! 20: \fIJohn K. Foderaro\fP ! 21: .)l ! 22: .sp 3v ! 23: .tl ''Abstract'' ! 24: .br ! 25: This document describes the ! 26: .Fr ! 27: system written at The University of California ! 28: at Berkeley. ! 29: Included are descriptions of the memory management, interpreter ! 30: and compiler, the conventions used within the C coded kernel ! 31: and those within the compiled code. ! 32: This does not duplicate the information found in The Franz Lisp Manual. ! 33: .++ C '\s-2Draft\s+2'\fBThe\ Franz\ Lisp\ System\fP'%' ! 34: .+c Characteristics\ of\ \F\s-2RANZ\ \s0L\s-2ISP\s0 ! 35: .ls 1 ! 36: .pp ! 37: There is no formal standard for lisp systems, thus each lisp ! 38: system is almost guaranteed to be different from any other lisp system. ! 39: In this section we will examine the design decisions which most often ! 40: characterize a lisp system. ! 41: This ! 42: focus does not imply that all other parts of the language are generally ! 43: agreed upon. ! 44: In fact, there is nothing sacred to lisp system designers. ! 45: For example, one usually identifies the symbol ! 46: .i nil ! 47: with the empty list, ! 48: and one usually thinks of lisp as a dynamically scoped language. ! 49: Both of these conventions are are not true in the lisp dialect ! 50: .i NIL ! 51: currently being written at MIT. ! 52: As another example, one imagines that a lisp system must ! 53: use garbage collection to reclaim storage. ! 54: The lisp dialect ! 55: .i ZetaLisp ! 56: that is running on Lisp Machines doesn't ! 57: normally use a garbage collector ! 58: because of the way in which it allocates its address space. ! 59: It is faster to reboot the machine than to do a garbage collection. ! 60: In most lisp dialects the lisp expressions are written in ! 61: parenthesised form. In ! 62: .i Portable ! 63: .i Standard ! 64: .i Lisp ! 65: (PSL) at the University of Utah, programs are written ! 66: primarily ! 67: in an Algol-like ! 68: syntax. ! 69: .pp ! 70: Thus some of the discussion to follow indicates not so much ! 71: .i unique ! 72: charateristics of ! 73: .Fr ! 74: but instead, how decisions were made. ! 75: .sh 2 Scoping\ and\ Binding ! 76: .pp ! 77: The ! 78: .Fr ! 79: interpreter ! 80: uses ! 81: .i dynamic ! 82: .i scoping , ! 83: that is, after A calls B, B can access all of the ! 84: variables that A allocated ! 85: as well ! 86: as those that A can access, provided B doesn't allocate variables of the same ! 87: names. ! 88: There are two popular ways of implementing dynamic scoping: ! 89: .i deep ! 90: .i binding ! 91: and ! 92: .i shallow ! 93: .i binding . ! 94: Note that we will use the terms variable allocation and lambda ! 95: binding interchangeably in this report. ! 96: The former term is less language-specific than the latter. ! 97: .sh +1 Deep\ Binding ! 98: .pp ! 99: In a deep binding implementation, when a variable is allocated, ! 100: the name of the variable and a space for its value are pushed on the ! 101: top of a stack. ! 102: When a program asks for the value of a variable, ! 103: the lisp system must search down the stack for the first occurrence of ! 104: the variable. ! 105: This system offers great flexibility for the ! 106: programmer since the state of his program ! 107: can be described by the contents of the stack. ! 108: It is thus possible to save the context of a program by just saving ! 109: the stack or in some cases just a pointer to ! 110: a place in the stack. ! 111: The problem with deep binding is that it is time-consuming to search ! 112: down the stack for the value of a variable, and, ! 113: as a result, most systems ! 114: modify the deep binding model by giving variables a global value ! 115: cell and allowing programs to set and access the global cell. ! 116: Some implementations of Interlisp use deep binding. ! 117: .sh +0 Shallow\ Binding ! 118: .pp ! 119: In a shallow binding implementation, each variable has a global value cell ! 120: which contains the current value of the variable. ! 121: When a variable is allocated inside a program, its name ! 122: and old value are pushed on a stack called a ! 123: .i bindstack . ! 124: The variable is then given a new value. ! 125: Throughout the lifetime of the allocation, the current value of the ! 126: variable can be found in the global value cell associated with the variable. ! 127: When leaving the context of the variable's allocation, the old value ! 128: is restored from the ! 129: .i bindstack . ! 130: A shallow binding scheme makes it much cheaper to access ! 131: the values of variables, however it is much more difficult to ! 132: save the state and to restore it. ! 133: .pp ! 134: .Fr ! 135: and most other lisps which are not derived from Interlisp ! 136: use shallow binding. ! 137: Some versions of Interlisp use shallow binding. ! 138: .sh -1 Lisp\ Compiler ! 139: .pp ! 140: Dynamic scoping often is not necessary and it is never cheap, even ! 141: in a shallow binding implementation. ! 142: Thus the ! 143: .Fr ! 144: compiler prefers to do lexical scoping rather than dynamic ! 145: scoping. ! 146: If the user does not specify otherwise, when a ! 147: function is compiled, all variables ! 148: allocated will be ! 149: put on a local stack and will ! 150: .i not ! 151: be accessible by any ! 152: function that this function calls. ! 153: This ! 154: convention results in faster code being generated by ! 155: the compiler, but if the ! 156: user is not careful, it can result in compiled and interpreted code ! 157: working differently. ! 158: In some of the new lisp designs, such as ! 159: .i NIL ! 160: and ! 161: .i Common ! 162: .i Lisp ! 163: the semantics of compiled and interpreted code have been ! 164: unified by ! 165: transferring the compiler semantics (lexical scoping) to the interpreter. ! 166: This is a rather large step ! 167: if dynamic scoping is no ! 168: longer an option, and it is not clear whether the resulting ! 169: language should be called 'lisp'. ! 170: .sh +0 Data\ Types ! 171: .pp ! 172: A complete list of data types in ! 173: .Fr ! 174: can be found in the first chapter of the ! 175: .Fr ! 176: Manual. ! 177: The most important ones are described below. ! 178: .pp ! 179: Lisp is a list processing language and the basic data type is the list ! 180: cell. ! 181: A list cell also goes by the name of cons cell, pair, or dotted pair (dtpr). ! 182: It contains pointers to two lisp objects, and these pointers can ! 183: be accessed via the ! 184: .i car ! 185: and ! 186: .i cdr ! 187: functions. ! 188: Each pointer requires four (8-bit) bytes and thus the list cell is ! 189: eight bytes large. ! 190: The cdr pointer is stored first since this makes it easier ! 191: to 'cdr down a list' using the addressing modes of the VAX. ! 192: .pp ! 193: The symbol cell is used to implement variables. It contains a pointer ! 194: to a string that is the print name, a cell to store a value, a cell to ! 195: store the function definition, a cell to store a pointer to the property ! 196: list and one pointer that the list system uses if it stores ! 197: the symbol in the oblist. ! 198: .sh +0 Memory\ Management ! 199: .pp ! 200: A lisp system must be able to determine the type of an ! 201: object at run time. ! 202: The method used to determine the type influences the ! 203: way storage is managed and garbage collection is done. ! 204: Next, we will look at three possible places ! 205: to store the type information. ! 206: .sh +1 Typed\ data ! 207: .pp ! 208: The type of the data object could be placed in the object itself, much ! 209: as Pascal stores the type of a variant record in the record itself. ! 210: This could result is a large amount of storage being used to store ! 211: type information. ! 212: No lisp system that we know of uses this method exclusively. ! 213: .sh +0 Typed\ pointers ! 214: .pp ! 215: Every reference to a lisp object could contain the type of the object ! 216: referenced. ! 217: This is a good idea in a machine like an IBM 370 where only ! 218: part of machine word is used by the addressing hardware. ! 219: Lisps that use typed pointers generally use a ! 220: .i heap ! 221: memory management scheme and a ! 222: .i copying ! 223: garbage collector. ! 224: In a heap scheme, ! 225: all memory is divided by a pointer (called the ! 226: heap pointer) separating lisp ! 227: data and free space. ! 228: When a new lisp object is needed, it is formed at the ! 229: dividing line and then ! 230: the heap pointer is incremented ! 231: past the new object. ! 232: Garbage collection is done by maintaining two separate spaces for lisp ! 233: data, only one of which is used at any one time. ! 234: When one fills up, all active lisp objects are copied to the ! 235: other space, leaving the first space totally free. ! 236: Subsequent allocations are done from the second space, until it fills up, ! 237: at which point ! 238: all active data in the second space is copied to the first space. ! 239: The advantage of the copying garbage collector is that the data space ! 240: is compacted, which will improve virtual memory behavior. ! 241: The disadvantage is that ! 242: half the data space is always unused. ! 243: .pp ! 244: PSL on a PDP-10 uses this scheme, as does Lisp 370 on ! 245: an IBM 370. ! 246: PSL and NIL on the VAX will also use this scheme. ! 247: Since the VAX hardware ! 248: uses the entire word for address ! 249: calculation, ! 250: PSL and NIL ! 251: must mask off the type bits ! 252: of a pointer before using it to reference memory. ! 253: .sh +0 Typed\ pages ! 254: .pp ! 255: The final alternative is to allocate data objects by pages rather than ! 256: individually. ! 257: Each page contains only one type of object and there is a table that ! 258: keeps track of the type of data object in each page. ! 259: The address of an object tells which page the object is on and ! 260: the page number tells which type the object is. ! 261: Since a whole page of objects is allocated when only one ! 262: object is necessary, ! 263: the rest of the objects in the page are put on a free list. ! 264: This method of allocation is sometimes referred to as ! 265: .i bibop ! 266: for BIg Bag Of Pages. ! 267: The garbage collector for bibop is usually ! 268: a traditional ! 269: .i mark ! 270: and ! 271: .i sweep ! 272: algorithm. ! 273: All active data objects are marked and ! 274: then each page is swept linearly with ! 275: the free cells being put on the free list and the used cells ! 276: left alone. ! 277: The advantage of mark and sweep over a copying garbage collector is that ! 278: the mark phase is cheaper because data objects do not have to be copied. ! 279: Also, all of memory can be used since there is no requirement for two spaces. ! 280: The disadvantage is that a sweep phase is required in a mark and sweep ! 281: but one is not required in a copying garbage collector. ! 282: The sweep phase is expensive because it has to alter most data pages ! 283: while building a free list. ! 284: This operation ! 285: can be expensive in a virtual memory environment. ! 286: Alternatives to the sweep phase have been proposed in ! 287: [Foderaro+Fateman Characterization of Vax Macsyma]. ! 288: .pp ! 289: .Fr ! 290: uses a bibop memory allocator and a mark and sweep garbage collector, ! 291: as does Maclisp (on a PDP-10). ! 292: The reason that ! 293: .Fr ! 294: uses bibop is primarily due to the VAX architecture, ! 295: which makes typed pointers ! 296: expensive to implement. ! 297: Also, typed pointers would make it difficult to pass lisp values ! 298: to programs written in other languages, some of which may not have ! 299: the ability to extract the address of a lisp object from a typed pointer. ! 300: .sh -1 Construction ! 301: .pp ! 302: The ! 303: .Fr ! 304: system is built by adding lisp code to a C-coded kernel. ! 305: The C-coded kernel is a working lisp interpreter, although it has few of ! 306: the functions a lisp system needs. ! 307: The lisp code provides most of the functions, the top level and error ! 308: handlers. ! 309: The lisp compiler is also written in lisp, ! 310: but is usually ! 311: run as a separate process. ! 312: .sh +0 Unique\ features ! 313: .pp ! 314: .Fr ! 315: can dynamically load and execute functions defined in the other languages ! 316: which run on the VAX. ! 317: It uses two different dynamic loaders. ! 318: One is part of the lisp system and is used for lisp only, it is called ! 319: .i fasl ! 320: (for fast loader). ! 321: The other is the Unix loader, ld, which is used for loading in ! 322: C, Fortran or Pascal code. ! 323: Loading code with fasl ! 324: is much faster than with ld ! 325: since fasl benefits from ! 326: the simplicity of a lisp object file. ! 327: .+c Data\ Structures ! 328: .sh 0 _ ! 329: .nr $1 \n(ch ! 330: .pp ! 331: In this chapter we shall examine the data structures which are most ! 332: important to the ! 333: .Fr ! 334: system. ! 335: First, though we will see how the lisp system uses the address space. ! 336: .sh 2 Physical\ layout ! 337: .pp ! 338: As a Unix process, lisp has three address regions: text, data and stack. ! 339: Text begins at location 0 and continues up to 0x3b73c (0x means hex number). ! 340: The data segment begins on the next page boundary and continues ! 341: up to a limit set by the configuration parameters (currently 0x2fd000). ! 342: Lisp does not begin running with such a large data segment, rather it grows ! 343: when necessary. ! 344: The stack segment begins at address 0x7fffffff and grows downward to a ! 345: maximum size of one-half megabyte. ! 346: .pp ! 347: The text segment stores the instructions for the C kernel as well as read-only ! 348: data. ! 349: The read-only data for lisp are ! 350: the symbol nil, the i/o ports, and the small integer ! 351: table. ! 352: The symbol nil is stored at location 0 which makes it very easy to test ! 353: whether a value is nil. ! 354: The problem with storing nil in read-only space is that a special case ! 355: must be made for nil's property list, which is the only thing in the ! 356: nil symbol that the user is permitted to alter. ! 357: .pp ! 358: The fixnums -1024 through 1023 are stored sequentially, with 0 being ! 359: stored at 0x1400. ! 360: .Fr ! 361: doesn't have any 'immediate' lisp data, that is data whose value is ! 362: stored as part of the reference to the data. ! 363: But, by storing some of the fixnums in a known place, we can achieve ! 364: some of the benefits of immediate data: ! 365: A program can use ! 366: .i eq ! 367: as a test for a fixnum in the range -1024 to 1023. ! 368: In the majority of cases, when asked ! 369: to allocate a fixnum, the system can return a pointer into this table ! 370: and bypass the memory allocator. ! 371: .sh +0 Typetable ! 372: .pp ! 373: .Fr ! 374: uses the typed pages (or bibop) ! 375: method of type determination. ! 376: The ! 377: .i typetable ! 378: is an array of bytes ( ! 379: .i chars ! 380: in C lingo). ! 381: This table describes the type of all pages, from page 0 where nil is stored, ! 382: up to the end of data space. ! 383: Thus there are many entries that describe the types of the pages which ! 384: make up the C kernel. ! 385: In order to reduce the wasted space in the typetable, ! 386: we could have made the typetable begin typing pages at the start of data ! 387: space and make a special case of the pages that contain nil and ! 388: the small integer table. ! 389: However, the ! 390: effect of this change would be that type checks ! 391: (which are done in-line in compiled code) would ! 392: be longer and slower. Since type checking happens quite ! 393: frequently, we felt it was better to waste the space in the ! 394: typetable in order to save execution time and instruction space. ! 395: .pp ! 396: Each page on a VAX is 512 bytes, and thus to find the type of an ! 397: object given the address of it, ! 398: it is only necessary to shift the address right 9 bits ! 399: and index the typetable array offset by one. ! 400: The offset by one is required because the value -4, which is an illegal ! 401: address, is used as a sentinel value to indicate an illegal value. ! 402: Thus when we take the type of -4 we will be indexing the table by -1 ! 403: and we want to retrieve the first byte in the table (which contains ! 404: the value UNBOUND). ! 405: The C macro which retrieves the type is this (from file h/global.h): ! 406: .(b I ! 407: ! 408: #define TYPE(a1) ((typetable+1)[(int)(a1) >> 9]) ! 409: ! 410: .)b ! 411: This is code generated by the lisp compiler to check if the type ! 412: code of an argument (stored at 0(r10)) is a symbol (which is type ! 413: code 1): ! 414: .(b I ! 415: ! 416: ashl $-9,0(r10),r0 ! 417: cmpb _typetable+1[r0],$1 ! 418: ! 419: .)b ! 420: .pp ! 421: The type codes which are stored in the typetable are these: ! 422: .ts 2i 4i 6i ! 423: .(b I ! 424: UNBO -1\tSTRNG 0\tATOM 1\tINT 2 ! 425: DTPR 3\tDOUB 4\tBCD 5\tPORT 6 ! 426: ARRAY 7\tOTHER 8\tSDOT 9\tVALUE 10 ! 427: HUNK2 11\tHUNK4 12\tHUNK8 13\tHUNK16 14 ! 428: HUNK32 15\tHUNK64 16\tHUNK128 17 ! 429: .)b ! 430: The names given above are the C kernel internal names. ! 431: ATOM is symbol, INT is fixnum, DTPR is list cell, ! 432: DOUB is flonum, BCD is binary, ! 433: SDOT is bignum and all the HUNKn types are just known as hunk to ! 434: the user. ! 435: .sh +0 Stacks ! 436: .pp ! 437: .Fr ! 438: uses three stacks: the ! 439: .i C ! 440: .i runtime ! 441: stack, the ! 442: .i namestack ! 443: and the ! 444: .i bindstack . ! 445: The C runtime stack is the stack part of the address space ! 446: and the other two stacks are stored in the data space. ! 447: .sh +1 C\ runtime\ stack ! 448: The C-coded kernel uses this ! 449: stack in the same way as a typical C program ! 450: use the stack: ! 451: storing return addresses ! 452: and non-lisp-object arguments to subroutines, ! 453: saving registers, ! 454: and storing local variables within a function. ! 455: The lisp system stores ! 456: .i catch ! 457: .i frames ! 458: on this stack as well (to be described later). ! 459: .pp ! 460: The lisp system assumes that there are no lisp data on the stack and ! 461: thus the use of this stack ! 462: by a program is unrestricted. ! 463: As will be discussed later on, the ! 464: .b calls ! 465: and ! 466: .b ret ! 467: instructions are used for most subroutine calls and returns. ! 468: These instructions expect the stack to look a certain way. ! 469: .sh +0 Namestack ! 470: .pp ! 471: The namestack contains only lisp data. ! 472: It is used to pass arguments to functions and to hold ! 473: local (lexically scoped) data within lisp functions. ! 474: It is also used as a temporary storage spot for lisp data ! 475: which must be protected from garbage collection. ! 476: .pp ! 477: A slight digression on the garbage collector: ! 478: The person who writes code for the lisp system must always be aware of ! 479: his greatest enemy: the garbage collector. ! 480: Whenever a function is called that could possibly allocate more lisp ! 481: data, one must assume that it ! 482: when it attempts to allocate space, the garbage collector ! 483: will be invoked and that it will take away everything that isn't protected ! 484: from garbage collection. ! 485: The objects that are protected from garbage collection are: the ! 486: namestack, the bindstack, the oblist (table of interned symbols), ! 487: and the compiler literals. Objects that are only ! 488: referred to by values in registers or ! 489: the C runtime stack will not be seen by the mark phase of the ! 490: garbage collector and will be swept up during the sweep phase. ! 491: .pp ! 492: Back to the namestack: ! 493: The first free element on the namestack is pointed to by the ! 494: C variable ! 495: .i np . ! 496: This variable is always stored in the VAX register r6. ! 497: Another pointer into the namestack is the C variable ! 498: .i lbot . ! 499: It is always stored in VAX register r7. ! 500: Its use will be described in the section on calling conventions. ! 501: .sh +0 Bindstack ! 502: .pp ! 503: The bindstack is a stack of lisp object pairs: symbol and ! 504: value. ! 505: It is used to implement shallow binding. ! 506: When a symbol is lambda-bound, the symbol and its current value are put ! 507: on this stack. ! 508: Then the symbol can be given a new value. ! 509: When the context of the lambda binding is over, the symbol and value pair ! 510: are popped from the stack and the symbol is given its old value. ! 511: The C variable ! 512: .i bnp ! 513: points to the first free entry on the bindstack. ! 514: In the C code, the following macros lambda-bind a symbol to a value ! 515: and restore the old value: ! 516: .(b ! 517: #define PUSHDOWN(atom,value)\e ! 518: {bnp->atm=(atom);\e ! 519: bnp++->val=(atom)->a.clb;\e ! 520: (atom)->a.clb=value;\e ! 521: if(bnp>bnplim) binderr();} ! 522: ! 523: #define POP\e ! 524: {--bnp; bnp->atm->a.clb=bnp->val;} ! 525: .)b ! 526: .sh -1 Bitmap ! 527: .pp ! 528: The bitmap is used in garbage collection to hold ! 529: the mark bits for the mark and sweep garbage collector. ! 530: As its names implies, it is viewed as a collection of bits. ! 531: The minimum size of a lisp data object is 4 bytes, thus ! 532: there are 128 of those on a VAX page of 512 bytes. ! 533: Since there are 8 bits in a byte, it takes 16 bytes to obtain 128 bits. ! 534: Therefore the size of the bitmap in bytes is 16 ! 535: times the maximum number of ! 536: pages. ! 537: Like the typetable, the bitmap keeps track of marks from the ! 538: bottom of memory, not the bottom of data space. ! 539: The bitmap and the typetable are static structures. ! 540: It is their size, which is determined when the C kernel is built, ! 541: which determines the size to which the data segment can grow. ! 542: .sh +0 Transfer\ Tables ! 543: .pp ! 544: Transfer tables are used by compiled lisp code as a transfer vector to ! 545: other functions. ! 546: A transfer table consists of pairs of entries: symbol and pointer to ! 547: function. ! 548: When a compiled lisp function calls another (non-local) function, it ! 549: calls indirectly through an entry in the transfer table. ! 550: Depending on the setting of certain status variables, the call may ! 551: bring control into a function linkage routine or directly to ! 552: the function desired (if that function is a compiled lisp or C function). ! 553: .pp ! 554: Transfer tables serve a number of useful purposes. ! 555: .np ! 556: They allow compiled code to call interpreted code ! 557: or compiled code using the same calling sequence. ! 558: .np ! 559: They allow the selection of which function to call to be determined ! 560: at runtime based on the name of the function. ! 561: In most other languages, this selection process is done at either ! 562: compile or load time. ! 563: .np ! 564: They allow the user to run in a debugging mode where all calls ! 565: from compiled code go through the interpreter. ! 566: Once control reaches the interpreter, it is easier to debug. ! 567: .np ! 568: They allow the user to run in a production mode, where all calls ! 569: from compiled to other compiled code are done without function ! 570: lookup overhead. ! 571: .np ! 572: They allow the user to switch between debugging and production modes ! 573: at runtime with one function call. ! 574: .pp ! 575: Transfer tables will be described further in the section on ! 576: the lisp compiler. ! 577: .sh +0 Catch\ frames ! 578: .pp ! 579: Lisp has a number of forms of non-local transfers. ! 580: Among them are ! 581: .i throw , ! 582: .i error , ! 583: .i return ! 584: and ! 585: .i go . ! 586: If a program is willing to accept a non-local transfer, it puts a ! 587: .i catch ! 588: .i frame ! 589: on the stack which describes which type of transfer it ! 590: accepts. ! 591: The catch frame describes the current state of the registers, ! 592: the variables np, lbot, and bnp, and also contains entries that describe ! 593: what kinds of non-local transfers the function will accept. ! 594: After creating a catch frame, the program goes on to evaluate forms. ! 595: Should the evaluation of one of those forms result in a non-local ! 596: transfer to the catch frame that this program has set up, the ! 597: system will restore the namestack and bindstack to the way ! 598: they were when the program put the catch frame on the stack (by using ! 599: np and bnp) and will return control to the program (setting ! 600: the variable retval to describe why it returned). ! 601: This is described further in the ! 602: section on interpreter conventions. ! 603: .pp ! 604: The C variable ! 605: .i errp ! 606: points to the most recent catch frame, and then each catch frame points ! 607: to the previous catch frame. ! 608: .sh +0 oblist ! 609: .pp ! 610: Normally when symbols are created they are ! 611: .i interned , ! 612: that is they are put in a hash table called an oblist (or obarray). ! 613: The purpose of the oblist is to insure that if you type a ! 614: symbol's name to the reader, you will always get the same symbol. ! 615: Also it protects the symbol from garbage collection. ! 616: The oblist is simply a hash table with buckets, with a hash link being ! 617: part of the symbol data structure. ! 618: Currently the hash table is not accessible to a lisp program, but with ! 619: a little work it could be. ! 620: .+c Interpreter ! 621: .sh 0 _ ! 622: .nr $1 \n(ch ! 623: .pp ! 624: The interpreter is composed of these parts: ! 625: .ip \fIcore:\fP ! 626: The functions eval, apply and funcall. ! 627: .ip \fIstack\ management:\fP ! 628: The code to create catch frames and handle non-local transfers. ! 629: .ip \fImemory\ management:\fP ! 630: The code to allocate and garbage collect memory. ! 631: .ip \fIthe\ functions:\fP ! 632: The user callable functions that expect lisp arguments and return ! 633: lisp values. ! 634: .pp ! 635: In the above list, the first three are written mainly in C (with a little ! 636: assembler) and the last is written mainly in Lisp with a little ! 637: bit in C ! 638: and a very small amount in assembler. ! 639: .pp ! 640: The core functions are the center of activity in the interpreter. ! 641: The ! 642: .i eval ! 643: function given a lisp expression to evaluate. ! 644: It must determine if it is a simple expression such as a symbol or number ! 645: whose value eval can determine immediately, or if it is an function ! 646: calling expression. ! 647: If the form is a function call, eval must determine if the arguments should ! 648: be evaluated and if so eval must recursively call itself to evaluate the ! 649: arguments and then stack them. ! 650: If the function called is to be interpreted, eval must lambda-bind the ! 651: formal variables of the function to the arguments just stacked. ! 652: If the function being called is compiled, eval just calls the function and ! 653: lets the function do the lambda binding if it wants to. ! 654: .pp ! 655: The ! 656: .i apply ! 657: function is passed two lisp objects: ! 658: a function to call (either a symbol or a functional object) ! 659: and a list of arguments already evaluated. ! 660: It will do lambda binding if the function to call is to ! 661: be interpreted. ! 662: .pp ! 663: The ! 664: .i funcall ! 665: function is passed any number of lisp objects, ! 666: the first of which is the function to call, and the rest are the ! 667: arguments to the function ! 668: which have already been evaluated and stacked. ! 669: Funcall will do lambda binding if the function to call is ! 670: to be interpreted. ! 671: .pp ! 672: When compiled lisp code calls a function ! 673: which must be interpreted, it enters the interpreter through the ! 674: funcall function. ! 675: The interpreter may go to compiled code through either eval, apply or ! 676: funcall, though most often it goes through eval. ! 677: .sh 2 Conventions ! 678: .pp ! 679: These are the conventions used in interpreted and compiled code. ! 680: .sh +1 C\ conventions ! 681: .pp ! 682: Our conventions are extensions of the C calling conventions, so we ! 683: describe here the conventions for C. ! 684: The VAX has 16 general ! 685: purpose registers. ! 686: Registers r12 through r15 are reserved ! 687: for use by the hardware because we use the ! 688: .i calls ! 689: subroutine call. ! 690: Registers r0-r5 can be used by a program without saving them first. ! 691: The result of a function call is returned in r0. ! 692: Registers r11-r6 are allocated (from high to low) ! 693: by the C compiler when a ! 694: .i register ! 695: declaration is made in the C code. ! 696: Registers r11-r6 must be ! 697: saved upon entry and restored upon exit if they are used within a function. ! 698: On the VAX it is very easy to preserve registers ! 699: since it is done automatically ! 700: by the hardware if the ! 701: .i calls ! 702: (or ! 703: .i callg ) ! 704: instruction is used. ! 705: The first short word (two bytes) of a subroutine is a ! 706: register save mask which ! 707: tells which registers should be saved (on the C runtime stack) upon ! 708: entry and restored when a ! 709: .i ret ! 710: instruction is done. ! 711: .sh +0 np\ and\ lbot ! 712: .pp ! 713: Earlier we mentioned that the C variables np and lbot are ! 714: always stored in ! 715: registers r6 and r7. ! 716: It is very difficult to globally assign a variable to a register ! 717: in the C language (no external register declarations ! 718: are permitted) ! 719: and thus we have to filter ! 720: the output of the C compiler and convert all occurrences of 'np' to 'r6' ! 721: and all occurrences of 'lbot' to 'r7'. ! 722: This is only half the job, though. ! 723: We also must modify the register save masks for those routines which ! 724: alter the value of np or lbot to insure that those registers get ! 725: saved and restored upon function entry and exit. ! 726: This is done in the C code by putting ! 727: .(b C ! 728: Savestack(n) ! 729: .)b ! 730: at the beginning of a routine which will alter np or lbot and which ! 731: also allocates n register variables. ! 732: Also in that routine, before the function returns, we put ! 733: .(b C ! 734: Restorestack() ! 735: .)b ! 736: This is not really necessary in the VAX, but it is there for other ! 737: machine implementations which don't have a ! 738: .i ret ! 739: function which restores registers. ! 740: The calls to Savestack are recognized by a filter which processes ! 741: the output ! 742: of the C compiler and fixes up ! 743: the save masks. ! 744: .sh +0 Function\ calling ! 745: .pp ! 746: The following text describes what the conventions are for ! 747: calling ! 748: compiled lisp functions, whether they were written in lisp or C. ! 749: We look at it from the viewpoint of the called function. ! 750: .pp ! 751: Upon entry ! 752: to the compiled functon, ! 753: lbot points to the first argument and np points to the ! 754: first free spot on the namestack. ! 755: If np equals lbot then there are no arguments. ! 756: Recall that np will be in r6 and lbot in r7. ! 757: The function is free to alter registers r0 through r5 and should return ! 758: the result in r0. ! 759: The function may alter registers r6 through r11 as long as their ! 760: original values ! 761: are restored when the function returns. ! 762: The value of np should always ! 763: point to the first free entry in the namestack. ! 764: This is all that is required. ! 765: The extra conventions followed by ! 766: the lisp compiler ! 767: in the code it generates are described in the next chapter. ! 768: .sh +0 Catch\ frames ! 769: .pp ! 770: A catch frame saves the state of execution that a program ! 771: might want to return to at some later time. ! 772: A catch frame is stored on the C runtime stack, with the most recent ! 773: frame pointed to by the C global variable errp. ! 774: The information saved is ! 775: .ip \fIargs\fP ! 776: - one or two optional arguments. The lisp compiler always ! 777: stacks two ! 778: arguments since it must know exactly how large the frame is. ! 779: .ip \fIclass\fP ! 780: - the class describes what type of frame this is (described below). ! 781: .ip \fIretaddr\fP ! 782: - address to return to if returning from this frame ! 783: .ip \fIolderrp\fP ! 784: - pointer to next older catch frame on the stack ! 785: .ip \fIbnp\fP ! 786: - value of bnp (bindstack pointer) when the frame was created ! 787: .ip \fInp\fP ! 788: - value of np when the frame was created ! 789: .ip \fIlbot\fP ! 790: - value of lbot when the frame was created. ! 791: .ip \fIsystem\ dependent\fP ! 792: - the rest of the information stacked depends on the particular machine. ! 793: In the case of the VAX, registers r13 through r8 are stacked. (r14 and ! 794: r15 are the stack pointer and program counter; they are not saved since ! 795: they can be calculated from the other information). ! 796: .pp ! 797: The information in a catch frame is put on the C runtime stack ! 798: in the precise order given above, and the variable errp points not ! 799: at the beginning of the entire frame, but to the lbot entry. ! 800: Thus errp\ +\ 4 points to np. ! 801: The classes of frames are described below. ! 802: Each class is defined as a constant in the C code (h/frame.h) whose value ! 803: is a small integer. ! 804: .ip F_PROG\ [1] ! 805: - takes no arguments. This frame marks the beginning of a piece of ! 806: code which can accept a ! 807: .i return ! 808: or a ! 809: .i go . ! 810: The interpreter uses this to implement ! 811: .i prog ! 812: and ! 813: .i do ! 814: and for its primative break loop. ! 815: The lisp compiler does not use catch frames for progs since it only ! 816: permits ! 817: .i returns ! 818: and ! 819: .i gos ! 820: to occur within ! 821: .i progs ! 822: or ! 823: .i dos ! 824: and thus it can determine how to do the non-local transfer ! 825: at compile time. ! 826: .ip F_CATCH\ [2] ! 827: - this takes one or two arguments and is used to implement both ! 828: .i catch ! 829: and ! 830: .i errset . ! 831: In both cases ! 832: the first argument is the tag which is being caught, ! 833: which in the case of an ! 834: .i errset ! 835: is symbol ER%all. ! 836: An ! 837: .i errset ! 838: also ! 839: supplies a second argument which is non nil if the error message ! 840: should be printed. ! 841: .ip F_RESET\ [3] ! 842: - in the C kernel, the reset function ! 843: is implemented as a non local transfer to a F_RESET catchframe. ! 844: When the lisp system is built, the reset function is redefined to ! 845: do a ! 846: .i throw. ! 847: Thus this type of catch frame is rarely used. ! 848: .ip F_EVAL\ [4] ! 849: - this has one argument, the form being evaluated. ! 850: Since stacking this ! 851: on every eval would be expensive, ! 852: this type of catch frame is only stacked if a \fI(*rset\ t)\fP ! 853: has been done and if the value of the symbol ! 854: .i *rset ! 855: is non nil. ! 856: The form being evaluated is stacked so that ! 857: the necessary information for the ! 858: .i evalframe ! 859: function is available and so that ! 860: .i freturn ! 861: can return from the context of any pending evaluation. ! 862: .ip F_FUNCALL\ [5] ! 863: - this is much like F_EVAL, ! 864: except the one argument it takes is the ! 865: name of the function to call. ! 866: It has the same restrictions on when it is stacked as F_EVAL ! 867: and for the same reasons. ! 868: .pp ! 869: In C, a frame is pushed on the stack with Pushframe, with a call ! 870: of one of these forms: ! 871: .(b ! 872: errp = Pushframe(class); ! 873: errp = Pushframe(class,arg1); ! 874: errp = Pushframe(class,arg1,arg2); ! 875: .)b ! 876: After the call the C variable ! 877: .i retval ! 878: contains the value which describes why this function returned. ! 879: You must remember that it is possible for this one function call to ! 880: return more than once! ! 881: The reasons it can return are these (from h/frame.h): ! 882: .ip C_INITIAL\ [0] ! 883: This is the initial call to set up the frame. ! 884: .ip C_GO\ [1] ! 885: This will only happen for F_PROG frames. ! 886: In this case, ! 887: the C variable lispretval contains a lisp symbol which is the tag ! 888: to which control should be transferred. ! 889: If the tag cannot be found in this prog or do body, the ! 890: tag should be again thrown up the stack to a next higher prog or do. ! 891: .ip C_RET\ [2] ! 892: This will only happen for F_PROG frames. ! 893: In this case, lispretval contains the value to return from this prog. ! 894: .ip C_THROW\ [3] ! 895: This will only happen for F_CATCH frames. ! 896: In this case lispretval contains the value to return from this catch. ! 897: .ip C_RESET\ [4] ! 898: This will only happen for F_RESET frames. ! 899: It simply means that a reset is being done. ! 900: .ip C_FRETURN\ [5] ! 901: This will only happen for F_EVAL and F_FRETURN frames. ! 902: The variable lispretval contains the value to return from this ! 903: call to ! 904: .i eval ! 905: or ! 906: .i funcall . ! 907: .pp ! 908: The call to Pushframe is turned into a simple subroutine call (using ! 909: the ! 910: .i jsb ! 911: instruction instead of the ! 912: .i calls ) ! 913: by the filters which alter the code coming out of the C compiler. ! 914: Thus it may be useful to see what stacking a catch frame really looks ! 915: like. ! 916: Here is what the lisp compiler generates ! 917: to stack the frame for \fI(*catch\ 'tag\ x)\fP ! 918: .(b ! 919: movl 0(r8),r0 #move 'tag to r0 ! 920: pushl $0 # dummy argument ! 921: pushl r0 # put tag argument on C runtime stack ! 922: pushl $2 # push F_CATCH ! 923: jsb _qpushframe # call Pushframe ! 924: movl r0,_errp # move result into errp ! 925: .)b ! 926: .pp ! 927: Every function which does a Pushframe() must also do a Popframe() ! 928: before it returns to its caller. ! 929: This simply removes the frame from the stack. ! 930: In C it is written: ! 931: .br ! 932: .tl ''errp = Popframe()'' ! 933: in the code generated by the lisp compiler it looks like: ! 934: .(b ! 935: movl _errp,r1 # put current errp in r1 ! 936: movl 12(r1),_errp # put previous errp in errp ! 937: addl3 $32,r1,sp # pop frame from stack ! 938: .)b ! 939: .pp ! 940: Non-local transfers are done with the Inonlocalgo ! 941: function. This function always takes three arguments. ! 942: The ! 943: first is the return type (one of the types mentioned ! 944: above which begin with 'C_'). It will be assigned to retval. ! 945: The second argument is the value to be assigned to lispretval, ! 946: except in the case of a C_THROW, where the second argument ! 947: is the tag to throw to and the third argument is the value ! 948: to assign to lispretval. ! 949: This function never returns. ! 950: If it doesn't find a catch frame ! 951: which matches what it is looking for, it signals an error. ! 952: The function is called with ! 953: .i calls ! 954: and the arguments are stacked on the C runtime stack, not the ! 955: namestack. ! 956: .+c Liszt:\ The\ Lisp\ Compiler ! 957: .sh 0 _ ! 958: .nr $1 \n(ch ! 959: .pp ! 960: The purpose of compiling a lisp function is to create a representation of ! 961: the function which can be evaluated in less time and perhaps take ! 962: up less space. ! 963: There are two approaches to lisp compilers. ! 964: One is to convert the functions to a compact form, often called ! 965: .i bytecodes ! 966: which can be rapidly interpreted. ! 967: Each bytecode represents a primitive operation in the lisp system. ! 968: This approach is simple to implement but there is an time penalty ! 969: in using an interpreter. ! 970: The other approach is to compiled to machine code. ! 971: In general, this is not as portable as the bytecode approach but the result ! 972: generally runs faster. ! 973: There are two ways of compiling to machine code. ! 974: One is to place arguments to functions in registers. ! 975: If there are more arguments than registers, the arguments are put on ! 976: a stack and a special type of call is made. ! 977: This method is used in Maclisp. ! 978: The other method is to assume a stack model, in which ! 979: arguments to a function are placed on a stack. ! 980: This is what the ! 981: .Fr ! 982: compiler Liszt does. ! 983: The stack model made it much easier to write parts of the lisp ! 984: system in the C langauge. ! 985: It also simplifies the garbage collector since the mark phase doesn't ! 986: have to locate and peruse all saved registers looking for lisp data. ! 987: .sh 2 File\ Compiler ! 988: .pp ! 989: When a file of lisp expressions is loaded, ! 990: the ! 991: .i load ! 992: function repeatedly reads and evaluates the forms in the ! 993: file. ! 994: Some of these evaluations may result in functions being defined, ! 995: and others may use the functions just defined or previously defined. ! 996: The job of the lisp compiler is to create an object file, which, ! 997: when read in ! 998: by ! 999: .i fasl, ! 1000: acts just like it was ! 1001: .i load ed ! 1002: in, except when a function is defined it is defined in machine ! 1003: code, not as a lisp expression. ! 1004: This is quite a bit different from what compilers do in other languages ! 1005: and it is done this way to make it easier to ! 1006: switch between compiled and ! 1007: interpreted code. ! 1008: .sh +0 Differences\ between\ compiled\ and\ interpreted ! 1009: .pp ! 1010: There are some differences, though, between compiled and interpreted code. ! 1011: Local variables in compiled code are lexically scoped (put on the ! 1012: namestack and inaccessible to other functions) unless the variable ! 1013: has been declared ! 1014: .i special. ! 1015: The declaration: ! 1016: .(b ! 1017: \fI(declare (special x y))\fP ! 1018: .)b ! 1019: declares both x and y to be special from this point in the file on. ! 1020: The declaration ! 1021: .(b ! 1022: \fI(declare (specials t))\fP ! 1023: .)b ! 1024: declares all variables to be special. ! 1025: .pp ! 1026: Lisp has a very powerful macro definition system. ! 1027: The compiler will macro expand all it can, whereas the interpreter ! 1028: expands macros when necessary but never replaces a macro call with ! 1029: its expansion. ! 1030: Thus if you redefine a macro, the compiled code that uses it will ! 1031: still work the same way, but the interpreted code will use the ! 1032: new definition. ! 1033: Also, when compiling a file, macro definitions must be ! 1034: done before any call on the macro is encountered during compiling. ! 1035: In the interpreter, ! 1036: macros can be defined or redefined anytime up until ! 1037: they are used. ! 1038: Thus the interpreter is far freer about macro definitions than ! 1039: the compiler. ! 1040: This could cause programs which work interpreted to ! 1041: fail compiled. ! 1042: .sh +0 Top\ level\ algorithm ! 1043: .pp ! 1044: The top level algorithm of the compiler is simply this: ! 1045: .np ! 1046: read a lisp expression from the input file ! 1047: .np ! 1048: macro expand the top level of the form as much as possible ! 1049: .np ! 1050: if the form is a function definition (a list beginning with ! 1051: the symbol 'def') ! 1052: then compile it. ! 1053: .np ! 1054: if the form is not a function definition then put it on a list of ! 1055: forms that will be evaluated when the file is ! 1056: .i fasl ed ! 1057: in. ! 1058: .np ! 1059: if not at end of file, go to step 1. ! 1060: .pp ! 1061: In reality, step 3 is a bit more complex. ! 1062: If the definition is of a macro, then the form will be evaluated ! 1063: immediately, thus adding the macro definition to the compiler's ! 1064: environment. ! 1065: If the lisp variable ! 1066: .i macros ! 1067: is t then the macro will also be compiled. ! 1068: There are also some forms like \fI(progn\ 'compile\ ...)\fP ! 1069: and \fI(eval-when\ ()\ )\fP which determine when the ! 1070: forms get compiled and evaluated. ! 1071: See the lisp manual for details. ! 1072: .sh +0 Expression\ Compilation ! 1073: .pp ! 1074: Just as the interpreter is centered around the ! 1075: .i eval ! 1076: function, the lisp compiler is centered around the function ! 1077: .i d-exp ! 1078: whose job it is to compile the expression passed to it. ! 1079: The lisp compiler is one pass. ! 1080: Before a call to d-exp returns, ! 1081: all the compiled code for a form has been computed and written ! 1082: to a file. ! 1083: One reason that the lisp compiler can be a single pass compiler ! 1084: is that the assembler which reads the compiler's output is ! 1085: a two pass assembler. ! 1086: .sh +1 global\ state\ variables ! 1087: .pp ! 1088: There are a number of variables that maintain the state of ! 1089: the compilation process. ! 1090: These variables are declared special and are thus dynamically scoped ! 1091: and visible to any function within the compiler. ! 1092: When d-exp ! 1093: is called their meanings are: ! 1094: .ip \fIv-form\fP ! 1095: - contains the expression to be compiled. ! 1096: .ip \fIg-loc\fP ! 1097: - tells where the result of evaluating this expression should be put. ! 1098: If g-loc is nil, then the value returned is unimportant and shouldn't ! 1099: be put anywhere. ! 1100: .ip \fIg-cc\fP ! 1101: - controls conditional branches. ! 1102: If g-cc is non nil, then it is a list cell whose value has ! 1103: either a non-null car or non-null cdr but not both. ! 1104: If the car is non-nil then ! 1105: if the value of the expression held in ! 1106: v-form is non-nil, a branch should be done to the symbol ! 1107: stored in \fI(car\ g-cc)\fP. ! 1108: If the cdr is non-nil then if the value of v-form ! 1109: is nil, a branch should be done to the symbol stored in ! 1110: \fI(cdr\ g-cc)\fP. Since at ! 1111: every conditional ! 1112: branch control should either jump or continue, the car and cdr will ! 1113: never both be specified. ! 1114: .ip \fIg-ret\fP ! 1115: - is non nil if the result of evaluating v-form will be returned as the ! 1116: value of the function we are evaluating. ! 1117: This is used to allow liszt to detect ! 1118: when tail recursion removal is possible. ! 1119: .ip \fIg-locs\fP ! 1120: - maintains current information about the state of the stacks: ! 1121: the bindstack (for specials), the namestack (for locals) and the ! 1122: C runtime stack (for catch frames) ! 1123: The form of g-locs is a stack of tokens stored as a list. ! 1124: The meaning of each token is as follows: ! 1125: .br ! 1126: \fInil\fP - this represents an unnamed object on the namestack. This happens ! 1127: when calling functions, when the arguments are stacked prior ! 1128: to a function call. ! 1129: .br ! 1130: \fI<symbol>\fP - the given symbol is the name of a local variable on the namestack. ! 1131: .br ! 1132: \fI(prog . <n>)\fP - prog statement which binds <n> special variables ! 1133: .br ! 1134: \fI(do . <n>)\fP - head of a do statement which binds <n> special variables ! 1135: .br ! 1136: \fI(catcherrset . 0)\fP - catch or errset frame on stack ! 1137: .br ! 1138: \fI(lambda . <n>)\fP - lambda expression which binds <n> special variables ! 1139: .ip \fIg-labs\fP ! 1140: - this is a stack of labels. ! 1141: There is one entry in g-labs for every entry which is a list ! 1142: in g-locs. ! 1143: the forms of the entries are: ! 1144: .br ! 1145: \fInil\fP - no labels in this form ! 1146: .br ! 1147: \fI(endlab (real . gen) (real2 . gen2) ...)\fP - endlab is the label to go to ! 1148: to get out of this body. After this label will be code ! 1149: to unbind specials and pop off locals. ! 1150: The 'real' labels are those found in the code. the gen ! 1151: labels are those generated and put in the assembler code. ! 1152: .sh +0 Function\ compilation ! 1153: .pp ! 1154: The action d-exp takes when compiling an expression depends on the ! 1155: type of expression. ! 1156: For atoms (symbols and numbers) the compilation is very simple, it ! 1157: is just a matter of putting the value where g-loc specifies and ! 1158: then jumping if specified by g-cc. ! 1159: When the expression is a list, d-exp first macro ! 1160: expands the form and then looks at the first element of ! 1161: the list (if the list has not macro expanded to an atom). ! 1162: If the first element is a symbol then the list is ! 1163: is a function call. ! 1164: If the function is one of the functions which liszt knows how to open compile ! 1165: then liszt will call the particular routine designated to open ! 1166: compile this function. ! 1167: There are two classes of functions within liszt that ! 1168: do open compiling. ! 1169: The first class, the fl-expr class, are distinguished by names which ! 1170: begin with c-. ! 1171: These functions always generate ! 1172: code which returns the result in r0. ! 1173: They are not equipped to obey the contents of g-loc and g-cc. ! 1174: Thus d-exp, after calling one of these functions, must do what ! 1175: is necessary to obey g-loc and g-cc. ! 1176: The other class of open compiling functions, the fl-exprcc class (whose ! 1177: names begin with cc-), ! 1178: handle g-loc and g-cc. ! 1179: As a result these are usually much more complex and generate better code. ! 1180: .sh -1 Register\ Conventions ! 1181: .pp ! 1182: The register conventions used by liszt are an extension of those ! 1183: used by the C code. ! 1184: .ip \fIr0\fP ! 1185: - return value placed here ! 1186: .ip \fIr1,r2,r3,r4\fP ! 1187: - scratch registers. ! 1188: When long strings of ! 1189: .i car's ! 1190: or ! 1191: .i cdr's ! 1192: are done (such as ! 1193: .i caddadr ) ! 1194: these registers are used in a least-recently-used fashion to access down ! 1195: the list. ! 1196: The compiler keeps track of the values in these registers but must ! 1197: assume that they are garbage after a function is called. ! 1198: .ip \fIr5\fP ! 1199: - fixnum accumulator. ! 1200: There a number of functions which work on fixnum's only and these ! 1201: usually put their result in r5. ! 1202: The assembler code function ! 1203: .i qnewint ! 1204: which returns a pointer to a cell containing a fixnum value (after ! 1205: checking if it is in the fixnum table), expects its argument to be ! 1206: in r5. ! 1207: .ip \fIr6\fP ! 1208: np ! 1209: .ip \fIr7\fP ! 1210: lbot. ! 1211: When calling a function, this register is set just before the function ! 1212: call, and after the function call its value is used to reset the value ! 1213: of np in order to pop the arguments off the namestack. ! 1214: .ip \fIr8\fP ! 1215: the literal table pointer. ! 1216: The compiler generates a table of all the literal lisp data ! 1217: which the compiled code might access. ! 1218: Upon function entry a pointer to the base of this table is put in r8. ! 1219: For example, if we compile \fI(setq\ x\ 'foo)\fP then we will ! 1220: need a pointer to the lisp symbol ! 1221: .i foo ! 1222: and if the symbol ! 1223: .i x ! 1224: as been declared special we will also need a pointer to ! 1225: .i x . ! 1226: .ip \fIr10\fP ! 1227: - upon function entry the value of lbot (r7) is put in r10. ! 1228: This allows us to reference the arguments to our function while ! 1229: using lbot to call other function. ! 1230: .sh +0 Addresses ! 1231: There are four types of addresses used internally in Franz: ! 1232: symbolic, intermediate addresses (iadr), extended intermediate (eiadr) ! 1233: and vax assembler format. ! 1234: .sh +1 Symbolic ! 1235: .pp ! 1236: These are the names of lisp objects, such as `a', `foo', 34, ! 1237: or 3.234234. ! 1238: A name is either a local variable, a special variable or a ! 1239: number. A number is either a small fixnum, large fixnum, ! 1240: bignum or floating point number. ! 1241: .sh +0 Intermediate\ address\ (IADR) ! 1242: .pp ! 1243: This type of address is generated from a symbolic address by ! 1244: .i d-loc, ! 1245: .i d-loclit ! 1246: and the routines ! 1247: .i d-simple ! 1248: and ! 1249: .i d-rsimple ! 1250: which ! 1251: call them. The forms are ! 1252: .ip \fINil\fP ! 1253: - the location or value of nil. ! 1254: .ip \fIT\fP ! 1255: - the location or value of t. ! 1256: .ip \fIreg\fP ! 1257: - register r0, which is where the result of function calls ! 1258: are stored. ! 1259: .ip \fIstack\fP ! 1260: - the address pointed to by np with np incremented after ! 1261: the value is stored. (i.e (r6)+) ! 1262: .ip \fIunstack\fP ! 1263: - the address one word below np (np is decremented before ! 1264: accessing. (i.e. -(r6)) ! 1265: .ip \fI<symbol>\fP ! 1266: - this is just <symbol>. This allows strange forms to be ! 1267: represented directly. ! 1268: .ip \fI(stack\ n)\fP ! 1269: - The n'th value on the namestack for this function. ! 1270: The first value 0(r10) is (stack 1). ! 1271: .ip \fI(vstack\ n)\fP ! 1272: - The cdr of the n'th value on the namestack. ! 1273: That is, (stack 1) is *0(r10) ! 1274: .ip \fI(bind\ n)\fP ! 1275: - The value of the n'th value in ! 1276: the literal table. If ! 1277: this refers to a symbol, then this is the value ! 1278: of the symbol. ! 1279: If this refers to a list then this ! 1280: this is the cdr of the list (although this is a rare ! 1281: use). (bind 1) corresponds to *0(r8). ! 1282: .ip \fI(lbind\ n)\fP ! 1283: - The n'th value in the literal table. If ! 1284: this refers to ! 1285: object foo then this is the address of foo ! 1286: in memory. ! 1287: .ip \fI(fixnum\ n)\fP ! 1288: - This is the address of small fixnum n in memory. ! 1289: A small fixnum is in the range -1024 to 1023. ! 1290: .ip \fI(immed\ n)\fP ! 1291: - The is the immediate constant n. ! 1292: .sh +0 extended\ intermediate\ address\ (EIADR) ! 1293: .pp ! 1294: This address is generated from an IADR by e-cvt. It ! 1295: symbolically represents a vax address. ! 1296: .ip \fI<atom>\fP ! 1297: - This corresponds to <atom>. ! 1298: .ip \fI(n\ <atom>)\fP ! 1299: - This corresponds to n(<atom>) ! 1300: (stack n+1) and (lbind n+1) are converted to these forms. ! 1301: .ip \fI(n\ <atom1>\ <atom2>)\fP ! 1302: - This corresponds to n(<atom1>)[<atom2>] ! 1303: There is currently no IADR which generates this. ! 1304: .ip \fI(*\ n\ <atom>)\fP ! 1305: -This corresponds to *n(<atom>) ! 1306: (vstack n+1) and (bind n+1) are converted to these forms. ! 1307: .ip \fI(+\ <atom>)\fP ! 1308: - This corresponds to (<atom>)+. ! 1309: stack is converted to this form. ! 1310: .ip \fI(-\ <atom>)\fP ! 1311: - This corresponds to -(<atom>) ! 1312: unstack is converted to this form. ! 1313: .ip \fI($\ <numb>)\fP ! 1314: - This corresponds to $<atom> ! 1315: (immed <numb>) is converted to this form. ! 1316: .ip \fI(#\ <numb>)\fP ! 1317: - This corresponds to $<loc> where <loc> equals ! 1318: the base of the fixnums (0x1400) plus 4 * <numb> ! 1319: (fixnum <numb>) is converted to this form ! 1320: .ip \fI(*\ #\ <numb>)\fP ! 1321: - This corresponds to $<numb>. It is generated ! 1322: by d-rsimple occasionally when you ask for the ! 1323: cdr of a number (which you do sometimes when you ! 1324: are compiling fixnum operators). ! 1325: .sh +0 Vax\ Unix\ assembler\ addresses ! 1326: .pp ! 1327: The addresses are printed from a EIADR by e-cvtas. The conversions ! 1328: are shown in the above section. ! 1329: .sh -1 Function\ calling\ convention ! 1330: .sh +1 Function\ linkages ! 1331: .pp ! 1332: The function associated with a symbol is stored in the function ! 1333: definition slot of the symbol. If the function slot contains a ! 1334: list then the function is to be interpreted and its 'car' will ! 1335: be lambda, nlambda, lexpr or macro. If the function is compiled then ! 1336: the function definition slot will contain a binary object. ! 1337: A binary object ! 1338: is composed of two independent parts: the discipline and the entry. ! 1339: The discipline is one of: ! 1340: .ip \fIlambda\fP ! 1341: - a function whose arguments should be evaluated. ! 1342: .ip \fInlambda\fP ! 1343: - a function whose arguments should not be evaluated but ! 1344: which should be passed as a list ! 1345: .ip \fImacro\fP ! 1346: - a function which should be passed the unevaluated form ! 1347: being evaluated and whose result should be evaluated. ! 1348: .ip \fI\"subroutine\"\fP ! 1349: - a foreign function subroutine ! 1350: .ip \fI\"integer-function\"\fP ! 1351: - a foreign function returning an integer ! 1352: .ip \fI\"real-function"\fP ! 1353: - a foreign function returning a flonum. ! 1354: .pp ! 1355: A lexpr is not in the list as there is no difference to the caller ! 1356: between compiled lambda's and compiled lexprs. ! 1357: .sh +0 Transfer\ tables ! 1358: Most calls from compiled code to other lisp functions go through ! 1359: transfer tables. A transfer table entry is a pair: symbol, address ! 1360: of routine. ! 1361: When another lisp function is called it uses the ! 1362: .i calls ! 1363: instruction which will ! 1364: indirectly jump through the address in the transfer table. This ! 1365: address may point to the desired function or it may point to the ! 1366: interface routine. If a call ends up in the interface routine, ! 1367: then that routine will trace back through the call stack and eventually ! 1368: reach the tranfer table entry that it was 'called through'. Since the ! 1369: transfer table entry contains a symbol which names the function ! 1370: to be called, the interface routine can determine which routine ! 1371: was to have been called. If that routine is compiled then the ! 1372: interface routine can modify the transfer table so that a call ! 1373: through that table entry will go directly to the desired function. ! 1374: If the routine to be called is interpreted, or if the user has ! 1375: requested that transfer linkages should be disabled, then the ! 1376: interface routine will go through the 'funcall' function ! 1377: in the interpreter to continue the call. ! 1378: .sh +0 calling\ sequence\ in\ the\ compiler: ! 1379: .pp ! 1380: When transfer tables are used ! 1381: .(b ! 1382: \fBforeach\fP arg ! 1383: \fBcompute\fP arg and stack result using (r6)+ ! 1384: for example: movl r0,(r6)+ ! 1385: movab -n(r6),r7 where n = 4*number of args to fcn ! 1386: calls $0,*trantb+m where m is the index of the function ! 1387: in the transfer table. ! 1388: movl r7,r6 restore r6 ! 1389: .)b ! 1390: .pp ! 1391: The compiler supports local functions, which are function accessible ! 1392: only within one file. ! 1393: Because they are not accessible from C code, we can use a very fast ! 1394: call and return sequence when calling them. ! 1395: To call a local function ! 1396: .(b ! 1397: \fBforeach\fP arg ! 1398: \fBcompute\fP and stack using (r6)+ ! 1399: jsb LOCALNAME go directly to the function, it will ! 1400: restore r6 before it returns. ! 1401: .)b ! 1402: .pp ! 1403: The beginning of each function looks as follows. ! 1404: First for a non-lexpr function called in the standard way ! 1405: (topsim is the label jumped to when tail merging, it will be unique ! 1406: for each function; the brackets indicate the optional code which ! 1407: exists if the -p switch is given to liszt); ! 1408: .(b ! 1409: .word 0x5c0 # save r6, r7, r8, r10 ! 1410: [ movab mcounts,r0 # if profiling, mcounts replaced by fasl ! 1411: jsb mcount ] ! 1412: movab linker,r8 # set up r8 ! 1413: movl r7,r10 # set up oldlbot ! 1414: movab n(r10),r6 # n = 4*Number of args expected. ! 1415: topsim: ! 1416: .)b ! 1417: .pp ! 1418: Now for lexprs: ! 1419: .(b ! 1420: .word 0x5c0 ! 1421: [ movab mcounts,r0 # if profiling. [mcounts replaced by fasl] ! 1422: jsb mcount ] ! 1423: movab linker,r8 # set up r8 ! 1424: subl3 $4,r7,-(sp) # address one word below bottom of args ! 1425: topsim: ! 1426: movl r6,r10 # first free addr to arg base ! 1427: subl3 r7,r6,r0 # number of args * 4 into r0 ! 1428: movab 0x1400(r0),(r6)+ # stack boxed number of args ! 1429: movl 0(r10),-(sp) # also store on stack so user can't clobber ! 1430: .)b ! 1431: .pp ! 1432: And finally for local functions: ! 1433: .(b ! 1434: movl r10,-(sp) # save r10 ! 1435: movab -n(r6),r10 # set up arg base using arg top ! 1436: topsim: ! 1437: .)b ! 1438: .sh -1 Assembler\ file\ format ! 1439: .pp ! 1440: The liszt compiler generates a file which is in Unix assembler format. ! 1441: The Unix assembler converts that file into an object file which fasl ! 1442: then reads. ! 1443: Although the object file generated is a standard Unix object ! 1444: file (as defined in /usr/include/a.out.h), it is not of a ! 1445: format that the Unix ld loader can understand. ! 1446: This is because the requirements of a lisp object file are different ! 1447: from an object file of other languages. ! 1448: The ! 1449: run time semantics of lisp and the fact that lisp data must be protected ! 1450: from garbage collection are the principal differences. ! 1451: The unconventional object file created by the unix assembler ! 1452: is a result of the ! 1453: unconventional assembler input file. ! 1454: Next we will look at what must be put in the assembler file and how ! 1455: it is put there. ! 1456: .pp ! 1457: The assembler file must contain ! 1458: .ip \fIinstructions\fP ! 1459: - vax assembler code for the compiled functions. ! 1460: If there aren't any functions compiled, this can be empty. ! 1461: .ip \fIliterals\fP ! 1462: - lisp data which is referred to by compiled code ! 1463: .ip \fItransfer\ table\fP ! 1464: - the names of the functions which correspond to the calls through ! 1465: the transfer table. ! 1466: The instructions simply say 'call indirect through the nth transfer ! 1467: table entry'. ! 1468: .ip \fIfunction\ names\fP ! 1469: - the names of the functions which are being ! 1470: defined. ! 1471: .ip \fIload\ time\ forms\fP ! 1472: - in order to mimic the ! 1473: .i load ! 1474: function, fasl has to be able to evaluate lisp expressions at fasl ! 1475: time, so we must be able to store lisp expressions in the assembler ! 1476: file and indicate when they should be evaluated. ! 1477: .pp ! 1478: Based on the requirements above, the form of the assembler file ! 1479: is as described below. ! 1480: The assembler builds two segments: text and data. ! 1481: We put all of our information in the text segment. ! 1482: The compiler places some annotation strings in the data segment ! 1483: so that the object file can be identified, however the data segment ! 1484: is ignored by fasl. ! 1485: The format is ! 1486: .ip \fIcompiled\ instructions\fP ! 1487: The instructions for each compiled (non-local) function begins with ! 1488: .(b ! 1489: .globl F00003 ! 1490: F00003: ! 1491: .word 0x5c0 ! 1492: .)b ! 1493: The globl declaration and the fact that the symbol name begins ! 1494: with a F will tell fasl that this is the beginning of a lisp ! 1495: function. ! 1496: The symbols beginning with F must be unique within a file ! 1497: but may be duplicated in other files. ! 1498: The lisp name of the function will appear later. ! 1499: Next the instructions for the function are given. ! 1500: Only a small fixed set of external symbols may be referenced. ! 1501: The list is found in the code for nfasl.c and the common ! 1502: ones will be described below (soon). ! 1503: Labels should be given a name beginning with L and should ! 1504: be unique within the file. ! 1505: .ip \fItable\ sizes\fP ! 1506: somewhere in the file there should be a pair of 'set' assembler ! 1507: pseudo ops which describe the sizes of the literal table and ! 1508: transfer table. ! 1509: The form is this ! 1510: .(b ! 1511: .set linker_size 4 ! 1512: .set trans_size 3 ! 1513: .)b ! 1514: where linker_size is the number of entries in the literal table ! 1515: which will be required and trans_size is the number of entries ! 1516: in the transfer table which will be required. ! 1517: Those tables will be built by fasl. ! 1518: .ip \fIbinder\ table\fP ! 1519: - this table describes the order that the functions should be ! 1520: defined and forms evaluated. ! 1521: It is a table of longwords beginning at the symbol bind_org ! 1522: and ending when a -1 entry is seen. ! 1523: The meaning of the entries will be described below. ! 1524: .ip \fIlisp\ expression\ table\fP ! 1525: - this is a table of null terminated strings beginning at the ! 1526: symbol lit_org and ending at the symbol lit_end. ! 1527: Each string is read by the lisp read function (using the ! 1528: raw readtable). ! 1529: The first linker_size expressions are put in the literal table. ! 1530: The next trans_size expressions are the names of the ! 1531: functions to put in the transfer table. ! 1532: The remaining expressions correspond to the entries in the ! 1533: binder table. ! 1534: The binder entries are processed, one by one. ! 1535: Provided that the binder entry is not -1, an expression ! 1536: is read from the lisp expression table. ! 1537: Then if the binder table entry is 0, that expression is the name of ! 1538: a lambda type lisp function. ! 1539: A binary object is created, the discipline is set to lambda and ! 1540: the function address is set to the lexically next function ! 1541: defined in the ! 1542: file. ! 1543: If the binder entry is 1 then this is an nlambda function, and ! 1544: if the entry is 2 then this is a macro function. ! 1545: If the entry is 99 then the expression just read should be evaluated ! 1546: by the lisp function eval. ! 1547: .pp ! 1548: The lisp compiler normally puts the assembler format output file in /tmp ! 1549: and removes it when it is done. ! 1550: The -S switch will tell liszt to write the ! 1551: assembler file in the same place as ! 1552: the source file, and with the same root name but a .s extension. ! 1553: The -C file will tell the lisp compiler to comment the file as it ! 1554: generates it, making it easier to understand what is going on. ! 1555: Assume that the following is file foo.l: ! 1556: .(b ! 1557: (defun foo (x) (bar y) (baz k)) ! 1558: (print (foo 3)) ! 1559: (def test (nlambda (l) (print 'hithere) (foo 3))) ! 1560: .)b ! 1561: we now compile it with -SC ! 1562: .(b ! 1563: .globl F00007 #(fcn lambda foo) ! 1564: F00007: ! 1565: .word 0x5c0 ! 1566: movab linker,r8 ! 1567: movl r7,r10 ! 1568: movab 4(r10),r6 ! 1569: L00008: ! 1570: movl *0(r8),(r6)+ #(calling bar) ! 1571: #(from y to stack) ! 1572: movab -4(r6),r7 ! 1573: calls $0,*trantb+0 ! 1574: movl r7,r6 ! 1575: movl *4(r8),(r6)+ #(calling baz) ! 1576: #(from k to stack) ! 1577: movab -4(r6),r7 ! 1578: calls $0,*trantb+8 ! 1579: movl r7,r6 ! 1580: ret ! 1581: .globl F00009 #(fcn nlambda test) ! 1582: F00009: ! 1583: .word 0x5c0 ! 1584: movab linker,r8 ! 1585: movl r7,r10 ! 1586: movab 4(r10),r6 ! 1587: L00010: ! 1588: movl 8(r8),(r6)+ #(calling print) ! 1589: #(from 'hithere to stack) ! 1590: movab -4(r6),r7 ! 1591: calls $0,*trantb+16 ! 1592: movl r7,r6 ! 1593: movl $5132,(r6)+ #(calling foo) ! 1594: #(from (fixnum 3) to stack) ! 1595: movab -4(r6),r7 ! 1596: calls $0,*trantb+24 ! 1597: movl r7,r6 ! 1598: ret ! 1599: bind_org: ! 1600: .set linker_size, 3 ! 1601: .set trans_size, 4 ! 1602: .long 0 ! 1603: .long 99 ! 1604: .long 1 ! 1605: .long -1 ! 1606: lit_org: ! 1607: .asciz "y" ! 1608: .asciz "k" ! 1609: .asciz "hithere" ! 1610: .asciz "bar" ! 1611: .asciz "baz" ! 1612: .asciz "print" ! 1613: .asciz "foo" ! 1614: .asciz "foo" ! 1615: .asciz "(print (foo 3))" ! 1616: .asciz "test" ! 1617: lit_end: ! 1618: .data # this is just for documentation ! 1619: .asciz "@(#)Compiled by Lisp Compiler 7.1 on Sun Feb 21 17:51:54 1982" ! 1620: .asciz "@(#)decl.l 1.5 2/10/82" ! 1621: .asciz "@(#)array.l 1.1 9/25/81" ! 1622: .asciz "@(#)datab.l 1.1 9/25/81" ! 1623: .asciz "@(#)expr.l 1.1 9/25/81" ! 1624: .asciz "@(#)io.l 1.1 9/25/81" ! 1625: .asciz "@(#)funa.l 1.3 2/10/82" ! 1626: .asciz "@(#)funb.l 1.2 2/10/82" ! 1627: .asciz "@(#)func.l 1.2 2/10/82" ! 1628: .asciz "@(#)tlev.l 1.4 10/24/81" ! 1629: .asciz "@(#)fixnum.l 1.6 10/21/81" ! 1630: .asciz "@(#)util.l 1.2 10/7/81" ! 1631: .)b ! 1632: .sh +0 functions\ callable\ from\ compiled\ lisp\ code ! 1633: .sh +0 Object\ File\ Format ! 1634: .sh +0 Fasl ! 1635: ! 1636: ! 1637: ! 1638: ! 1639:
This archive runs on limited infrastructure. Preserving old code on modern bandwidth. Automated agents are requested to crawl responsibly.