|
|
1.1 root 1: .\" Copyright (c) 1980 Regents of the University of California.
2: .\" All rights reserved. The Berkeley software License Agreement
3: .\" specifies the terms and conditions for redistribution.
4: .\"
5: .\" @(#)ch12.n 6.1 (Berkeley) 4/29/86
6: .\"
7: ." $Header: ch12.n 1.2 83/07/23 12:41:32 layer Exp $
8: .Lc Liszt\ -\ the\ lisp\ compiler 12
9: .sh 2 "General strategy of the compiler" \n(ch 1
10: .pp
11: The purpose of the lisp compiler, Liszt, is to create an object module which
12: when brought into the lisp system using
13: .i fasl
14: will have the same effect as bringing in the corresponding lisp coded source
15: module with
16: .i load
17: with one important exception,
18: functions will be defined as sequences of machine language instructions, instead
19: of lisp S-expressions.
20: Liszt is not a function compiler, it is a
21: .i file
22: compiler.
23: Such a file can contain more than function definitions; it can
24: contain other lisp S-expressions which are evaluated
25: at load time.
26: These other S-expressions will also be stored in the object
27: module produced by Liszt and will be evaluated at fasl time.
28: .pp
29: As is almost universally true of Lisp compilers, the main pass of Liszt
30: is written in Lisp.
31: A subsequent pass is the assembler, for which we use the
32: standard UNIX assembler.
33: .sh 2 "Running the compiler"
34: .pp
35: The compiler is normally run in this manner:
36: .br
37: % \fBliszt foo\fP
38: .br
39: will compile the file foo.l or foo (the preferred way to indicate a lisp
40: source file is to end the file name with `.l').
41: The result of the compilation will be placed in the file foo.o if no
42: fatal errors were detected.
43: All messages which Liszt generates go to the standard output.
44: Normally each function name is printed before it is compiled (the \-q
45: option suppresses this).
46: .sh 2 "Special forms"
47: .pp
48: Liszt makes one pass over the source file.
49: It processes each form in this way:
50: .sh 3 macro\ expansion
51: .pp
52: If the form is a macro invocation (i.e it is a list whose car is a symbol
53: whose function binding is a macro), then that macro invocation is expanded.
54: This is repeated until the top level form is not a macro invocation.
55: When Liszt begins, there are already some macros defined, in fact some
56: functions (such as defun) are actually macros.
57: The user may define his own macros as well.
58: For a macro to be used it must be defined in the Lisp system
59: in which Liszt runs.
60: .sh +0 classification
61: .pp
62: After all macro expansion is done, the form is classified according to its
63: .i car
64: (if the form is not a list, then it is classified as an
65: .i other ).
66: .sh +1 "eval-when"
67: .pp
68: The form of eval-when is
69: \fI(eval-when\ (time1\ time2\ ...)\ form1\ form2\ ...)\fP
70: where the time\fIi\fP are one of
71: .i eval ,
72: .i compile ,
73: or
74: .i load .
75: The compiler examines the form\fIi\fP in sequence and the action taken
76: depends on what is in the time list.
77: If
78: .i compile
79: is in the list then the compiler will invoke
80: .i eval
81: on each form\fIi\fP as it examines it.
82: If
83: .i load
84: is in the list then the compile will recursively call itself to compile
85: each form\fIi\fP as it examines it.
86: Note that if
87: .i compile
88: and
89: .i load
90: are in the time list, then the compiler will both evaluate and compile
91: each form.
92: This is useful if you need a function to be defined in the compiler
93: at both compile time (perhaps to aid macro expansion) and at run time
94: (after the file is
95: .i fasl ed
96: in).
97: .sh +0 "declare"
98: .pp
99: Declare is used to provide information about functions and variables to
100: the compiler.
101: It is (almost) equivalent to \fI(eval-when\ (compile)\ ...)\fP.
102: You may declare functions to be one of three types: lambda (*expr),
103: nlambda (*fexpr), lexpr (*lexpr).
104: The names in parenthesis are the Maclisp names and are accepted by the
105: compiler as well (and not just when the compiler is in Maclisp mode).
106: Functions are assumed to be lambdas until they are declared otherwise
107: or are defined differently.
108: The compiler treats calls to lambdas and lexprs equivalently, so you needn't
109: worry about declaring lexprs either.
110: It is important to declare nlambdas or define them before calling them.
111: Another attribute you can declare for a function is localf which
112: makes the function `local'.
113: A local function's name is
114: known only to the functions defined
115: within the file itself. The
116: advantage of a local function is that is can be entered
117: and exited very quickly and it can have the same name as a function in
118: another file and there will be no name conflict.
119: .pp
120: Variables may be declared special or unspecial.
121: When a special variable is lambda bound (either in a lambda,
122: prog or do expression), its old value is stored away on a stack for the
123: duration of the lambda, prog or do expression.
124: This takes time and is often not necessary.
125: Therefore the default classification for variables is unspecial.
126: Space for unspecial variables is dynamically allocated on a stack.
127: An unspecial variable can only be accessed from within the function
128: where it is created by its presence in a lambda, prog or do
129: expression variable list.
130: It is possible to declare that all variables are special as will be
131: shown below.
132: .pp
133: You may declare any number of things in each declare statement.
134: A sample declaration is
135: .ft I
136: .nf
137: (declare
138: \ \ \ \ \ (lambda func1 func2)
139: \ \ \ \ \ (*fexpr func3)
140: \ \ \ \ \ (*lexpr func4)
141: \ \ \ \ \ (localf func5)
142: \ \ \ \ \ (special var1 var2 var3)
143: \ \ \ \ \ (unspecial var4))
144: .fi
145: .ft R
146: .pp
147: You may also declare all variables to be special with
148: \fI(declare\ (specials\ t))\fP.
149: You may declare that macro definitions should be compiled as well as
150: evaluated at compile time by \fI(declare\ (macros\ t))\fP.
151: In fact, as was mentioned above, declare is much like
152: \fI(eval-when\ (compile)\ ...)\fP.
153: Thus if the compiler sees \fI(declare\ (foo\ bar))\fP
154: and foo is defined, then it will evaluate \fI(foo\ bar)\fP.
155: If foo is not defined then an undefined declare attribute warning will
156: be issued.
157: .sh +0 "(progn 'compile \fRform1 form2 ... formn\fB)\fP"
158: .pp
159: When the compiler sees this it simply compiles form1 through formn as if
160: they too were seen at top level.
161: One use for this is to allow a macro at top-level to
162: expand into more than one function definition for the compiler to compile.
163: .sh +0 "include/includef"
164: .pp
165: .i Include
166: and
167: .i includef
168: cause another file to be read and compiled by
169: the compiler. The result is the same as if the included file were
170: textually inserted into the original file. The only difference
171: between
172: .i include
173: and
174: .i includef
175: is that include doesn't evaluate its
176: argument and includef does. Nested includes are allowed.
177: .sh +0 "def"
178: .pp
179: A def form is used to define a function. The macros
180: .i defun
181: and
182: .i defmacro
183: expand to a def form.
184: If the function being defined is a lambda, nlambda or lexpr then
185: the compiler converts the lisp definition to a sequence of machine
186: language instructions.
187: If the function being defined is a macro, then the compiler will evaluate
188: the definition, thus defining the macro withing the running Lisp compiler.
189: Furthermore, if the variable
190: .i macros
191: is set to a non nil value, then the macro definition will also be translated
192: to machine language and thus will be defined when the object file is
193: fasled in.
194: The variable
195: .i macros
196: is set to t by
197: \fI(declare\ (macros\ t))\fP.
198: .pp
199: When a function or macro definition is compiled, macro expansion is
200: done whenever possible.
201: If the compiler can determine that a form would be evaluated if this
202: function were interpreted then it will macro expand it.
203: It will not macro expand arguments to a nlambda unless the characteristics
204: of the nlambda is known (as is the case with
205: .i cond).
206: The map functions (
207: .i map ,
208: .i mapc ,
209: .i mapcar ,
210: and so on)
211: are expanded to a
212: .i do
213: statement.
214: This allows the first argument to the map function to be a lambda
215: expression which references local variables of the function being
216: defined.
217: .sh +0 "other forms"
218: .pp
219: All other forms are simply stored in the object file and are evaluated
220: when the file is
221: .i fasl ed
222: in.
223: .sh 2 "Using the compiler"
224: .pp
225: The previous section describes exactly what the compiler does with its
226: input.
227: Generally you won't have to worry about all that detail as files which
228: work interpreted will work compiled.
229: Following is a list of steps you should follow to insure that a file
230: will compile correctly.
231: .ip [1]
232: Make sure all macro definitions precede their use in functions or other
233: macro definitions.
234: If you want the macros to be around when you
235: .i fasl
236: in the object file you should include this statement at the beginning
237: of the file: \fI(declare\ (macros\ t))\fP
238: .ip [2]
239: Make sure all nlambdas are defined or declared before they are used.
240: If the compiler comes across a call to a
241: function which has not been defined in the current file,
242: which does not currently have a function binding,
243: and whose type has not been declared then it will assume that the function
244: needs its arguments evaluated
245: (i.e. it is a lambda or lexpr) and will generate code
246: accordingly.
247: This means that you do not have to declare nlambda functions like
248: .i status
249: since they have an nlambda function binding.
250: .ip [3]
251: Locate all variables which are used for communicating values between
252: functions.
253: These variables must be declared special at the beginning of a file.
254: In most cases there won't be many special declarations but if you
255: fail to declare a variable special that should be, the compiled code
256: could fail in mysterious ways.
257: Let's look at a common problem, assume that a file contains just
258: these three lines:
259: .sp 2v
260: .ft I
261: (def aaa (lambda (glob loc) (bbb loc)))
262: .br
263: (def bbb (lambda (myloc) (add glob myloc)))
264: .br
265: (def ccc (lambda (glob loc) (bbb loc)))
266: .sp 2v
267: .ft R
268: We can see that if we load in these two definitions then (aaa 3 4) is
269: the same as (add 3 4) and will give us 7.
270: Suppose we compile the file containing these definitions.
271: When Liszt compiles aaa, it will assume that both glob and loc are local
272: variables and will allocate space on the temporary stack for their values
273: when aaa is called.
274: Thus the values of the local variables glob and loc
275: will not affect the values of the symbols glob and loc in the Lisp system.
276: Now Liszt moves on to function bbb.
277: Myloc is assumed to be local.
278: When it sees the add statement, it find a reference to a variable called
279: glob.
280: This variable is not a local variable to this function and therefore
281: glob must refer to the value of the symbol glob.
282: Liszt will automatically declare glob to be special and it will print
283: a warning to that effect.
284: Thus subsequent uses of glob will always refer to the symbol glob.
285: Next Liszt compiles ccc and treats glob as a special and loc
286: as a local.
287: When the object file is
288: .i fasl 'ed
289: in, and (ccc 3 4) is evaluated,
290: the symbol glob will be lambda bound to 3
291: bbb will be called and will return 7.
292: However (aaa 3 4) will fail since when bbb is called, glob will be unbound.
293: What should be done here is to put
294: \fI(declare\ (special\ glob)\fP
295: at the beginning of the file.
296: .ip [4]
297: Make sure that all calls to
298: .i arg
299: are within the lexpr whose arguments they reference.
300: If \fIfoo\fP is a compiled lexpr and it calls \fIbar\fP then \fIbar\fP cannot
301: use \fIarg\fP to get at \fIfoo\fP's arguments.
302: If both
303: .i foo
304: and
305: .i bar
306: are interpreted this will work however.
307: The macro
308: .i listify
309: can be used to put all of some of a lexprs arguments in a list which
310: then can be passed to other functions.
311: .sh 2 "Compiler options"
312: .pp
313: The compiler recognizes a number of options which are described below.
314: The options are typed anywhere on the command line preceded by a minus sign.
315: The entire command line is scanned and all options recorded before any action
316: is taken. Thus
317: .br
318: % liszt -mx foo
319: .br
320: % liszt -m -x foo
321: .br
322: % liszt foo -mx
323: .br
324: are all equivalent.
325: Before scanning the command line for options, liszt looks for in the
326: environment for the variable LISZT, and if found scans its value
327: as if it was a string of options.
328: The meaning of the options are:
329: .ip \fBC\fP
330: The assembler language output of the compiler is commented.
331: This is useful when debugging the compiler and is not normally done since
332: it slows down compilation.
333: .ip \fBI\fP
334: The next command line argument is taken as a filename, and loaded prior
335: to compilation.
336: .ip \fBe\fP
337: Evaluate the next argument on the command line before starting compilation.
338: For example
339: .br
340: % liszt -e '(setq foobar "foo string")' foo
341: .br
342: will evaluate the above s-expression. Note that the shell requires
343: that the arguments be surrounded by single quotes.
344: .ip \fBi\fP
345: Compile this program in interlisp compatibility mode.
346: This is not implemented yet.
347: .ip \fBm\fP
348: Compile this program in Maclisp mode.
349: The reader syntax will be changed to the Maclisp syntax and a file of
350: macro definitions will be loaded in (usually named /usr/lib/lisp/machacks).
351: This switch brings us sufficiently close to Maclisp to allow us to compile
352: Macsyma, a large Maclisp program.
353: However Maclisp is a moving target and we can't guarantee that this switch
354: will allow you to compile any given program.
355: .ip \fBo\fP
356: Select a different object or assembler language file name.
357: For example
358: .br
359: % liszt foo -o xxx.o
360: .br
361: will compile foo and into xxx.o instead of the default foo.o, and
362: .br
363: % liszt bar -S -o xxx.s
364: .br
365: will compile to assembler language into xxx.s instead of bar.s.
366: .ip \fBp\fP
367: place profiling code at the beginning of each non-local function.
368: If the lisp system is also created with profiling in it, this allows
369: function calling frequency to be determined (see \fIprof(1)\fP)
370: .ip \fBq\fP
371: Run in quiet mode.
372: The names of functions being compiled and various
373: "Note"'s are not printed.
374: .ip \fBQ\fP
375: print compilation statistics and warn of strange constructs.
376: This is the inverse of the \fBq\fP switch and is the default.
377: .ip \fBr\fP
378: place bootstrap code at the beginning of the object file, which when
379: the object file is executed will cause a lisp system to be invoked
380: and the object file \fIfasl\fPed in.
381: This is known as `autorun' and is described below.
382: .ip \fBS\fP
383: Create an assembler language file only.
384: .br
385: % liszt -S foo
386: .br
387: will create the file assembler language file foo.s and will not attempt
388: to assemble it.
389: If this option is not specified, the assembler language file will be put
390: in the temporary disk area under a automatically generated name based on
391: the lisp compiler's process id.
392: Then if there are no compilation errors, the assembler will be invoked to
393: assemble the file.
394: .ip \fBT\fP
395: Print the assembler language output on the standard output file.
396: This is useful when debugging the compiler.
397: .ip \fBu\fP
398: Run in UCI-Lisp mode.
399: The character syntax is changed to that of UCI-Lisp and a UCI-Lisp compatibility
400: package of macros is read in.
401: .ip \fBw\fP
402: Suppress warning messages.
403: .ip \fBx\fP
404: Create an cross reference file.
405: .br
406: % liszt -x foo
407: .br
408: not only compiles foo into foo.o but also generates the file foo.x\ .
409: The file foo.x is lisp readable and lists for each function all functions
410: which that function could call.
411: The program lxref reads one or more of these ".x" files and produces a
412: human readable cross reference listing.
413: .sh 2 autorun
414: .pp
415: The object file
416: which liszt writes does not contain all the functions necessary
417: to run the lisp program which was compiled.
418: In order to use the object file, a lisp system must be started and
419: the object file
420: .i fasl ed
421: in.
422: When the -r switch is given to liszt, the object file created will
423: contain a small piece of bootstrap code at the beginning, and the
424: object file will be made executable.
425: Now, when the name of the object file is given to the UNIX command
426: interpreter (shell) to run, the bootstrap code at the beginning
427: of the object file will cause a lisp system to be started and
428: the first action the lisp system will take is to
429: .i fasl
430: in the object file which started it.
431: In effect the object file has created an environment in which it can run.
432: .pp
433: Autorun is an alternative to
434: .i dumplisp .
435: The advantage of autorun is that the object file which starts the whole
436: process is typically small, whereas the minimum
437: .i dumplisp ed
438: file is very large (one half megabyte).
439: The disadvantage of autorun is that the file must be
440: .i fasl ed
441: into a lisp each time it is used whereas the file which
442: .i dumplisp
443: creates can be run as is.
444: liszt itself is a
445: .i dumplisp ed
446: file since it is used so often and is large enough that
447: too much time would be wasted
448: .i fasl ing
449: it in each time it was used.
450: The lisp cross reference program, lxref, uses
451: .i autorun
452: since it is a small and rarely used program.
453: .pp
454: In order to have the program
455: .i fasl ed
456: in begin execution
457: (rather than starting a lisp top level),
458: the value of the symbol user-top-level should be set to the name of the
459: function to get control. An example of this is shown next.
460: .Eb
461: \fIwe want to replace the unix date program with one written in lisp.\fP
462:
463: % \fBcat lispdate.l\fP
464: (de\kBfun mydate nil
465: \h'|\nBu'\kA(patom "The date is ")
466: \h'|\nAu'\kB(patom (status ctime))
467: \h'|\nBu'\kA(terpr)
468: \h'|\nAu'(exit 0))
469: (se\kAtq user-top-level 'mydate)
470:
471: % \fBliszt -r lispdate\fP
472: Compilation begins with Lisp Compiler 5.2
473: source: lispdate.l, result: lispdate.o
474: mydate
475: %Note: lispdate.l: Compilation complete
476: %Note: lispdate.l: Time: Real: 0:3, CPU: 0:0.28, GC: 0:0.00 for 0 gcs
477: %Note: lispdate.l: Assembly begins
478: %Note: lispdate.l: Assembly completed successfully
479: 3.0u 2.0s 0:17 29%
480:
481: \fI We change the name to remove the ".o", (this isn't necessary) \fP
482: % \fBmv lispdate.o lispdate\fP
483:
484: \fI Now we test it out \fP
485: % \fBlispdate\fP
486: The date is Sat Aug 1 16:58:33 1981
487: %
488: .Ee
489: .sh 2 "pure literals"
490: .pp
491: Normally the quoted lisp objects (literals) which appear in functions are
492: treated as constants.
493: Consider this function:
494: .br
495: .ft I
496:
497: (de\kCf foo
498: \h'|\nCu'(lambda nil (cond \kA(\kB(not (eq 'a (car (setq x '(a b)))))
499: \h'|\nBu'(print 'impossible!!))
500: \h'|\nAu'(t (rplaca x 'd)))))
501:
502: .ft P
503: .br
504: At first glance it seems that the first cond clause will never be
505: true, since the \fIcar\fP of \fI(a\ b)\fP should always be
506: .i a .
507: However if you run this function twice, it will print 'impossible!!' the
508: second time.
509: This is because the following clause modifies the 'constant' list \fI(a\ b)\fP
510: with the \fIrplaca\fP function.
511: Such modification of literal lisp objects can cause programs to behave
512: strangely as the above example shows, but more importantly it can cause
513: garbage collection problems if done to compiled code.
514: When a file is \fIfasl\fPed in, if the
515: symbol $purcopylits is non nil, the literal lisp data is put
516: in 'pure' space, that is it put in space which needn't be looked at
517: by the garabage collector. This reduces the work the garbage collector
518: must do but it is dangerous since if the literals are modified to point
519: to non pure objects, the marker may not mark the non pure objects.
520: If the symbol $purcopylits is nil then the literal lisp data is put in
521: impure space and the compiled code will act like the interpreted
522: code when literal data is modified.
523: The default value for $purcopylits is t.
524: .sh 2 "transfer tables"
525: .pp
526: A transfer table is setup by
527: .i fasl
528: when the object file is loaded in.
529: There is one entry in the transfer table for each function which is
530: called in that object file.
531: The entry for a call to the function
532: .i foo
533: has two parts whose contents are:
534: .ip [1]
535: function address \-
536: This will initially point to the internal function
537: .i qlinker .
538: It may some time in the future point to the function
539: .i foo
540: if certain conditions are satisfied (more on this below).
541: .ip [2]
542: function name \-
543: This is a pointer to the symbol
544: .i foo .
545: This will be used by
546: .i qlinker.
547: .sp 2v
548: .lp
549: When a call is made to the function
550: .i foo
551: the call will actually be made to the address in the
552: transfer table entry and will end up in the
553: .i qlinker
554: function.
555: .i Qlinker
556: will determine that
557: .i foo
558: was the function being called by locating the function name
559: entry in the transfer table\*[\(dg\*].
560: .(f
561: \*[\(dg\*]\fIQlinker\fP does this by tracing back the call stack until it
562: finds the \fIcalls\fP machine instruction which called it. The address
563: field of the \fIcalls\fP contains the address of the transfer table entry.
564: .)f
565: If the function being called is not compiled then
566: .i qlinker
567: just calls
568: .i funcall
569: to perform the function call.
570: If
571: .i foo
572: is compiled and if \fI(status\ translink)\fP is non nil, then
573: .i qlinker
574: will modify the function address part of the transfer table to point directly
575: to the function
576: .i foo .
577: Finally
578: .i qlinker
579: will call
580: .i foo
581: directly .
582: The next time a call is made to
583: .i foo
584: the call will go directly to
585: .i foo
586: and not through
587: .i qlinker .
588: This will result in a substantial speedup in compiled code to compiled code
589: transfers.
590: A disadvantage is that no debugging information is left on the stack,
591: so
592: .i showstack
593: and
594: .i baktrace
595: are useless.
596: Another disadvantage is that if you redefine a compiled function either
597: through loading in a new version or interactively defining it, then
598: the old version may still be called from compiled code if the fast linking
599: described above has already been done.
600: The solution to these problems is to use \fI(sstatus\ translink\ value)\fP.
601: If value is
602: .ip \fInil\fP
603: All transfer tables will be cleared, i.e. all function
604: addresses will be set to point to
605: .i qlinker .
606: This means that the next time a function is called
607: .i qlinker
608: will be called and will look at the current definition.
609: Also, no fast links will be set up since \fI(status\ translink)\fP
610: will be nil.
611: The end result is that
612: .i showstack
613: and
614: .i baktrace
615: will work and the function definition at the time of call will always be used.
616: .ip \fIon\fP
617: This causes the lisp system to go through all transfer tables and set up
618: fast links wherever possible.
619: This is normally used after you have
620: .i fasl ed
621: in all of your files.
622: Furthermore since \fI(status\ translink)\fP is not nil,
623: .i qlinker
624: will make new fast links if the situation arises (which isn't likely unless
625: you
626: .i fasl
627: in another file).
628: .ip \fIt\fP
629: This or any other value not previously mentioned will just make
630: \fI(status\ translink)\fP be non nil, and as a result fast links will
631: be made by
632: .i qlinker
633: if the called function is compiled.
634: .sh +0 "Fixnum functions"
635: .pp
636: The compiler will generate inline arithmetic code for fixnum only functions.
637: Such functions include \(pl, \(mi, *, /, \\, 1\(pl and 1\-.
638: The code generated will be much faster than using \fIadd\fP, \fIdifference\fP,
639: etc.
640: However it will only work if the arguments to and results of the functions
641: are fixnums.
642: No type checking is done.
This archive runs on limited infrastructure. Preserving old code on modern bandwidth. Automated agents are requested to crawl responsibly.