|
|
1.1 root 1: \documentstyle{article}
2:
3: \begin{titlepage}
4: \normalsize
5: \vspace*{.7in}
6: \begin{center}
7: \begin{tabular}{c}
8: \bf\Large A Retargetable Compiler for ANSI C \\[.5in]
9:
10: Christopher W. Fraser \\
11: \em AT\&T Bell Laboratories, 600 Mountain Avenue, \\
12: \em Murray Hill, NJ 07974 \\[1ex]
13: and \\[1ex]
14: David R. Hanson \\
15: \em Department of Computer Science, Princeton University, \\
16: \em Princeton, NJ 08544 \\[.7in]
17:
18: Research Report CS-TR-303-91 \\[1ex]
19: February 1991 \\[.5in]
20:
21: \end{tabular}
22: \end{center}
23:
24: \begin{abstract}
25: \normalsize
26: \verb|lcc| is a new retargetable compiler for ANSI C.
27: Versions for the VAX, Motorola 68020, SPARC, and MIPS
28: are in production use at Princeton University
29: and at AT\&T Bell Laboratories.
30: With a few exceptions,
31: little about \verb|lcc| is unusual --- it integrates
32: several well engineered, existing techniques ---
33: but it is smaller and faster than most other C compilers,
34: and it generates code of comparable quality.
35: \verb|lcc|'s target-independent front end performs a
36: few simple, but effective, optimizations that contribute
37: to good code; examples include simulating register declarations
38: and partitioning switch statement cases into dense tables.
39: It also implements target-independent function tracing
40: and expression-level profiling.
41: \end{abstract}
42:
43: \end{titlepage}
44:
45: \begin{document}
46: %\bibliographystyle{abbrv}
47:
48: \setcounter{secnumdepth}{0}
49:
50: \section{Introduction}
51:
52: \verb|lcc| is a new retargetable compiler for ANSI C~\cite{ansi:Cstandard}.
53: It has been ported to the VAX, Motorola 68020, SPARC, and MIPS R3000,
54: and it is in production use at Princeton University
55: and at AT\&T Bell Laboratories.
56: When used with a compliant preprocessor and library,
57: \verb|lcc| passes the conformance section of
58: Version 2.00 of the Plum-Hall Validation Suite for ANSI~C.%
59: \footnote{The {\tt lcc} front end and a sample code generator are available
60: for anonymous {\tt ftp} from {\tt princeton.edu}.
61: The file {\tt README} in the directory {\tt pub/lcc} gives details.
62: It also describes the current availability of {\tt lcc}'s
63: production code generators.}
64:
65: Other reports describe \verb|lcc|'s storage manager~\cite{hanson90},
66: intermediate language~\cite{fraser:hanson:91a}, code
67: generator~\cite{fraser:sigplan89}, and register
68: manager~\cite{fraser:hanson:92}. This report surveys the
69: remaining features of \verb|lcc| that may interest some readers. Chief
70: among these are its performance, some aspects of its design that
71: support this performance, and some features for debugging and profiling
72: user code.
73:
74: \section{Design}
75:
76: With a few exceptions, \verb|lcc| uses
77: well established compiler techniques.
78: The front end performs lexical, syntactic, and semantic analysis, and
79: some machine-independent optimizations, which are described below.
80: Both the lexical analyzer and the recursive-descent parser are hand-written.
81: Theoretically, this approach complicates both future changes and fixing errors,
82: but accommodating change is less important for a standardized language like ANSI~C,
83: and there have been few lexical or syntactic errors.
84: Indeed, less than 15~percent of \verb|lcc|'s code concerns parsing,
85: and the error rate in that code is negligible.
86: Despite its theoretical prominence,
87: parsing is a relatively minor component in \verb|lcc| as in other compilers;
88: semantic analysis, optimization, and code generation are the major components
89: and account for most of the code and most of the errors.
90:
91: The target-independent front end and a target-dependent back end are
92: packaged as single program, tightly coupled by a
93: compact, efficient interface.
94: The interface consists of a few shared data structures,
95: 17 functions, and a 36-operator dag language.
96: The functions emit function prologues,
97: define globals, emit data, etc., and most are simple.
98: The dag language encodes the executable code from a source program; it
99: corresponds to the ``intermediate language'' used in other compilers, but
100: it is smaller than typical intermediate languages.
101: Reference~\cite{fraser:hanson:91a} describes the interface.
102:
103: Code generators are generated automatically from compact, rule-based
104: specifications~\cite{fraser:sigplan89}. Some rules rewrite
105: intermediate code as naive assembly code. Others peephole-optimize the
106: result. They are compiled into a monolithic hard-coded program that
107: accepts dags annotated with intermediate code, and generates,
108: optimizes, and emits code for the target machine. Hard-coding
109: contributes significantly to \verb|lcc|'s speed.
110:
111: Table~\ref{linecounts} shows the number of lines
112: in each of \verb|lcc|'s components. The notation $h+c$ indicates
113: $h$ lines of definitions in ``header'' files and $c$ lines of C code.
114: The ``back-end support'' is back-end code that
115: shared by four back ends, e.g., initialization and
116: most of the register manager.
117:
118: \begin{table}
119: \begin{center} % for v1.3
120: \newlength{\W}\settowidth{\W}{symbol table emitters\ }
121: \begin{tabular}{l|rcc}
122: & & & \it generated \\
123: \hfil\it component & \omit\hfil\it code\hfil& \it rules & \it code generator \\ \hline
124: front end & 968+7847 \\[2ex]
125: back-end support & 114+741 \\
126: VAX & 35+170 & 178 & 5782 \\
127: MIPS & 40+378 & 104 & 2966 \\
128: 68020 & 42+190 & 145 & 8301 \\
129: SPARC & 40+290 & 128 & 3888 \\[2ex]
130: \parbox{\W}{VAX+68020+SPARC\\[-2pt]
131: symbol table emitters} & 584 \\[2ex]
132: \parbox{\W}{naive VAX\\[-2pt]
133: code generator} & 67+578 \\[2ex]
134: rule compiler & 1285 \\
135: \end{tabular}
136: \end{center}
137: \caption{Number of Lines in {\tt lcc} Components.\label{linecounts}}
138: \end{table}
139:
140:
141: Target-specific files include a
142: configuration header file, which defines parameters like
143: the widths and alignments of the basic datatypes,
144: target-specific interface functions, e.g., those that emit function prologues,
145: and code generation rules, from which the code generators are generated
146: by the rule compiler, which is written in Icon~\cite{griswold:griswold:90}.
147: Retargeting \verb|lcc| requires involves writing these
148: three back-end components, which vary from 377 to 522 lines in existing back ends.
149: In practice, new back ends are implemented
150: by writing new rules and editing copies
151: of an existing configuration and set of interface functions.
152:
153: All of \verb|lcc|'s production back ends use the technology
154: summarized above and detailed in Reference~\cite{fraser:sigplan89}.
155: The interface between the front and back end does not depend
156: on this technology; other back ends that conform to the interface
157: specification can be used. For example, Reference~\cite{fraser:hanson:91a}
158: details a hand-written code generator that emits naive VAX code.
159:
160: While \verb|lcc| uses well established techniques, it uses some of their
161: more recent incarnations, each of which contributes to \verb|lcc|'s efficiency
162: as described below.
163:
164: \subsection{Lexical Analysis}
165:
166: The design of the input module and of
167: the lexical analyzer and judicious code tuning
168: of the lexical analyzer contribute to
169: \verb|lcc|'s speed.
170:
171: Input and lexical analysis use
172: variations of the design described in Reference~\cite{waite86}.
173: Since the lexical analyzer is the only module that
174: inspects every input character, the design
175: avoids extraneous per-character processing and
176: minimizes character movement by scanning tokens directly out
177: of a large input buffer.
178:
179: Input is read directly from the operating system
180: into a 4096-character buffer as depicted
181: in Figure~\ref{fig:buffering}a, and \verb|cp| and \verb|limit|
182: delimit the unscanned portion of the buffer.
183: The next token is scanned by advancing \verb|cp| across
184: white space and switching on the first character of
185: the token, \verb|*cp|. \verb|cp| is advanced as the token is recognized.
186:
187: \begin{figure}
188: \begin{center}
189: \setlength{\unitlength}{10pt}
190: \begin{picture}(26.3,21)
191: \put(0,2){
192: \begin{picture}(26.3,7)
193: \thicklines
194: \put( 0, 3){\framebox(26.3, 2)[r]{\tt \char`\\n}}
195: \thinlines
196: \put( 5, 3){\line(0,1){2}}
197: \put(25, 3){\line(0,1){2}}
198: \put( 3, 1){\vector(0,1){2}}\put(3, 0){\makebox(0,1){\tt cp}}
199: \put(25.7, 1){\vector(0,1){2}}\put(25.7, 0){\makebox(0,1){\tt limit}}
200: \put(22, 1){\vector(0,1){2}}\put(22, 0){\makebox(0,1){\tt fence}}
201: \put(15, 6){\makebox(0,0){4096 characters}}
202: \put( 5, 5.5){\line(0,1){1}}
203: \put(25, 5.5){\line(0,1){1}}
204: \put(11, 6){\vector(-1,0){6}}
205: \put(19, 6){\vector( 1,0){6}}
206: \end{picture}
207: }
208: \put(13.15,0){\makebox(0,1){(b) After a \tt read}}
209:
210: \put(0,14){
211: \begin{picture}(26.3,7)
212: \thicklines
213: \put( 0, 3){\framebox(26.3, 2)[r]{\tt \char`\\n}}
214: \thinlines
215: \put( 5, 3){\line(0,1){2}}
216: \put(25, 3){\line(0,1){2}}
217: \put(10, 1){\vector(0,1){2}}\put(10, 0){\makebox(0,1){\tt cp}}
218: \put(25.7, 1){\vector(0,1){2}}\put(25.7, 0){\makebox(0,1){\tt limit}}
219: \put(22, 1){\vector(0,1){2}}\put(22, 0){\makebox(0,1){\tt fence}}
220: \put(15, 6){\makebox(0,0){4096 characters}}
221: \put( 5, 5.5){\line(0,1){1}}
222: \put(25, 5.5){\line(0,1){1}}
223: \put(11, 6){\vector(-1,0){6}}
224: \put(19, 6){\vector( 1,0){6}}
225: \end{picture}
226: }
227: \put(13.15,12){\makebox(0,1){(a) While ${\tt cp} < {\tt fence}$}}
228:
229: \end{picture}
230: \end{center}
231: \caption{Input Buffering.\label{fig:buffering}}
232: \end{figure}
233:
234: Newlines, denoted by \verb|\n|, cannot occur within C tokens,
235: which explains the newline at \verb|*limit| shown in Figure~\ref{fig:buffering}.
236: This newline terminates a scan for any token
237: so a separate, per-character test for the end of the buffer is unnecessary.
238: When a newline is encountered, an input module function is called
239: to refill the input buffer,
240: if necessary, and to increment the line number.
241:
242: ANSI~C stipulates a maximum line length of no less than 509,
243: but few compilers insist on a specific limit.
244: Tokens, however, can be limited to 32 characters;
245: string literals are an exception,
246: but they are handled as a special case.
247:
248: In general, an input buffer ends with a partial token.
249: To insure that an entire token lies between
250: \verb|cp| and \verb|limit|, the end of the buffer is moved to the memory
251: locations {\em preceding} the buffer whenever
252: \verb|cp| passes \verb|fence|. Doing so concatenates
253: a partial token with its tail after the next read
254: as shown in Figure~\ref{fig:buffering}b.
255: Testing if \verb|cp| has passed \verb|fence| is done
256: for each token after \verb|cp| is advanced across white space.
257:
258: The important consequence of this design is that most
259: of the input characters are accessed by \verb|*cp|
260: and many are never moved. Only identifiers (excluding keywords) and string
261: literals that appear in executable code
262: are copied out of the buffer into permanent storage.
263:
264: Reference~\cite{waite86}'s algorithm moves partial {\em lines} instead
265: of partial tokens and does so after scanning the {\em first} newline in
266: the buffer. But this operation overwrites storage before the buffer
267: when a partial line is longer than a fixed maximum. The algorithm
268: above avoids this problem, but at the per-token cost of comparing
269: \verb|cp| with \verb|fence|.
270:
271: Instead of actually using \verb|cp| as suggested above,
272: \verb|cp| is copied to the register variable \verb|rcp|
273: upon entry to the lexical analyzer, and \verb|rcp| is used in
274: token recognition. \verb|rcp| is assigned to \verb|cp|
275: before the lexical analyzer returns.
276: Using \verb|rcp| improves performance
277: and makes scanning loops compact and fast, e.g., white space is elided by
278: \begin{verbatim}
279: while (map[*rcp]&BLANK)
280: rcp++;
281: \end{verbatim}
282: \verb|map[c]| is a mask that classifies character \verb|c| as suggested
283: in Reference~\cite{waite86}; e.g.,
284: \verb|map[c]&BLANK| is non-zero if \verb|c| is a
285: white-space character (but not a newline).
286: \verb|lcc| generates four VAX instructions for the body of this loop:
287: \begin{verbatim}
288: jbr L142
289: L141: incl r11
290: L142: cvtbl (r11),r5
291: bicl3 $-2,_map[r5],r5
292: jneq L141
293: \end{verbatim}
294: \verb|rcp| is register \verb|r11|.
295: Some optimizing compilers can make similar improvements locally, but not across
296: potentially aliased assignments and calls to other, irrelevant functions.
297:
298: Keywords are recognized by a hard-coded decision tree, e.g.,
299: \begin{verbatim}
300: case 'i':
301: if (rcp[0] == 'f'
302: && !(map[rcp[1]]&(DIGIT|LETTER))) {
303: cp = rcp + 1;
304: return IF;
305: }
306: if (rcp[0] == 'n'
307: && rcp[1] == 't'
308: && !(map[rcp[2]]&(DIGIT|LETTER))) {
309: cp = rcp + 2;
310: return INT;
311: }
312: goto id;
313: \end{verbatim}
314: \verb|IF| and \verb|INT| are defined as the token codes for
315: the keywords \verb|if| and \verb|int|, respectively, and
316: \verb|id| labels the code that scans identifiers.
317: This code is generated automatically by a 50-line C program
318: and included in the lexical analyzer during compilation.
319:
320: The VAX code generated for this fragment follows;
321: again, \verb|r11| is \verb|rcp|.
322: \begin{verbatim}
323: L347: cmpb (r11),$102
324: jneq L348
325: cvtbl 1(r11),r5
326: bicl3 $-13,_map[r5],r5
327: jneq L348
328: addl3 $1,r11,_cp
329: movl $77,r0
330: ret
331: L348: cmpb (r11),$110
332: jneq L226
333: cmpb 1(r11),$116
334: jneq L226
335: cvtbl 2(r11),r5
336: bicl3 $-13,_map[r5],r5
337: jneq L226
338: addl3 $2,r11,_cp
339: movl $5,r0
340: ret
341: \end{verbatim}
342: Thus, the keyword \verb|int| is recognized by less than
343: a dozen instructions, many less than are executed
344: when a table is searched for keywords, even if perfect hashing is used.
345:
346: As in other compilers~\cite{aho:sethi:ullman:86},
347: strings that must be saved (identifiers and string literals)
348: are hashed into a table in which a string appears only once,
349: which saves space. For performance, there are variants for installing strings of digits
350: and strings of known length.
351: After installation, strings are known by their addresses
352: and the characters are accessed only for output.
353: For example, looking a name up in the symbol table is
354: done by hashing on the address of the name; string comparison is unnecessary.
355:
356: \subsection{Symbol Tables}
357:
358: Fast symbol table manipulation also contributes to
359: \verb|lcc|'s speed. It took several versions
360: of the symbol table module to arrive at the current one, however.
361:
362: Symbols are represented with structures defined by
363: \begin{verbatim}
364: struct symbol {
365: char *name; /* symbol name */
366: int scope; /* scope level */
367: ...
368: };
369: \end{verbatim}
370: The symbol table module uses hash
371: tables for symbol tables; the initial version used
372: a single table for all scopes, i.e.,
373: \begin{verbatim}
374: struct entry {
375: struct symbol sym; /* this symbol */
376: struct entry *link; /* next entry on hash chain */
377: };
378: struct table {
379: struct entry *buckets[HASHSIZE]; /* hash buckets */
380: };
381: \end{verbatim}
382: Symbols are wrapped in \verb|entry| structures to keep
383: the linkage information private to the symbol table module.
384:
385: Scope entry required no code.
386: Each new symbol was added to the head of its hash chain
387: and thereby hid symbols with the same names,
388: which appeared further down on the same chains.
389: At scope exit, however, entries at the current
390: scope level, indicated by the value of \verb|level|,
391: were removed from the table \verb|*tp| by the code
392: \begin{verbatim}
393: for (i = 0; i < HASHSIZE; i++) {
394: struct entry *p = tp->buckets[i];
395: while (p && p->sym.scope == level)
396: p = p->link;
397: tp->buckets[i] = p;
398: }
399: \end{verbatim}
400: Measurements revealed that this code accounted for over
401: 5~percent of \verb|lcc|'s execution time on typical input.
402: This code scanned the hash buckets even for scopes that
403: introduce no new symbols, which are common in C.
404:
405: The second version of the symbol table module used
406: a separate hash table for each scope level:
407: \begin{verbatim}
408: struct table {
409: struct table *previous; /* table at lower scope */
410: struct entry *buckets[HASHSIZE]; /* hash buckets */
411: };
412: \end{verbatim}
413: Searching for a symbol took the same number
414: of comparisons, but also required a traversal
415: of the list of separate tables, e.g.,
416: \begin{verbatim}
417: struct symbol *lookup(char *name, struct table *tp) {
418: struct entry *p;
419: unsigned h = ((unsigned)name)&(HASHSIZE-1);
420:
421: do
422: for (p = tp->buckets[h]; p; p = p->link)
423: if (name == p->sym.name)
424: return &p->sym;
425: while (tp = tp->previous);
426: return 0;
427: }
428: \end{verbatim}
429: Notice that symbol names are compared by simply
430: comparing addresses as explained in the previous section.
431: Despite the conventional wisdom about hashing functions~\cite{sedgewick88},
432: using a power of two for \verb|HASHSIZE| gave
433: better performance; using a prime instead and modulus in place
434: of masking slowed \verb|lcc|.
435:
436: This variation reduced the scope exit code to
437: \begin{verbatim}
438: tp = tp->previous
439: \end{verbatim}
440: for table \verb|*tp|. Unfortunately, scope entry
441: then required allocation and initialization of a table:
442: \begin{verbatim}
443: struct table *new = (struct table *)alloc(sizeof *new);
444: new->previous = tp;
445: for (i = 0; i < HASHSIZE; i++)
446: new->buckets[i] = 0;
447: tp = new;
448: \end{verbatim}
449: So, the time wasted at scope exit in the first version
450: was traded for a similar waste at scope entry in the second version.
451:
452: The symbol table module in actual use avoids
453: this waste by lazy allocation and initialization
454: of tables. Tables include their associated scope level:
455: \begin{verbatim}
456: struct table {
457: int level; /* scope level for this table */
458: struct table *previous; /* table at lower scope */
459: struct entry *buckets[HASHSIZE]; /* hash buckets */
460: };
461: \end{verbatim}
462: New tables are allocated and initialized only
463: when a symbol is installed:
464: \begin{verbatim}
465: struct symbol *install(char *name, struct table **tpp) {
466: unsigned h = ((unsigned)name)&(HASHSIZE-1);
467: struct table *tp = *tpp;
468: struct entry *p = (struct entry *)alloc(sizeof *p);
469:
470: if (tp->level < level) {
471: int i;
472: struct table *new = (struct table *)alloc(sizeof *new);
473: new->previous = tp;
474: new->level = level;
475: for (i = 0; i < HASHSIZE; i++)
476: new->buckets[i] = 0;
477: *tpp = tp = new;
478: }
479: p->sym.name = name;
480: p->sym.scope = tp->level;
481: ...
482: p->link = tp->buckets[h];
483: tp->buckets[h] = p;
484: return &p->sym;
485: }
486: \end{verbatim}
487: Since few scopes in C, which are delimited by compound statements,
488: declare new symbols, the lazy allocation code above is rarely
489: executed and entry to most scopes is nearly free.
490: The scope exit code must check before
491: discarding a table, but remains simple:
492: \begin{verbatim}
493: if (tp->level == level)
494: tp = tp->previous;
495: \end{verbatim}
496:
497: This design also simplifies access to separate tables.
498: For example, the table that holds globals is at the end
499: of the list of identifier tables; by making it
500: the value of \verb|globals|, symbols can be
501: installed into it directly.
502: In the initial implementation, a global declared at a nested
503: scope had to be inserted in the middle of its hash chain.
504:
505: \subsection{Storage Management}
506:
507: Allocation and deallocation in early versions of \verb|lcc|
508: accounted for a significant portion of the total execution time.
509: Replacing the naive use of \verb|malloc| and \verb|free|
510: reduced total execution time by about 8--10~percent.
511: As detailed in Reference~\cite{hanson90},
512: allocation is based on the lifetime of the objects allocated,
513: and all objects with the same lifetime are freed at once.
514:
515: This approach to storage management simplified \verb|lcc|'s code.
516: Initially, each object type had explicit deallocation code, perhaps
517: replicated at several points. Some of this code was intricate, e.g.,
518: involving complex loops or recursive data structure traversals.
519: Allocation incurred an obligation to provide the necessary
520: deallocation code, so there was a tendency to use
521: algorithms that avoided allocation,
522: perhaps at the expense of time, complexity, and flexibility.
523: And it was easy to forget deallocation, resulting in storage leaks.
524:
525: The current scheme eliminated nearly all explicit deallocation code,
526: which simplified the compiler and eliminated storage leaks.
527: More importantly, it encouraged the use of simple applicative algorithms,
528: e.g., in rewriting trees.
529: The replacements cost space, but not time, since
530: allocation and deallocation are nearly free.
531: Besides contributing to fast compilation, the other visible
532: benefit of this approach is that \verb|lcc| imposes few
533: arbitrary limits on its input; e.g., it permits
534: any number of cases in switch statements,
535: any number of parameters and locals, block nesting to any depth,
536: expressions of arbitrary complexity, initializations of arbitrary size, etc.
537: These quantities are limited only by the memory available.
538:
539:
540: \section{Optimization}
541:
542: \verb|lcc| is not properly called an ``optimizing'' compiler
543: because it does no global optimization, {\em per se}.
544: Its front end does, however, perform some simple, target-independent
545: transformations that help its back ends generate good local code.
546:
547: The front end eliminates local common subexpressions,
548: folds constant expressions, and makes numerous simple
549: transformations that improve the quality of local code~\cite{hanson83}.
550: Many of these improvements are simple tree transformations
551: that lead to better addressing code.
552:
553: The front end lays out loops so as to reduce the number
554: of unconstructive branches~\cite{baskett78}, e.g., the code for
555: \begin{flushleft}
556: \tt for ($e_1$; $e_2$; $e_3$) $S$
557: \end{flushleft}
558: has the form
559: \begin{flushleft}\tt
560: \begin{tabular}{ll}
561: & $e_1$ \\
562: & goto L1 \\
563: L2: & $S$ \\
564: L3: & $e_3$ \\
565: L1: & if ($e_2$) goto L2 \\
566: \end{tabular}
567: \end{flushleft}
568: The \verb|goto L1| is omitted if $e_2$ is initially non-zero.
569: In addition, the front end eliminates branch chains and dead branches.
570:
571: The selection code for switch statements is generated entirely by the front end.
572: It generates a binary search of dense branch tables~\cite{bernstein85},
573: where the density is the percentage of non-default branch table entries.
574: For example, with the default density of 0.5, a switch statement
575: with the case values 1, 2, 6--8, 1001--1004, and 2001--2002 has the
576: following VAX selection code.
577: Register \verb|r4| holds the value of the switch expression,
578: \verb|L3|--\verb|15| label the statements for the case values above,
579: and \verb|L1| is the default label.
580: \begin{verbatim}
581: cmpl r4,$1001
582: jlss L17
583: cmpl r4,$1004
584: jgtr L16
585: movl _18-4004[r4],r5
586: jmp (r5)
587: _18: .long L8
588: .long L9
589: .long L10
590: .long L11
591: L17: cmpl r4,$1
592: jlss L1
593: cmpl r4,$8
594: jgtr L1
595: movl _21-4[r4],r5
596: jmp (r5)
597: _21: .long L3
598: .long L4
599: .long L1
600: .long L1
601: .long L1
602: .long L5
603: .long L6
604: .long L7
605: L16: cmpl r4,$2001
606: jlss L1
607: cmpl r4,$2004
608: jgtr L1
609: movl _24-8004[r4],r5
610: jmp (r5)
611: _24: .long L12
612: .long L13
613: .long L14
614: .long L15
615: \end{verbatim}
616: The density can be changed by a command-line option;
617: e.g., \verb|-d0| yields a single branch
618: table for each switch statement, and \verb|-d1| requires that all
619: branch tables be fully populated.
620:
621: Finally, the front end simulates register declarations for
622: all scalar parameters and locals that are referenced at least
623: 3 times and do not have their addresses taken explicitly.
624: Locals are announced to the back ends with
625: explicitly declared \verb|register| locals followed by
626: the remaining locals in the order of decreasing frequency of use.
627: Each top-level occurrence of an identifier
628: counts as 1 reference. Occurrences in a loop,
629: either of the then/else arms of an if statement, or a case
630: in a switch statement each count, respectively, as 10, $1/2$, or $1/10$ references.
631: These values are adjusted to account for nested control structures.
632: The next section describes how these estimated counts
633: may be replaced with counts from an actual profile.
634:
635: This scheme simplifies register assignment in the back ends,
636: and explicit \verb|register| declarations are rarely necessary.
637: For example,
638: \begin{verbatim}
639: strcpy(char *s1, char *s2) { while (*s1++ = *s2++); }
640: \end{verbatim}
641: yields the VAX code
642: \begin{verbatim}
643: _strcpy:.word 0x0
644: movl 4(ap),r4
645: movl 8(ap),r5
646: L26: movb (r5)+,(r4)+
647: jneq L26
648: ret
649: \end{verbatim}
650:
651:
652: \section{Features}
653:
654: \verb|lcc| provides a few noteworthy features that help users develop,
655: debug, and profile ANSI~C programs.
656: For example, an option causes
657: \verb|lcc| to print ANSI-style C declarations for all defined globals and functions.
658: For instance, the code (adapted from Section~6.2 of Reference~\cite{kernighan:ritchie:88})
659: \begin{verbatim}
660: typedef struct point { int x,y; } point;
661: typedef struct rect { point pt1, pt2; } rect;
662:
663: point addpoint(p1, p2) point p1, p2; {
664: p1.x += p2.x;
665: p1.y += p2.y;
666: return p1;
667: }
668: int ptinrect(p, r) point p; rect r; {
669: return p.x >= r.pt1.x && p.x < r.pt2.x
670: && p.y >= r.pt1.y && p.y < r.pt2.y;
671: }
672: \end{verbatim}
673: generates the declarations
674: \begin{verbatim}
675: extern point addpoint(point, point);
676: extern int ptinrect(point, rect);
677: \end{verbatim}
678: Editing such output can simplify conversion to ANSI~C.
679:
680: Another option causes \verb|lcc| to
681: issue warnings for declarations and casts of function types without prototypes.
682: These include pointers to functions, which are easy to overlook
683: when updating pre-ANSI code. For example, it is likely
684: that \verb|char *(_alloc)()| should
685: be updated to be \verb|char *(_alloc)(size_t)|.
686:
687: \subsection{Debugging}
688:
689: \verb|lcc| supports the standard debugger symbol tables on VAXes and Suns.
690: It also has two options of its own to assist in program debugging.
691:
692: Dereferencing zero pointers is a frequent C programming error.
693: On some systems, execution continues until the consequences
694: cause a fault somewhere unrelated to the actual point of error.
695: To help catch such errors, an option causes \verb|lcc|
696: to generate code to test for dereferencing zero pointers. If a zero pointer
697: is detected, the offending file name and line number are reported on the standard error, e.g.,
698: \begin{verbatim}
699: null pointer dereferenced @foo.c:36
700: \end{verbatim}
701: and the program terminates by calling the standard library function \verb|abort|.
702:
703: Some languages provide built-in facilities for tracing function calls
704: and returns~\cite{griswold:griswold:90}.
705: An option instructs \verb|lcc| to generate calls to \verb|printf| (or a
706: user-specified equivalent) just after entry to each function
707: and just before each return.
708: The entry code prints the arguments and the return code prints the value
709: returned. For example, calling the
710: functions shown above would elicit messages like
711: \begin{verbatim}
712: addpoint#2(p1=(point){x=0,y=0},p2=(point){x=10,y=10}) called
713: addpoint#2 returned (point){x=10,y=10}
714: ...
715: ptinrect#1(p=(point){x=-1,y=-1},
716: r=(rect){pt1=(point){x=10,y=10},pt2=(point){x=310,y=310}}) called
717: ptinrect#1 returned 0
718: \end{verbatim}
719: (Long lines have been folded to fit this page.)
720: As illustrated by this output,
721: the messages show the full details of the arguments, including structure contents.
722: The numbers that follow function names, e.g., \verb|#2|,
723: are activation numbers and can help locate a specific call and its return.
724:
725: These debugging options are implemented entirely in the front end and
726: thus are available on all of \verb|lcc|'s targets.
727:
728: \subsection{Profiling}
729:
730: \verb|lcc| supports \verb|prof|-style (viz.~\cite[\verb|prof| command]{bsd84})
731: and \verb|gprof|-style~\cite{graham:kessler:mckusick:83} execution profiling
732: on VAXes and Suns.
733: These profilers sample the location counter periodically
734: to obtain an estimate of the percentage of total execution time
735: spent in each function, and they report the number of calls to each function.
736:
737: Heeding long-standing advice~\cite{knuth71,sites78}, \verb|lcc| also supports
738: frequency-based profiling.
739: An option causes \verb|lcc| to emit counters that record the number
740: of times each {\em expression} is executed, and the values of these counters are
741: written to the file \verb|prof.out| when the program terminates.
742: A companion program, \verb|bprint|, reads \verb|prof.out| and prints
743: the source code annotated with execution counts, e.g.,
744: \begin{verbatim}
745: ...
746: 4 main()
747: 5 <1>{
748: ...
749: 12 <1>queens(0);
750: 13 return <1>0;
751: 14 <1>}
752: 15
753: 16 queens(c)
754: 17 <1965>{
755: 18 int r;
756: 19
757: 20 for (<1965>r = 0; <15720>r < 8; <15720>r++)
758: 21 if (<15720>rows[r] && <5508>up[r-c+7] && <3420>down[r+c]){
759: 22 <2056>rows[r] = up[r-c+7] = down[r+c] = 0;
760: 23 <2056>x[c] = r;
761: 24 if (<2056>c == 7)
762: 25 <92>print();
763: 26 else
764: 27 <1964>queens(c + 1);
765: 28 <2056>rows[r] = up[r-c+7] = down[r+c] = 1;
766: 29 }
767: 30 <1965>}
768: ...
769: \end{verbatim}
770: Execution counts are enclosed in angle brackets.
771: The counts on the outermost braces for \verb|queens|
772: give the number of calls.
773: Line 21 shows the benefit of associating a count
774: with each expression instead of each line;
775: the counts reveal that \verb|up[r-c+7]| was tested only slightly more than one-third
776: of the number of times the if statement was executed.
777: Conditional expressions are annotated similarly.
778:
779: Users sometimes report an ``off-by-one'' bug when they see that
780: \verb|r < 8| in line 20 was executed
781: the same number of times as \verb|r++|.
782: These counts are a consequence of the way \verb|lcc| lays out for loops
783: and eliminates the test before the first iteration, as described above.
784:
785: Data in \verb|prof.out| accumulates, so it is possible to execute
786: a program repeatedly and then have \verb|bprint| display the
787: cumulative frequencies.
788: This method is particularly useful for developing
789: test data that exercises all parts of a program:
790: \verb|<0>| highlights untested code.
791:
792: Another option causes \verb|lcc| to read \verb|prof.out|
793: and use the counts therein to compute the frequency of use
794: of each identifier instead of using the estimates
795: described in the previous section. Doing so may reduce the
796: number of uses for identifiers that appear in
797: loops that rarely executed more than once,
798: and increase the number of uses for those that appear
799: in then/else arms that are executed most of the time.
800:
801: Complex preprocessor macros can obscure \verb|bprint|'s presentation.
802: It necessarily uses post-expansion source coordinates
803: to annotate pre-expansion source files.
804:
805: Profiling code also records the number of calls made
806: from each call site, which can be used to reconstruct the dynamic
807: call graph. \verb|bprint| prints a line for each edge, e.g.,
808: \begin{verbatim}
809: 1 queens from main in 8q.c:12.8
810: 1964 queens from queens in 8q.c:27.11
811: 92 print from queens in 8q.c:25.10
812: \end{verbatim}
813: This output shows that all but one of the calls to \verb|queens| was from
814: the call at character 11 in line 27.
815: This kind of data is particularly helpful in identifying
816: hot spots that are caused by inappropriate calls to a function
817: instead of inefficiencies within the function itself.
818: Such data can also help identify functions that might profitably
819: be replaced with two functions so that one can handle
820: the common case more efficiently~\cite[Sec.~5.3]{bentley82}.
821:
822: Expression execution frequency profiling is implemented entirely by the
823: front end. The only machine dependency is the name of the ultimate
824: termination function in the revised \verb|exit| function that writes
825: \verb|prof.out| at program termination.
826:
827: The implementation is a machine-independent variation of the method
828: described in Reference~\cite{weinberger84}. The front end generates an
829: array of counters for each file and starts each expression with code to
830: increment the appropriate counter. In also builds a parallel array that
831: holds the source coordinates corresponding to each counter. At the
832: entry point of each function, the front end generates the equivalent of
833: \begin{verbatim}
834: if (!_yylink.link) {
835: extern struct _bbdata *_bblist;
836: _yylink.link = _bblist;
837: _bblist = &yylink;
838: }
839: _prologue(&callee);
840: \end{verbatim}
841: A \verb|_bbdata| structure is generated for each file:
842: \begin{verbatim}
843: static struct _bbdata {
844: struct _bbdata *link;
845: unsigned npoints;
846: unsigned *counts;
847: unsigned *coords;
848: struct func *funcs;
849: } _yylink;
850: \end{verbatim}
851: The \verb|counts| and \verb|coords| fields point the arrays mentioned above,
852: which each have \verb|npoints| entries.
853: The entry point code uses the \verb|link| field to
854: add each file's \verb|_bbdata| structure
855: to the list headed by \verb|_bblist|, which the revised \verb|exit| function walks
856: to emit \verb|prof.out|.
857:
858: \verb|_prologue| accumulates the dynamic call graph.
859: It is passed one of the \verb|func| structures --- one for each function --- that appear
860: on the list emanating from \verb|_yylink.funcs|:
861: \begin{verbatim}
862: struct func {
863: struct func *link;
864: struct caller {
865: struct caller *link;
866: struct callsite *caller;
867: unsigned count;
868: } *callers;
869: char *name;
870: unsigned coord;
871: };
872: \end{verbatim}
873: The \verb|name| and \verb|coord| fields give the function's name and beginning source
874: coordinate, respectively.
875: \verb|callers| points to a list of \verb|caller| structures, one for each
876: call site. Each \verb|caller| structure records the number of calls
877: from the caller's \verb|callsite|:
878: \begin{verbatim}
879: struct callsite {
880: char *file;
881: char *name;
882: unsigned coord;
883: };
884: \end{verbatim}
885: \verb|caller| structures are allocated at execution time
886: and point to \verb|callsite|s, which are generated by the front end
887: at compile time.
888:
889: Just before each call, the front end generates an assignment
890: of a pointer to a \verb|callsite| structure to the global variable \verb|_caller|.
891: \verb|_prologue| uses \verb|_caller| to record an edge in the dynamic call graph.
892: If a record of the caller already exists, its count is simply incremented.
893: Otherwise, a \verb|caller| structure is allocated and prefixed to
894: the callee's list of callers.
895: \begin{verbatim}
896: _prologue(struct func *callee) {
897: static struct caller callers[4096];
898: static int next;
899: struct caller *p;
900:
901: for (p = callee->callers; p; p = p->link)
902: if (p->caller == _caller) {
903: p->count++;
904: break;
905: }
906: if (!p && next < sizeof callers/sizeof callers[0]) {
907: p = &callers[next++];
908: p->caller = _caller;
909: p->count = 1;
910: p->link = callee->callers;
911: callee->callers = p;
912: }
913: _caller = 0;
914: }
915: \end{verbatim}
916:
917: Profiling can be restricted to only those files of interest.
918: The counts printed by \verb|bprint| will be correct,
919: but some edges may be omitted from the call graph.
920: For example, if \verb|f| calls \verb|g| calls \verb|h|
921: and \verb|f| and \verb|h| are compiled with profiling, but \verb|g| is not,
922: \verb|bprint| will report that \verb|f| called \verb|h|.
923: The total number of calls to each function is correct, however.
924:
925: \section{Performance}
926:
927: \verb|lcc| emits local code that is comparable
928: to that emitted by the generally available alternatives.
929: Table~\ref{spec:execution} summarizes the results of compiling
930: and executing the C programs in the SPEC benchmarks~\cite{spec89}
931: with three compilers on the four machines listed above.
932: Configuration details are listed with each machine.
933: \verb|cc| and \verb|gcc| denote, respectively,
934: the manufacturer's C compiler and the
935: GNU~C compiler from the Free Software Foundation.
936: The times are elapsed time in seconds and are the lowest
937: elapsed times over several runs on lightly loaded machines.
938: All reported runs achieved at least 97 percent utilization (i.e., the ratio of times
939: $({\it user} + {\it system})/{\it elapsed} \ge 0.97$).
940:
941: The entries with \verb|-O| indicate compilation with the
942: ``default'' optimization, which often includes some global optimizations.
943: \verb|lcc| performs no global optimizations.
944: The \verb|gcc| and \verb|gcc -O| figures for {\tt gcc1.35} on the MIPS
945: are missing because this benchmark did not execute correctly
946: when compiled with \verb|gcc|.
947:
948: \newcommand{\0}{\hspace{0.5em}}
949: \begin{table}
950: \begin{center}
951: \begin{tabular}{lcccc}
952: &\multicolumn{4}{c}{\it benchmark}\\
953: \it compiler & 1. \tt gcc1.35& 8. \tt espresso& 22. \tt li & 23. \tt eqntott \\ \hline
954: %\makebox[0pt][l]{VAX: VAX 8650 w/36MB running 4.3BSD UNIX} \\% % megaron
955: % v1.3
956: %\tt lcc & 345 & 588 & 1315 & 296 \\
957: %\tt cc & 350 & 504 & 1471 & 334 \\
958: %\tt gcc & 304 & 501 & 1316 & 327 \\ %v1.36
959: %\tt cc -O & 320 & 525 & 1486 & 325 \\
960: %\tt gcc -O & 283 & 466 & 1272 & 201 \\[2ex]
961: \\[-.5ex]
962: \makebox[0pt][l]{VAX: \small MicroVAX II w/16MB running Ultrix 3.1} \\% v1.3; ffserver
963: \tt lcc & 1734 & 2708 & 7015 & 3532 \\
964: \tt cc & 1824 & 2782 & 7765 & 3569 \\
965: \tt gcc & 1439 & 2757 & 7353 & 3263 \\ %v1.36
966: \tt cc -O & 1661 & 2653 & 7086 & 3757 \\
967: \tt gcc -O & 1274 & 2291 & 6397 & 1131 \\[2ex]
968:
969: \makebox[0pt][l]{68020: \small Sun 3/60 w/24MB running SunOS 4.0.3} \\% v1.3; hemlock
970: \tt lcc & 544 & 1070 & 2591 & 567 \\
971: \tt cc & 514 & 1005 & 3308 & 619 \\
972: \tt gcc & 426 & 1048 & 2498 & 591 \\ %v1.35
973: \tt cc -O & 428 &\0882 & 2237 & 571 \\
974: \tt gcc -O & 337 &\0834 & 1951 & 326 \\[2ex]
975:
976: \makebox[0pt][l]{MIPS: \small IRIS 4D/220GTX w/32MB running IRIX 3.3.1} \\% v1.3; oyoy
977: \tt lcc & 116 & 150 & 352 & 111 \\
978: \tt cc & 107 & 153 & 338 & 100 \\
979: \tt gcc & & 188 & 502 & 132 \\ %v1.36
980: \tt cc -O &\092 & 130 & 299 &\070 \\
981: \tt gcc -O & & 145 & 411 & 112 \\[2ex]
982:
983: %\makebox[0pt][l]{SPARC:\hspace{1em}}% phoenix: Sun 4/490 w/128MB, SunOS 4.1
984: %\tt lcc & 106 & 218 & 436 & 129 \\
985: %\tt cc &\099 & 189 & 589 & 133 \\
986: %\tt gcc &\094 & 205 & 589 & 131 \\ %v1.36
987: %\tt cc -O &\078 & 154 & 351 &\089 \\
988: %\tt gcc -O &\064 & 153 & 384 &\092 \\
989: %
990: \makebox[0pt][l]{SPARC: \small Sun 4/260 w/32MB running SunOS 4.0.3}\\% v1.3; python
991: \tt lcc & 196 & 370 &\0790 & 209 \\
992: \tt cc & 203 & 381 & 1094 & 275 \\
993: \tt gcc & 186 & 411 & 1139 & 256 \\ %v1.34
994: \tt cc -O & 150 & 296 &\0707 & 183 \\
995: \tt gcc -O & 127 & 309 &\0788 & 179 \\
996: \end{tabular}
997: \end{center}
998: \caption{Execution Time for C SPEC Benchmarks in Seconds.\label{spec:execution}}
999: \end{table}
1000:
1001: \verb|lcc| is faster than many (but not all~\cite{thompson90}) other C compilers.
1002: Table~\ref{spec:compilation} parallels Table~\ref{spec:execution},
1003: but shows compilation time instead of execution time.
1004: Except for the MIPS, the times are for running only the compiler proper;
1005: preprocessing, assembly, and linking time are not included.
1006: Two times are given for the MIPS because the manufacturer's \verb|cc| front end
1007: consists of two programs; the first translates C to ``u-code'' and the
1008: second generates object code. Generating assembly language
1009: costs {\em more} than generating object code, so Table~\ref{spec:compilation} gives
1010: both times for all compilers.
1011: The last row in Table~\ref{spec:compilation} lists the number of non-blank
1012: lines and the total number of lines in each benchmark {\em after} preprocessing.
1013:
1014: \begin{table}
1015: \begin{center}
1016: \begin{tabular}{lcccc}
1017: &\multicolumn{4}{c}{\it benchmark}\\
1018: \it compiler & 1. \tt gcc1.35& 8. \tt espresso& 22. \tt li & 23. \tt eqntott \\ \hline
1019: %\makebox[0pt][r]{VAX:\hspace{1em}}% megaron
1020: %\tt lcc & 189 &\059 & 31 &\09 \\
1021: %\tt cc & 405 & 114 & 40 & 26 \\
1022: %\tt gcc & 505 & 222 & 74 & 45
1023: \\[-.5ex]
1024: \makebox[0pt][l]{VAX:}\\% v1.3; ffserver
1025: \tt lcc &\0792 & 237 &\069 & 36 \\
1026: \tt cc & 1878 & 576 & 174 & 79 \\
1027: \tt gcc & 1910 & 637 & 192 & 86 \\[2ex]
1028:
1029: \makebox[0pt][l]{68020:}\\% v1.3; hemlock
1030: \tt lcc & 302 &\090 & 28 & 15 \\
1031: \tt cc & 507 & 168 & 52 & 29 \\
1032: \tt gcc & 599 & 196 & 56 & 27 \\[2ex]
1033:
1034: \makebox[0pt][l]{MIPS:}\\% v1.3; oyoy
1035: \tt lcc &\097 195 &\035 \063 & 10 24 &\06 16\\
1036: \tt cc & 318 177 & 104 \068 & 40 26 & 24 19 \\
1037: \tt gcc & 320 391 &\088 118 & 28 42 & 13 24 \\[2ex]
1038:
1039: \makebox[0pt][l]{SPARC:}\\% v1.3; python
1040: \tt lcc & 103 &\038 & 12 & \08 \\
1041: \tt cc & 175 &\060 & 18 & 11 \\
1042: \tt gcc & 313 & 100 & 31 & 16 \\[2ex]
1043:
1044: line counts & 79102/250496 & 25717/58516 & 7070/22494 & 2680/6569 \\
1045: \end{tabular}
1046: \end{center}
1047: \caption{Compilation Time for C SPEC Benchmarks in Seconds.\label{spec:compilation}}
1048: \end{table}
1049:
1050: \verb|lcc| is smaller than other compilers.
1051: Table~\ref{sizes} lists the sizes of the three compilers in kilobytes.
1052: Each entry is the sum of sizes of the program and data segments
1053: for the indicated compiler as reported by the UNIX \verb|size| command.
1054:
1055: \begin{table}
1056: \begin{center}
1057: \begin{tabular}{lcccc}
1058: \it compiler & VAX & 68020 & MIPS & SPARC \\ \hline
1059: \tt lcc & 181 & 244 & 280 & 276 \\ %v1.3
1060: \tt cc & 256 & 306 & 616 & 402 \\
1061: \tt gcc & 378 & 507 & 777 & 689 \\
1062: \end{tabular}
1063: \end{center}
1064: \caption{Sizes of Compiler Executables in Kilobytes.\label{sizes}}
1065: \end{table}
1066:
1067: %\bibliography{refs,lib}
1068: \begin{thebibliography}{10}
1069:
1070: \bibitem{aho:sethi:ullman:86}
1071: A.~V. Aho, R.~Sethi, and J.~D. Ullman.
1072: \newblock {\em Compilers: Principles, Techniques, and Tools}.
1073: \newblock Addison Wesley, Reading, MA, 1986.
1074:
1075: \bibitem{ansi:Cstandard}
1076: American National Standard Institute, Inc., New York.
1077: \newblock {\em American National Standards for Information Systems, Programming
1078: Language C {ANSI} X3.159--1989}, 1990.
1079:
1080: \bibitem{baskett78}
1081: F.~Baskett.
1082: \newblock The best simple code generation technique for while, for, and do
1083: loops.
1084: \newblock {\em SIGPLAN Notices}, 13(4):31--32, Apr. 1978.
1085:
1086: \bibitem{bentley82}
1087: J.~L. Bentley.
1088: \newblock {\em Writing Efficient Programs}.
1089: \newblock Prentice Hall, Englewood Cliffs, NJ, 1982.
1090:
1091: \bibitem{bernstein85}
1092: R.~L. Bernstein.
1093: \newblock Producing good code for the case statement.
1094: \newblock {\em Software---Practice \& Experience}, 15(10):1021--1024, Oct.
1095: 1985.
1096:
1097: \bibitem{bsd84}
1098: Computer Science Division, Department of Electrical Engineering and Computer
1099: Science, University of California, Berkeley, CA.
1100: \newblock {\em UNIX User's Manual, Reference Guide}, virtual {VAX}-11 version
1101: edition, Mar. 1984.
1102:
1103: \bibitem{fraser:sigplan89}
1104: C.~W. Fraser.
1105: \newblock A language for writing code generators.
1106: \newblock {\em Proceedings of the SIGPLAN'89 Conference on Programming Language
1107: Design and Implementation, SIGPLAN Notices}, 24(7):238--245, July 1989.
1108:
1109: \bibitem{fraser:hanson:91a}
1110: C.~W. Fraser and D.~R. Hanson.
1111: \newblock A code generation interface for {ANSI C}.
1112: \newblock {\em Software---Practice \& Experience}, 21(9):963--988, Sept. 1991.
1113:
1114: \bibitem{fraser:hanson:92}
1115: C.~W. Fraser and D.~R. Hanson.
1116: \newblock Simple register spilling in a retargetable compiler.
1117: \newblock {\em Software---Practice \& Experience}, 22(1):85--99, Jan. 1992.
1118:
1119: \bibitem{graham:kessler:mckusick:83}
1120: S.~L. Graham, P.~B. Kessler, and M.~K. McKusick.
1121: \newblock An execution profiler for modular programs.
1122: \newblock {\em Software---Practice \& Experience}, 13(8):671--685, Aug. 1983.
1123:
1124: \bibitem{griswold:griswold:90}
1125: R.~E. Griswold and M.~T. Griswold.
1126: \newblock {\em The Icon Programming Language}.
1127: \newblock Prentice Hall, Englewood Cliffs, NJ, second edition, 1990.
1128:
1129: \bibitem{hanson83}
1130: D.~R. Hanson.
1131: \newblock Simple code optimizations.
1132: \newblock {\em Software---Practice \& Experience}, 13(8):745--763, Aug. 1983.
1133:
1134: \bibitem{hanson90}
1135: D.~R. Hanson.
1136: \newblock Fast allocation and deallocation of memory based on object lifetimes.
1137: \newblock {\em Software---Practice \& Experience}, 20(1):5--12, Jan. 1990.
1138:
1139: \bibitem{kernighan:ritchie:88}
1140: B.~W. Kernighan and D.~M. Ritchie.
1141: \newblock {\em The C Programming Language}.
1142: \newblock Prentice Hall, Englewood Cliffs, NJ, second edition, 1988.
1143:
1144: \bibitem{knuth71}
1145: D.~E. Knuth.
1146: \newblock An empirical study of {FORTRAN} programs.
1147: \newblock {\em Software---Practice \& Experience}, 1(2):105--133, Apr. 1971.
1148:
1149: \bibitem{sedgewick88}
1150: R.~Sedgewick.
1151: \newblock {\em Algorithms}.
1152: \newblock Addison Wesley, Reading, MA, 1988.
1153:
1154: \bibitem{sites78}
1155: R.~L. Sites.
1156: \newblock Programming tools: Statement counts and procedure timings.
1157: \newblock {\em SIGPLAN Notices}, 13(12):98--101, Dec. 1978.
1158:
1159: \bibitem{spec89}
1160: Standards Performance Evaluation Corp.
1161: \newblock {\em {SPEC} Benchmark Suite Release 1.0}, Oct. 1989.
1162:
1163: \bibitem{thompson90}
1164: K.~Thompson.
1165: \newblock A new {C} compiler.
1166: \newblock In {\em Proceedings of the Summer 1990 UKUUG Conference}, pages
1167: 41--51, London, July 1990.
1168:
1169: \bibitem{waite86}
1170: W.~M. Waite.
1171: \newblock The cost of lexical analysis.
1172: \newblock {\em Software---Practice \& Experience}, 16(5):473--488, May 1986.
1173:
1174: \bibitem{weinberger84}
1175: P.~J. Weinberger.
1176: \newblock Cheap dynamic instruction counting.
1177: \newblock {\em Bell System Technical Journal}, 63(8):1815--1826, Oct. 1984.
1178:
1179: \end{thebibliography}
1180:
1181: \end{document}
This archive runs on limited infrastructure. Preserving old code on modern bandwidth. Automated agents are requested to crawl responsibly.