Annotation of researchv10no/cmd/lcc/doc/overview.tex, revision 1.1.1.1

1.1       root        1: \documentstyle{article}
                      2: 
                      3: \begin{titlepage}
                      4: \normalsize
                      5: \vspace*{.7in}
                      6: \begin{center}
                      7: \begin{tabular}{c}
                      8: \bf\Large A Retargetable Compiler for ANSI C \\[.5in]
                      9: 
                     10: Christopher W. Fraser \\
                     11: \em AT\&T Bell Laboratories, 600 Mountain Avenue, \\
                     12: \em Murray Hill, NJ 07974 \\[1ex]
                     13: and \\[1ex]
                     14: David R. Hanson \\
                     15: \em Department of Computer Science, Princeton University, \\
                     16: \em Princeton, NJ 08544 \\[.7in]
                     17: 
                     18: Research Report CS-TR-303-91 \\[1ex]
                     19: February 1991 \\[.5in]
                     20: 
                     21: \end{tabular}
                     22: \end{center}
                     23: 
                     24: \begin{abstract}
                     25: \normalsize
                     26: \verb|lcc| is a new retargetable compiler for ANSI C.
                     27: Versions for the VAX, Motorola 68020, SPARC, and MIPS
                     28: are in production use at Princeton University
                     29: and at AT\&T Bell Laboratories.
                     30: With a few exceptions,
                     31: little about \verb|lcc| is unusual --- it integrates
                     32: several well engineered, existing techniques ---
                     33: but it is smaller and faster than most other C compilers,
                     34: and it generates code of comparable quality.
                     35: \verb|lcc|'s target-independent front end performs a
                     36: few simple, but effective, optimizations that contribute
                     37: to good code; examples include simulating register declarations
                     38: and partitioning switch statement cases into dense tables.
                     39: It also implements target-independent function tracing
                     40: and expression-level profiling.
                     41: \end{abstract}
                     42: 
                     43: \end{titlepage}
                     44: 
                     45: \begin{document}
                     46: %\bibliographystyle{abbrv}
                     47: 
                     48: \setcounter{secnumdepth}{0}
                     49: 
                     50: \section{Introduction}
                     51: 
                     52: \verb|lcc| is a new retargetable compiler for ANSI C~\cite{ansi:Cstandard}.
                     53: It has been ported to the VAX, Motorola 68020, SPARC, and MIPS R3000,
                     54: and it is in production use at Princeton University
                     55: and at AT\&T Bell Laboratories.
                     56: When used with a compliant preprocessor and library,
                     57: \verb|lcc| passes the conformance section of
                     58: Version 2.00 of the Plum-Hall Validation Suite for ANSI~C.%
                     59: \footnote{The {\tt lcc} front end and a sample code generator are available
                     60: for anonymous {\tt ftp} from {\tt princeton.edu}.
                     61: The file {\tt README} in the directory {\tt pub/lcc} gives details.
                     62: It also describes the current availability of {\tt lcc}'s
                     63: production code generators.}
                     64: 
                     65: Other reports describe \verb|lcc|'s storage manager~\cite{hanson90},
                     66: intermediate language~\cite{fraser:hanson:91a}, code
                     67: generator~\cite{fraser:sigplan89}, and register
                     68: manager~\cite{fraser:hanson:92}.  This report surveys the
                     69: remaining features of \verb|lcc| that may interest some readers.  Chief
                     70: among these are its performance, some aspects of its design that
                     71: support this performance, and some features for debugging and profiling
                     72: user code.
                     73: 
                     74: \section{Design}
                     75: 
                     76: With a few exceptions, \verb|lcc| uses
                     77: well established compiler techniques.
                     78: The front end performs lexical, syntactic, and semantic analysis, and
                     79: some machine-independent optimizations, which are described below.
                     80: Both the lexical analyzer and the recursive-descent parser are hand-written.
                     81: Theoretically, this approach complicates both future changes and fixing errors,
                     82: but accommodating change is less important for a standardized language like ANSI~C,
                     83: and there have been few lexical or syntactic errors.
                     84: Indeed, less than 15~percent of \verb|lcc|'s code concerns parsing,
                     85: and the error rate in that code is negligible.
                     86: Despite its theoretical prominence,
                     87: parsing is a relatively minor component in \verb|lcc| as in other compilers;
                     88: semantic analysis, optimization, and code generation are the major components
                     89: and account for most of the code and most of the errors.
                     90: 
                     91: The target-independent front end and a target-dependent back end are
                     92: packaged as single program, tightly coupled by a
                     93: compact, efficient interface.
                     94: The interface consists of a few shared data structures,
                     95: 17 functions, and a 36-operator dag language.
                     96: The functions emit function prologues,
                     97: define globals, emit data, etc., and most are simple.
                     98: The dag language encodes the executable code from a source program; it
                     99: corresponds to the ``intermediate language'' used in other compilers, but
                    100: it is smaller than typical intermediate languages.
                    101: Reference~\cite{fraser:hanson:91a} describes the interface.
                    102: 
                    103: Code generators are generated automatically from compact, rule-based
                    104: specifications~\cite{fraser:sigplan89}.  Some rules rewrite
                    105: intermediate code as naive assembly code.  Others peephole-optimize the
                    106: result.  They are compiled into a monolithic hard-coded program that
                    107: accepts dags annotated with intermediate code, and generates,
                    108: optimizes, and emits code for the target machine.  Hard-coding
                    109: contributes significantly to \verb|lcc|'s speed.
                    110: 
                    111: Table~\ref{linecounts} shows the number of lines
                    112: in each of \verb|lcc|'s components. The notation $h+c$ indicates
                    113: $h$ lines of definitions in ``header'' files and $c$ lines of C code.
                    114: The ``back-end support'' is back-end code that
                    115: shared by four back ends, e.g., initialization and
                    116: most of the register manager.
                    117: 
                    118: \begin{table}
                    119: \begin{center} % for v1.3
                    120: \newlength{\W}\settowidth{\W}{symbol table emitters\ }
                    121: \begin{tabular}{l|rcc}
                    122:                        &                       &               & \it generated \\
                    123: \hfil\it component     & \omit\hfil\it code\hfil& \it rules    & \it code generator \\ \hline
                    124: front end              & 968+7847 \\[2ex]
                    125: back-end support       & 114+741 \\
                    126: VAX                    & 35+170                & 178           & 5782 \\
                    127: MIPS                   & 40+378                & 104           & 2966 \\
                    128: 68020                  & 42+190                & 145           & 8301 \\
                    129: SPARC                  & 40+290                & 128           & 3888 \\[2ex]
                    130: \parbox{\W}{VAX+68020+SPARC\\[-2pt]
                    131: symbol table emitters} & 584 \\[2ex]
                    132: \parbox{\W}{naive VAX\\[-2pt]
                    133: code generator}                & 67+578 \\[2ex]
                    134: rule compiler          & 1285 \\
                    135: \end{tabular}
                    136: \end{center}
                    137: \caption{Number of Lines in {\tt lcc} Components.\label{linecounts}}
                    138: \end{table}
                    139: 
                    140: 
                    141: Target-specific files include a
                    142: configuration header file, which defines parameters like
                    143: the widths and alignments of the basic datatypes,
                    144: target-specific interface functions, e.g., those that emit function prologues,
                    145: and code generation rules, from which the code generators are generated
                    146: by the rule compiler, which is written in Icon~\cite{griswold:griswold:90}.
                    147: Retargeting \verb|lcc| requires involves writing these
                    148: three back-end components, which vary from 377 to 522 lines in existing back ends.
                    149: In practice, new back ends are implemented
                    150: by writing new rules and editing copies
                    151: of an existing configuration and set of interface functions.
                    152: 
                    153: All of \verb|lcc|'s production back ends use the technology
                    154: summarized above and detailed in Reference~\cite{fraser:sigplan89}.
                    155: The interface between the front and back end does not depend
                    156: on this technology; other back ends that conform to the interface
                    157: specification can be used. For example, Reference~\cite{fraser:hanson:91a}
                    158: details a hand-written code generator that emits naive VAX code.
                    159: 
                    160: While \verb|lcc| uses well established techniques, it uses some of their
                    161: more recent incarnations, each of which contributes to \verb|lcc|'s efficiency
                    162: as described below.
                    163: 
                    164: \subsection{Lexical Analysis}
                    165: 
                    166: The design of the input module and of
                    167: the lexical analyzer and judicious code tuning
                    168: of the lexical analyzer contribute to 
                    169: \verb|lcc|'s speed.
                    170: 
                    171: Input and lexical analysis use
                    172: variations of the design described in Reference~\cite{waite86}.
                    173: Since the lexical analyzer is the only module that
                    174: inspects every input character, the design
                    175: avoids extraneous per-character processing and
                    176: minimizes character movement by scanning tokens directly out
                    177: of a large input buffer.
                    178: 
                    179: Input is read directly from the operating system
                    180: into a 4096-character buffer as depicted
                    181: in Figure~\ref{fig:buffering}a, and \verb|cp| and \verb|limit|
                    182: delimit the unscanned portion of the buffer.
                    183: The next token is scanned by advancing \verb|cp| across
                    184: white space and switching on the first character of
                    185: the token, \verb|*cp|. \verb|cp| is advanced as the token is recognized.
                    186: 
                    187: \begin{figure}
                    188: \begin{center}
                    189: \setlength{\unitlength}{10pt}
                    190: \begin{picture}(26.3,21)
                    191: \put(0,2){
                    192: \begin{picture}(26.3,7)
                    193: \thicklines
                    194: \put( 0, 3){\framebox(26.3, 2)[r]{\tt \char`\\n}}
                    195: \thinlines
                    196: \put( 5, 3){\line(0,1){2}}
                    197: \put(25, 3){\line(0,1){2}}
                    198: \put( 3, 1){\vector(0,1){2}}\put(3, 0){\makebox(0,1){\tt cp}}
                    199: \put(25.7, 1){\vector(0,1){2}}\put(25.7, 0){\makebox(0,1){\tt limit}}
                    200: \put(22, 1){\vector(0,1){2}}\put(22, 0){\makebox(0,1){\tt fence}}
                    201: \put(15, 6){\makebox(0,0){4096 characters}}
                    202: \put( 5, 5.5){\line(0,1){1}}
                    203: \put(25, 5.5){\line(0,1){1}}
                    204: \put(11, 6){\vector(-1,0){6}}
                    205: \put(19, 6){\vector( 1,0){6}}
                    206: \end{picture}
                    207: }
                    208: \put(13.15,0){\makebox(0,1){(b) After a \tt read}}
                    209: 
                    210: \put(0,14){
                    211: \begin{picture}(26.3,7)
                    212: \thicklines
                    213: \put( 0, 3){\framebox(26.3, 2)[r]{\tt \char`\\n}}
                    214: \thinlines
                    215: \put( 5, 3){\line(0,1){2}}
                    216: \put(25, 3){\line(0,1){2}}
                    217: \put(10, 1){\vector(0,1){2}}\put(10, 0){\makebox(0,1){\tt cp}}
                    218: \put(25.7, 1){\vector(0,1){2}}\put(25.7, 0){\makebox(0,1){\tt limit}}
                    219: \put(22, 1){\vector(0,1){2}}\put(22, 0){\makebox(0,1){\tt fence}}
                    220: \put(15, 6){\makebox(0,0){4096 characters}}
                    221: \put( 5, 5.5){\line(0,1){1}}
                    222: \put(25, 5.5){\line(0,1){1}}
                    223: \put(11, 6){\vector(-1,0){6}}
                    224: \put(19, 6){\vector( 1,0){6}}
                    225: \end{picture}
                    226: }
                    227: \put(13.15,12){\makebox(0,1){(a) While ${\tt cp} < {\tt fence}$}}
                    228: 
                    229: \end{picture}
                    230: \end{center}
                    231: \caption{Input Buffering.\label{fig:buffering}}
                    232: \end{figure}
                    233: 
                    234: Newlines, denoted by \verb|\n|, cannot occur within C tokens,
                    235: which explains the newline at \verb|*limit| shown in Figure~\ref{fig:buffering}.
                    236: This newline terminates a scan for any token
                    237: so a separate, per-character test for the end of the buffer is unnecessary.
                    238: When a newline is encountered, an input module function is called
                    239: to refill the input buffer,
                    240: if necessary, and to increment the line number.
                    241: 
                    242: ANSI~C stipulates a maximum line length of no less than 509,
                    243: but few compilers insist on a specific limit.
                    244: Tokens, however, can be limited to 32 characters;
                    245: string literals are an exception,
                    246: but they are handled as a special case.
                    247: 
                    248: In general, an input buffer ends with a partial token.
                    249: To insure that an entire token lies between
                    250: \verb|cp| and \verb|limit|, the end of the buffer is moved to the memory
                    251: locations {\em preceding} the buffer whenever
                    252: \verb|cp| passes \verb|fence|. Doing so concatenates
                    253: a partial token with its tail after the next read
                    254: as shown in Figure~\ref{fig:buffering}b.
                    255: Testing if \verb|cp| has passed \verb|fence| is done
                    256: for each token after \verb|cp| is advanced across white space.
                    257: 
                    258: The important consequence of this design is that most
                    259: of the input characters are accessed by \verb|*cp|
                    260: and many are never moved. Only identifiers (excluding keywords) and string
                    261: literals that appear in executable code
                    262: are copied out of the buffer into permanent storage.
                    263: 
                    264: Reference~\cite{waite86}'s algorithm moves partial {\em lines} instead
                    265: of partial tokens and does so after scanning the {\em first} newline in
                    266: the buffer.  But this operation overwrites storage before the buffer
                    267: when a partial line is longer than a fixed maximum.  The algorithm
                    268: above avoids this problem, but at the per-token cost of comparing
                    269: \verb|cp| with \verb|fence|.
                    270: 
                    271: Instead of actually using \verb|cp| as suggested above,
                    272: \verb|cp| is copied to the register variable \verb|rcp|
                    273: upon entry to the lexical analyzer, and \verb|rcp| is used in
                    274: token recognition. \verb|rcp| is assigned to \verb|cp|
                    275: before the lexical analyzer returns.
                    276: Using \verb|rcp| improves performance
                    277: and makes scanning loops compact and fast, e.g., white space is elided by
                    278: \begin{verbatim}
                    279: while (map[*rcp]&BLANK)
                    280:    rcp++;
                    281: \end{verbatim}
                    282: \verb|map[c]| is a mask that classifies character \verb|c| as suggested
                    283: in Reference~\cite{waite86}; e.g.,
                    284: \verb|map[c]&BLANK| is non-zero if \verb|c| is a
                    285: white-space character (but not a newline).
                    286: \verb|lcc| generates four VAX instructions for the body of this loop:
                    287: \begin{verbatim}
                    288:         jbr L142
                    289: L141:   incl r11
                    290: L142:   cvtbl (r11),r5
                    291:         bicl3 $-2,_map[r5],r5
                    292:         jneq L141
                    293: \end{verbatim}
                    294: \verb|rcp| is register \verb|r11|.
                    295: Some optimizing compilers can make similar improvements locally, but not across
                    296: potentially aliased assignments and calls to other, irrelevant functions.
                    297: 
                    298: Keywords are recognized by a hard-coded decision tree, e.g.,
                    299: \begin{verbatim}
                    300: case 'i':
                    301:    if (rcp[0] == 'f'
                    302:    && !(map[rcp[1]]&(DIGIT|LETTER))) {
                    303:       cp = rcp + 1;
                    304:       return IF;
                    305:    }
                    306:    if (rcp[0] == 'n'
                    307:    &&  rcp[1] == 't'
                    308:    && !(map[rcp[2]]&(DIGIT|LETTER))) {
                    309:       cp = rcp + 2;
                    310:       return INT;
                    311:    }
                    312:    goto id;
                    313: \end{verbatim}
                    314: \verb|IF| and \verb|INT| are defined as the token codes for
                    315: the keywords \verb|if| and \verb|int|, respectively, and
                    316: \verb|id| labels the code that scans identifiers.
                    317: This code is generated automatically by a 50-line C program
                    318: and included in the lexical analyzer during compilation.
                    319: 
                    320: The VAX code generated for this fragment follows;
                    321: again, \verb|r11| is \verb|rcp|.
                    322: \begin{verbatim}
                    323: L347:   cmpb (r11),$102
                    324:         jneq L348
                    325:         cvtbl 1(r11),r5
                    326:         bicl3 $-13,_map[r5],r5
                    327:         jneq L348
                    328:         addl3 $1,r11,_cp
                    329:         movl $77,r0
                    330:         ret
                    331: L348:   cmpb (r11),$110
                    332:         jneq L226
                    333:         cmpb 1(r11),$116
                    334:         jneq L226
                    335:         cvtbl 2(r11),r5
                    336:         bicl3 $-13,_map[r5],r5
                    337:         jneq L226
                    338:         addl3 $2,r11,_cp
                    339:         movl $5,r0
                    340:         ret
                    341: \end{verbatim}
                    342: Thus, the keyword \verb|int| is recognized by less than
                    343: a dozen instructions, many less than are executed
                    344: when a table is searched for keywords, even if perfect hashing is used.
                    345: 
                    346: As in other compilers~\cite{aho:sethi:ullman:86},
                    347: strings that must be saved (identifiers and string literals)
                    348: are hashed into a table in which a string appears only once,
                    349: which saves space. For performance, there are variants for installing strings of digits
                    350: and strings of known length.
                    351: After installation, strings are known by their addresses
                    352: and the characters are accessed only for output.
                    353: For example, looking a name up in the symbol table is
                    354: done by hashing on the address of the name; string comparison is unnecessary.
                    355: 
                    356: \subsection{Symbol Tables}
                    357: 
                    358: Fast symbol table manipulation also contributes to
                    359: \verb|lcc|'s speed. It took several versions
                    360: of the symbol table module to arrive at the current one, however.
                    361: 
                    362: Symbols are represented with structures defined by
                    363: \begin{verbatim}
                    364: struct symbol {
                    365:    char *name;         /* symbol name */
                    366:    int scope;          /* scope level */
                    367:    ...
                    368: };
                    369: \end{verbatim}
                    370: The symbol table module uses hash
                    371: tables for symbol tables; the initial version used
                    372: a single table for all scopes, i.e.,
                    373: \begin{verbatim}
                    374: struct entry {
                    375:    struct symbol sym;  /* this symbol */
                    376:    struct entry *link; /* next entry on hash chain */
                    377: };
                    378: struct table {
                    379:    struct entry *buckets[HASHSIZE]; /* hash buckets */
                    380: };
                    381: \end{verbatim}
                    382: Symbols are wrapped in \verb|entry| structures to keep
                    383: the linkage information private to the symbol table module.
                    384: 
                    385: Scope entry required no code.
                    386: Each new symbol was added to the head of its hash chain
                    387: and thereby hid symbols with the same names,
                    388: which appeared further down on the same chains.
                    389: At scope exit, however, entries at the current
                    390: scope level, indicated by the value of \verb|level|,
                    391: were removed from the table \verb|*tp| by the code
                    392: \begin{verbatim}
                    393: for (i = 0; i < HASHSIZE; i++) {
                    394:    struct entry *p = tp->buckets[i];
                    395:    while (p && p->sym.scope == level)
                    396:       p = p->link;
                    397:    tp->buckets[i] = p;
                    398: }
                    399: \end{verbatim}
                    400: Measurements revealed that this code accounted for over
                    401: 5~percent of \verb|lcc|'s execution time on typical input.
                    402: This code scanned the hash buckets even for scopes that
                    403: introduce no new symbols, which are common in C.
                    404: 
                    405: The second version of the symbol table module used
                    406: a separate hash table for each scope level:
                    407: \begin{verbatim}
                    408: struct table {
                    409:    struct table *previous; /* table at lower scope */
                    410:    struct entry *buckets[HASHSIZE]; /* hash buckets */
                    411: };
                    412: \end{verbatim}
                    413: Searching for a symbol took the same number
                    414: of comparisons, but also required a traversal
                    415: of the list of separate tables, e.g.,
                    416: \begin{verbatim}
                    417: struct symbol *lookup(char *name, struct table *tp) {
                    418:    struct entry *p;
                    419:    unsigned h = ((unsigned)name)&(HASHSIZE-1);
                    420: 
                    421:    do
                    422:       for (p = tp->buckets[h]; p; p = p->link)
                    423:          if (name == p->sym.name)
                    424:             return &p->sym;
                    425:    while (tp = tp->previous);
                    426:    return 0;
                    427: }
                    428: \end{verbatim}
                    429: Notice that symbol names are compared by simply
                    430: comparing addresses as explained in the previous section.
                    431: Despite the conventional wisdom about hashing functions~\cite{sedgewick88},
                    432: using a power of two for \verb|HASHSIZE| gave
                    433: better performance; using a prime instead and modulus in place
                    434: of masking slowed \verb|lcc|.
                    435: 
                    436: This variation reduced the scope exit code to
                    437: \begin{verbatim}
                    438: tp = tp->previous
                    439: \end{verbatim}
                    440: for table \verb|*tp|. Unfortunately, scope entry
                    441: then required allocation and initialization of a table:
                    442: \begin{verbatim}
                    443: struct table *new = (struct table *)alloc(sizeof *new);
                    444: new->previous = tp;
                    445: for (i = 0; i < HASHSIZE; i++)
                    446:    new->buckets[i] = 0;
                    447: tp = new;
                    448: \end{verbatim}
                    449: So, the time wasted at scope exit in the first version
                    450: was traded for a similar waste at scope entry in the second version.
                    451: 
                    452: The symbol table module in actual use avoids
                    453: this waste by lazy allocation and initialization
                    454: of tables. Tables include their associated scope level:
                    455: \begin{verbatim}
                    456: struct table {
                    457:    int level;              /* scope level for this table */
                    458:    struct table *previous; /* table at lower scope */
                    459:    struct entry *buckets[HASHSIZE]; /* hash buckets */
                    460: };
                    461: \end{verbatim}
                    462: New tables are allocated and initialized only
                    463: when a symbol is installed:
                    464: \begin{verbatim}
                    465: struct symbol *install(char *name, struct table **tpp) {
                    466:    unsigned h = ((unsigned)name)&(HASHSIZE-1);
                    467:    struct table *tp = *tpp;
                    468:    struct entry *p = (struct entry *)alloc(sizeof *p);
                    469: 
                    470:    if (tp->level < level) {
                    471:       int i;
                    472:       struct table *new = (struct table *)alloc(sizeof *new);
                    473:       new->previous = tp;
                    474:       new->level = level;
                    475:       for (i = 0; i < HASHSIZE; i++)
                    476:          new->buckets[i] = 0;
                    477:       *tpp = tp = new;
                    478:    }
                    479:    p->sym.name = name;
                    480:    p->sym.scope = tp->level;
                    481:    ...
                    482:    p->link = tp->buckets[h];
                    483:    tp->buckets[h] = p;
                    484:    return &p->sym;
                    485: }
                    486: \end{verbatim}
                    487: Since few scopes in C, which are delimited by compound statements,
                    488: declare new symbols, the lazy allocation code above is rarely
                    489: executed and entry to most scopes is nearly free.
                    490: The scope exit code must check before
                    491: discarding a table, but remains simple: 
                    492: \begin{verbatim}
                    493: if (tp->level == level)
                    494:    tp = tp->previous;
                    495: \end{verbatim}
                    496: 
                    497: This design also simplifies access to separate tables.
                    498: For example, the table that holds globals is at the end
                    499: of the list of identifier tables; by making it
                    500: the value of \verb|globals|, symbols can be
                    501: installed into it directly.
                    502: In the initial implementation, a global declared at a nested
                    503: scope had to be inserted in the middle of its hash chain.
                    504: 
                    505: \subsection{Storage Management}
                    506: 
                    507: Allocation and deallocation in early versions of \verb|lcc|
                    508: accounted for a significant portion of the total execution time.
                    509: Replacing the naive use of \verb|malloc| and \verb|free|
                    510: reduced total execution time by about 8--10~percent.
                    511: As detailed in Reference~\cite{hanson90},
                    512: allocation is based on the lifetime of the objects allocated,
                    513: and all objects with the same lifetime are freed at once.
                    514: 
                    515: This approach to storage management simplified \verb|lcc|'s code.
                    516: Initially, each object type had explicit deallocation code, perhaps
                    517: replicated at several points. Some of this code was intricate, e.g.,
                    518: involving complex loops or recursive data structure traversals.
                    519: Allocation incurred an obligation to provide the necessary
                    520: deallocation code, so there was a tendency to use
                    521: algorithms that avoided allocation,
                    522: perhaps at the expense of time, complexity, and flexibility.
                    523: And it was easy to forget deallocation, resulting in storage leaks.
                    524: 
                    525: The current scheme eliminated nearly all explicit deallocation code,
                    526: which simplified the compiler and eliminated storage leaks.
                    527: More importantly, it encouraged the use of simple applicative algorithms,
                    528: e.g., in rewriting trees.
                    529: The replacements cost space, but not time, since
                    530: allocation and deallocation are nearly free.
                    531: Besides contributing to fast compilation, the other visible
                    532: benefit of this approach is that \verb|lcc| imposes few
                    533: arbitrary limits on its input; e.g., it permits
                    534: any number of cases in switch statements,
                    535: any number of parameters and locals, block nesting to any depth,
                    536: expressions of arbitrary complexity, initializations of arbitrary size, etc.
                    537: These quantities are limited only by the memory available.
                    538: 
                    539: 
                    540: \section{Optimization}
                    541: 
                    542: \verb|lcc| is not properly called an ``optimizing'' compiler
                    543: because it does no global optimization, {\em per se}.
                    544: Its front end does, however, perform some simple, target-independent
                    545: transformations that help its back ends generate good local code.
                    546: 
                    547: The front end eliminates local common subexpressions,
                    548: folds constant expressions, and makes numerous simple
                    549: transformations that improve the quality of local code~\cite{hanson83}.
                    550: Many of these improvements are simple tree transformations
                    551: that lead to better addressing code.
                    552: 
                    553: The front end lays out loops so as to reduce the number
                    554: of unconstructive branches~\cite{baskett78}, e.g., the code for
                    555: \begin{flushleft}
                    556: \tt for ($e_1$; $e_2$; $e_3$) $S$
                    557: \end{flushleft}
                    558: has the form
                    559: \begin{flushleft}\tt
                    560: \begin{tabular}{ll}
                    561:        & $e_1$ \\
                    562:        & goto L1 \\
                    563: L2:    & $S$ \\
                    564: L3:    & $e_3$ \\
                    565: L1:    & if ($e_2$) goto L2 \\
                    566: \end{tabular}
                    567: \end{flushleft}
                    568: The \verb|goto L1| is omitted if $e_2$ is initially non-zero.
                    569: In addition, the front end eliminates branch chains and dead branches.
                    570: 
                    571: The selection code for switch statements is generated entirely by the front end.
                    572: It generates a binary search of dense branch tables~\cite{bernstein85},
                    573: where the density is the percentage of non-default branch table entries.
                    574: For example, with the default density of 0.5, a switch statement
                    575: with the case values 1, 2, 6--8, 1001--1004, and 2001--2002 has the
                    576: following VAX selection code.
                    577: Register \verb|r4| holds the value of the switch expression,
                    578: \verb|L3|--\verb|15| label the statements for the case values above,
                    579: and \verb|L1| is the default label.
                    580: \begin{verbatim}
                    581:         cmpl r4,$1001
                    582:         jlss L17
                    583:         cmpl r4,$1004
                    584:         jgtr L16
                    585:         movl _18-4004[r4],r5
                    586:         jmp (r5)
                    587: _18:    .long L8
                    588:         .long L9
                    589:         .long L10
                    590:         .long L11
                    591: L17:    cmpl r4,$1
                    592:         jlss L1
                    593:         cmpl r4,$8
                    594:         jgtr L1
                    595:         movl _21-4[r4],r5
                    596:         jmp (r5)
                    597: _21:    .long L3
                    598:         .long L4
                    599:         .long L1
                    600:         .long L1
                    601:         .long L1
                    602:         .long L5
                    603:         .long L6
                    604:         .long L7
                    605: L16:    cmpl r4,$2001
                    606:         jlss L1
                    607:         cmpl r4,$2004
                    608:         jgtr L1
                    609:         movl _24-8004[r4],r5
                    610:         jmp (r5)
                    611: _24:    .long L12
                    612:         .long L13
                    613:         .long L14
                    614:         .long L15
                    615: \end{verbatim}
                    616: The density can be changed by a command-line option;
                    617: e.g., \verb|-d0| yields a single branch
                    618: table for each switch statement, and \verb|-d1| requires that all
                    619: branch tables be fully populated.
                    620: 
                    621: Finally, the front end simulates register declarations for
                    622: all scalar parameters and locals that are referenced at least
                    623: 3 times and do not have their addresses taken explicitly.
                    624: Locals are announced to the back ends with
                    625: explicitly declared \verb|register| locals followed by
                    626: the remaining locals in the order of decreasing frequency of use.
                    627: Each top-level occurrence of an identifier
                    628: counts as 1 reference. Occurrences in a loop,
                    629: either of the then/else arms of an if statement, or a case
                    630: in a switch statement each count, respectively, as 10, $1/2$, or $1/10$ references.
                    631: These values are adjusted to account for nested control structures.
                    632: The next section describes how these estimated counts
                    633: may be replaced with counts from an actual profile.
                    634: 
                    635: This scheme simplifies register assignment in the back ends,
                    636: and explicit \verb|register| declarations are rarely necessary.
                    637: For example,
                    638: \begin{verbatim}
                    639: strcpy(char *s1, char *s2) { while (*s1++ = *s2++); }
                    640: \end{verbatim}
                    641: yields the VAX code
                    642: \begin{verbatim}
                    643: _strcpy:.word 0x0
                    644:         movl 4(ap),r4
                    645:         movl 8(ap),r5
                    646: L26:    movb (r5)+,(r4)+
                    647:         jneq L26
                    648:         ret
                    649: \end{verbatim}
                    650: 
                    651: 
                    652: \section{Features}
                    653: 
                    654: \verb|lcc| provides a few noteworthy features that help users develop,
                    655: debug, and profile ANSI~C programs.
                    656: For example, an option causes
                    657: \verb|lcc| to print ANSI-style C declarations for all defined globals and functions.
                    658: For instance, the code (adapted from Section~6.2 of Reference~\cite{kernighan:ritchie:88})
                    659: \begin{verbatim}
                    660: typedef struct point { int x,y; } point;
                    661: typedef struct rect { point pt1, pt2; } rect;
                    662: 
                    663: point addpoint(p1, p2) point p1, p2; {
                    664:    p1.x += p2.x;
                    665:    p1.y += p2.y;
                    666:    return p1;
                    667: }
                    668: int ptinrect(p, r) point p; rect r; {
                    669:    return p.x >= r.pt1.x && p.x < r.pt2.x
                    670:       && p.y >= r.pt1.y && p.y < r.pt2.y;
                    671: }
                    672: \end{verbatim}
                    673: generates the declarations
                    674: \begin{verbatim}
                    675: extern point addpoint(point, point);
                    676: extern int ptinrect(point, rect);
                    677: \end{verbatim}
                    678: Editing such output can simplify conversion to ANSI~C.
                    679: 
                    680: Another option causes \verb|lcc| to
                    681: issue warnings for declarations and casts of function types without prototypes.
                    682: These include pointers to functions, which are easy to overlook
                    683: when updating pre-ANSI code. For example, it is likely
                    684: that \verb|char *(_alloc)()| should
                    685: be updated to be \verb|char *(_alloc)(size_t)|.
                    686: 
                    687: \subsection{Debugging}
                    688: 
                    689: \verb|lcc| supports the standard debugger symbol tables on VAXes and Suns.
                    690: It also has two options of its own to assist in program debugging.
                    691: 
                    692: Dereferencing zero pointers is a frequent C programming error.
                    693: On some systems, execution continues until the consequences
                    694: cause a fault somewhere unrelated to the actual point of error.
                    695: To help catch such errors, an option causes \verb|lcc|
                    696: to generate code to test for dereferencing zero pointers. If a zero pointer
                    697: is detected, the offending file name and line number are reported on the standard error, e.g.,
                    698: \begin{verbatim}
                    699: null pointer dereferenced @foo.c:36
                    700: \end{verbatim}
                    701: and the program terminates by calling the standard library function \verb|abort|.
                    702: 
                    703: Some languages provide built-in facilities for tracing function calls
                    704: and returns~\cite{griswold:griswold:90}.
                    705: An option instructs \verb|lcc| to generate calls to \verb|printf| (or a
                    706: user-specified equivalent) just after entry to each function
                    707: and just before each return.
                    708: The entry code prints the arguments and the return code prints the value
                    709: returned. For example, calling the
                    710: functions shown above would elicit messages like
                    711: \begin{verbatim}
                    712: addpoint#2(p1=(point){x=0,y=0},p2=(point){x=10,y=10}) called
                    713: addpoint#2 returned (point){x=10,y=10}
                    714: ...
                    715: ptinrect#1(p=(point){x=-1,y=-1},
                    716:    r=(rect){pt1=(point){x=10,y=10},pt2=(point){x=310,y=310}}) called
                    717: ptinrect#1 returned 0
                    718: \end{verbatim}
                    719: (Long lines have been folded to fit this page.)
                    720: As illustrated by this output,
                    721: the messages show the full details of the arguments, including structure contents.
                    722: The numbers that follow function names, e.g., \verb|#2|,
                    723: are activation numbers and can help locate a specific call and its return.
                    724: 
                    725: These debugging options are implemented entirely in the front end and
                    726: thus are available on all of \verb|lcc|'s targets.
                    727: 
                    728: \subsection{Profiling}
                    729: 
                    730: \verb|lcc| supports \verb|prof|-style (viz.~\cite[\verb|prof| command]{bsd84})
                    731: and \verb|gprof|-style~\cite{graham:kessler:mckusick:83} execution profiling
                    732: on VAXes and Suns.
                    733: These profilers sample the location counter periodically
                    734: to obtain an estimate of the percentage of total execution time
                    735: spent in each function, and they report the number of calls to each function.
                    736: 
                    737: Heeding long-standing advice~\cite{knuth71,sites78}, \verb|lcc| also supports
                    738: frequency-based profiling.
                    739: An option causes \verb|lcc| to emit counters that record the number
                    740: of times each {\em expression} is executed, and the values of these counters are
                    741: written to the file \verb|prof.out| when the program terminates.
                    742: A companion program, \verb|bprint|, reads \verb|prof.out| and prints
                    743: the source code annotated with execution counts, e.g.,
                    744: \begin{verbatim}
                    745: ...
                    746: 4  main()
                    747: 5  <1>{
                    748: ...
                    749: 12    <1>queens(0);
                    750: 13    return <1>0;
                    751: 14 <1>}
                    752: 15 
                    753: 16 queens(c)
                    754: 17 <1965>{
                    755: 18    int r;
                    756: 19 
                    757: 20    for (<1965>r = 0; <15720>r < 8; <15720>r++)
                    758: 21       if (<15720>rows[r] && <5508>up[r-c+7] && <3420>down[r+c]){
                    759: 22          <2056>rows[r] = up[r-c+7] = down[r+c] = 0;
                    760: 23          <2056>x[c] = r;
                    761: 24          if (<2056>c == 7)
                    762: 25             <92>print();
                    763: 26          else
                    764: 27             <1964>queens(c + 1);
                    765: 28          <2056>rows[r] = up[r-c+7] = down[r+c] = 1;
                    766: 29       }
                    767: 30 <1965>}
                    768: ...
                    769: \end{verbatim}
                    770: Execution counts are enclosed in angle brackets.
                    771: The counts on the outermost braces for \verb|queens|
                    772: give the number of calls.
                    773: Line 21 shows the benefit of associating a count
                    774: with each expression instead of each line;
                    775: the counts reveal that \verb|up[r-c+7]| was tested only slightly more than one-third
                    776: of the number of times the if statement was executed.
                    777: Conditional expressions are annotated similarly.
                    778: 
                    779: Users sometimes report an ``off-by-one'' bug when they see that
                    780: \verb|r < 8| in line 20 was executed
                    781: the same number of times as \verb|r++|.
                    782: These counts are a consequence of the way \verb|lcc| lays out for loops
                    783: and eliminates the test before the first iteration, as described above.
                    784: 
                    785: Data in \verb|prof.out| accumulates, so it is possible to execute
                    786: a program repeatedly and then have \verb|bprint| display the
                    787: cumulative frequencies.
                    788: This method is particularly useful for developing
                    789: test data that exercises all parts of a program:
                    790: \verb|<0>| highlights untested code.
                    791: 
                    792: Another option causes \verb|lcc| to read \verb|prof.out|
                    793: and use the counts therein to compute the frequency of use
                    794: of each identifier instead of using the estimates
                    795: described in the previous section. Doing so may reduce the
                    796: number of uses for identifiers that appear in
                    797: loops that rarely executed more than once,
                    798: and increase the number of uses for those that appear
                    799: in then/else arms that are executed most of the time.
                    800: 
                    801: Complex preprocessor macros can obscure \verb|bprint|'s presentation.
                    802: It necessarily uses post-expansion source coordinates
                    803: to annotate pre-expansion source files.
                    804: 
                    805: Profiling code also records the number of calls made
                    806: from each call site, which can be used to reconstruct the dynamic
                    807: call graph. \verb|bprint| prints a line for each edge, e.g.,
                    808: \begin{verbatim}
                    809: 1       queens  from main       in 8q.c:12.8
                    810: 1964    queens  from queens     in 8q.c:27.11
                    811: 92      print   from queens     in 8q.c:25.10
                    812: \end{verbatim}
                    813: This output shows that all but one of the calls to \verb|queens| was from
                    814: the call at character 11 in line 27.
                    815: This kind of data is particularly helpful in identifying
                    816: hot spots that are caused by inappropriate calls to a function
                    817: instead of inefficiencies within the function itself.
                    818: Such data can also help identify functions that might profitably
                    819: be replaced with two functions so that one can handle
                    820: the common case more efficiently~\cite[Sec.~5.3]{bentley82}.
                    821: 
                    822: Expression execution frequency profiling is implemented entirely by the
                    823: front end. The only machine dependency is the name of the ultimate
                    824: termination function in the revised \verb|exit| function that writes
                    825: \verb|prof.out| at program termination.
                    826: 
                    827: The implementation is a machine-independent variation of the method
                    828: described in Reference~\cite{weinberger84}.  The front end generates an
                    829: array of counters for each file and starts each expression with code to
                    830: increment the appropriate counter. In also builds a parallel array that
                    831: holds the source coordinates corresponding to each counter.  At the
                    832: entry point of each function, the front end generates the equivalent of
                    833: \begin{verbatim}
                    834: if (!_yylink.link) {
                    835:    extern struct _bbdata *_bblist;
                    836:    _yylink.link = _bblist;
                    837:    _bblist = &yylink;
                    838: }
                    839: _prologue(&callee);
                    840: \end{verbatim}
                    841: A \verb|_bbdata| structure is generated for each file:
                    842: \begin{verbatim}
                    843: static struct _bbdata {
                    844:    struct _bbdata *link;
                    845:    unsigned npoints;
                    846:    unsigned *counts;
                    847:    unsigned *coords;
                    848:    struct func *funcs;
                    849: } _yylink;
                    850: \end{verbatim}
                    851: The \verb|counts| and \verb|coords| fields point the arrays mentioned above,
                    852: which each have \verb|npoints| entries.
                    853: The entry point code uses the \verb|link| field to
                    854: add each file's \verb|_bbdata| structure
                    855: to the list headed by \verb|_bblist|, which the revised \verb|exit| function walks
                    856: to emit \verb|prof.out|.
                    857: 
                    858: \verb|_prologue| accumulates the dynamic call graph.
                    859: It is passed one of the \verb|func| structures --- one for each function --- that appear 
                    860: on the list emanating from \verb|_yylink.funcs|:
                    861: \begin{verbatim}
                    862: struct func {
                    863:    struct func *link;
                    864:    struct caller {
                    865:       struct caller *link;
                    866:       struct callsite *caller;
                    867:       unsigned count;
                    868:    } *callers;
                    869:    char *name;
                    870:    unsigned coord;
                    871: };
                    872: \end{verbatim}
                    873: The \verb|name| and \verb|coord| fields give the function's name and beginning source
                    874: coordinate, respectively.
                    875: \verb|callers| points to a list of \verb|caller| structures, one for each
                    876: call site. Each \verb|caller| structure records the number of calls
                    877: from the caller's \verb|callsite|:
                    878: \begin{verbatim}
                    879: struct callsite {
                    880:    char *file;
                    881:    char *name;
                    882:    unsigned coord;
                    883: };
                    884: \end{verbatim}
                    885: \verb|caller| structures are allocated at execution time
                    886: and point to \verb|callsite|s, which are generated by the front end
                    887: at compile time. 
                    888: 
                    889: Just before each call, the front end generates an assignment
                    890: of a pointer to a \verb|callsite| structure to the global variable \verb|_caller|.
                    891: \verb|_prologue| uses \verb|_caller| to record an edge in the dynamic call graph.
                    892: If a record of the caller already exists, its count is simply incremented.
                    893: Otherwise, a \verb|caller| structure is allocated and prefixed to
                    894: the callee's list of callers.
                    895: \begin{verbatim}
                    896: _prologue(struct func *callee) {
                    897:    static struct caller callers[4096];
                    898:    static int next;
                    899:    struct caller *p;
                    900: 
                    901:    for (p = callee->callers; p; p = p->link)
                    902:       if (p->caller == _caller) {
                    903:          p->count++;
                    904:          break;
                    905:       }
                    906:    if (!p && next < sizeof callers/sizeof callers[0]) {
                    907:       p = &callers[next++];
                    908:       p->caller = _caller;
                    909:       p->count = 1;
                    910:       p->link = callee->callers;
                    911:       callee->callers = p;
                    912:    }
                    913:    _caller = 0;
                    914: }
                    915: \end{verbatim}
                    916: 
                    917: Profiling can be restricted to only those files of interest.
                    918: The counts printed by \verb|bprint| will be correct,
                    919: but some edges may be omitted from the call graph.
                    920: For example, if \verb|f| calls \verb|g| calls \verb|h|
                    921: and \verb|f| and \verb|h| are compiled with profiling, but \verb|g| is not,
                    922: \verb|bprint| will report that \verb|f| called \verb|h|.
                    923: The total number of calls to each function is correct, however.
                    924: 
                    925: \section{Performance}
                    926: 
                    927: \verb|lcc| emits local code that is comparable
                    928: to that emitted by the generally available alternatives.
                    929: Table~\ref{spec:execution} summarizes the results of compiling
                    930: and executing the C programs in the SPEC benchmarks~\cite{spec89}
                    931: with three compilers on the four machines listed above.
                    932: Configuration details are listed with each machine.
                    933: \verb|cc| and \verb|gcc| denote, respectively,
                    934: the manufacturer's C compiler and the
                    935: GNU~C compiler from the Free Software Foundation.
                    936: The times are elapsed time in seconds and are the lowest
                    937: elapsed times over several runs on lightly loaded machines.
                    938: All reported runs achieved at least 97 percent utilization (i.e., the ratio of times
                    939: $({\it user} + {\it system})/{\it elapsed} \ge 0.97$).
                    940: 
                    941: The entries with \verb|-O| indicate compilation with the
                    942: ``default'' optimization, which often includes some global optimizations.
                    943: \verb|lcc| performs no global optimizations.
                    944: The \verb|gcc| and \verb|gcc -O| figures for {\tt gcc1.35} on the MIPS
                    945: are missing because this benchmark did not execute correctly
                    946: when compiled with \verb|gcc|. 
                    947: 
                    948: \newcommand{\0}{\hspace{0.5em}}
                    949: \begin{table}
                    950: \begin{center}
                    951: \begin{tabular}{lcccc}
                    952:                &\multicolumn{4}{c}{\it benchmark}\\
                    953: \it compiler   & 1. \tt gcc1.35& 8. \tt espresso& 22. \tt li   & 23. \tt eqntott \\ \hline
                    954: %\makebox[0pt][l]{VAX: VAX 8650 w/36MB running 4.3BSD UNIX} \\% % megaron
                    955: %                                                               v1.3
                    956: %\tt lcc       & 345           & 588           & 1315          & 296 \\
                    957: %\tt cc                & 350           & 504           & 1471          & 334 \\
                    958: %\tt gcc       & 304           & 501           & 1316          & 327 \\ %v1.36
                    959: %\tt cc -O     & 320           & 525           & 1486          & 325 \\
                    960: %\tt gcc -O    & 283           & 466           & 1272          & 201 \\[2ex]
                    961: \\[-.5ex]
                    962: \makebox[0pt][l]{VAX: \small MicroVAX II w/16MB running Ultrix 3.1} \\% v1.3; ffserver
                    963: \tt lcc                & 1734          & 2708          & 7015          & 3532 \\
                    964: \tt cc         & 1824          & 2782          & 7765          & 3569 \\
                    965: \tt gcc                & 1439          & 2757          & 7353          & 3263 \\ %v1.36
                    966: \tt cc -O      & 1661          & 2653          & 7086          & 3757 \\
                    967: \tt gcc -O     & 1274          & 2291          & 6397          & 1131 \\[2ex]
                    968: 
                    969: \makebox[0pt][l]{68020: \small Sun 3/60 w/24MB running SunOS 4.0.3} \\% v1.3; hemlock
                    970: \tt lcc                & 544           & 1070          & 2591          & 567 \\
                    971: \tt cc         & 514           & 1005          & 3308          & 619 \\
                    972: \tt gcc                & 426           & 1048          & 2498          & 591 \\ %v1.35
                    973: \tt cc -O      & 428           &\0882          & 2237          & 571 \\
                    974: \tt gcc -O     & 337           &\0834          & 1951          & 326 \\[2ex]
                    975: 
                    976: \makebox[0pt][l]{MIPS: \small IRIS 4D/220GTX w/32MB running IRIX 3.3.1} \\% v1.3; oyoy
                    977: \tt lcc                & 116           & 150           & 352           & 111 \\
                    978: \tt cc         & 107           & 153           & 338           & 100 \\
                    979: \tt gcc                &               & 188           & 502           & 132 \\ %v1.36
                    980: \tt cc -O      &\092           & 130           & 299           &\070  \\
                    981: \tt gcc -O     &               & 145           & 411           & 112 \\[2ex]
                    982: 
                    983: %\makebox[0pt][l]{SPARC:\hspace{1em}}% phoenix: Sun 4/490 w/128MB, SunOS 4.1
                    984: %\tt lcc       & 106           & 218           & 436           & 129 \\
                    985: %\tt cc                &\099           & 189           & 589           & 133 \\
                    986: %\tt gcc       &\094           & 205           & 589           & 131 \\ %v1.36
                    987: %\tt cc -O     &\078           & 154           & 351           &\089  \\
                    988: %\tt gcc -O    &\064           & 153           & 384           &\092  \\
                    989: %
                    990: \makebox[0pt][l]{SPARC: \small Sun 4/260 w/32MB running SunOS 4.0.3}\\% v1.3; python
                    991: \tt lcc                & 196           & 370           &\0790          & 209 \\
                    992: \tt cc         & 203           & 381           & 1094          & 275 \\
                    993: \tt gcc                & 186           & 411           & 1139          & 256 \\ %v1.34
                    994: \tt cc -O      & 150           & 296           &\0707          & 183 \\
                    995: \tt gcc -O     & 127           & 309           &\0788          & 179 \\
                    996: \end{tabular}
                    997: \end{center}
                    998: \caption{Execution Time for C SPEC Benchmarks in Seconds.\label{spec:execution}}
                    999: \end{table}
                   1000: 
                   1001: \verb|lcc| is faster than many (but not all~\cite{thompson90}) other C compilers.
                   1002: Table~\ref{spec:compilation} parallels Table~\ref{spec:execution},
                   1003: but shows compilation time instead of execution time.
                   1004: Except for the MIPS, the times are for running only the compiler proper;
                   1005: preprocessing, assembly, and linking time are not included.
                   1006: Two times are given for the MIPS because the manufacturer's \verb|cc| front end
                   1007: consists of two programs; the first translates C to ``u-code'' and the
                   1008: second generates object code. Generating assembly language
                   1009: costs {\em more} than generating object code, so Table~\ref{spec:compilation} gives
                   1010: both times for all compilers.
                   1011: The last row in Table~\ref{spec:compilation} lists the number of non-blank
                   1012: lines and the total number of lines in each benchmark {\em after} preprocessing.
                   1013: 
                   1014: \begin{table}
                   1015: \begin{center}
                   1016: \begin{tabular}{lcccc}
                   1017:                &\multicolumn{4}{c}{\it benchmark}\\
                   1018: \it compiler   & 1. \tt gcc1.35& 8. \tt espresso& 22. \tt li   & 23. \tt eqntott \\ \hline
                   1019: %\makebox[0pt][r]{VAX:\hspace{1em}}% megaron
                   1020: %\tt lcc       & 189           &\059           & 31            &\09 \\
                   1021: %\tt cc                & 405           & 114           & 40            & 26 \\
                   1022: %\tt gcc       & 505           & 222           & 74            & 45
                   1023: \\[-.5ex]
                   1024: \makebox[0pt][l]{VAX:}\\% v1.3; ffserver
                   1025: \tt lcc                &\0792          & 237           &\069           & 36 \\
                   1026: \tt cc         & 1878          & 576           & 174           & 79 \\
                   1027: \tt gcc                & 1910          & 637           & 192           & 86 \\[2ex]
                   1028: 
                   1029: \makebox[0pt][l]{68020:}\\% v1.3; hemlock
                   1030: \tt lcc                & 302           &\090           & 28            & 15 \\
                   1031: \tt cc         & 507           & 168           & 52            & 29 \\
                   1032: \tt gcc                & 599           & 196           & 56            & 27 \\[2ex]
                   1033: 
                   1034: \makebox[0pt][l]{MIPS:}\\% v1.3; oyoy
                   1035: \tt lcc                &\097 195       &\035 \063      & 10 24         &\06 16\\
                   1036: \tt cc         & 318 177       & 104 \068      & 40 26         & 24 19 \\
                   1037: \tt gcc                & 320 391       &\088  118      & 28 42         & 13 24 \\[2ex]
                   1038: 
                   1039: \makebox[0pt][l]{SPARC:}\\% v1.3; python
                   1040: \tt lcc                & 103           &\038           & 12            & \08 \\
                   1041: \tt cc         & 175           &\060           & 18            &  11 \\
                   1042: \tt gcc                & 313           & 100           & 31            &  16 \\[2ex]
                   1043: 
                   1044: line counts    & 79102/250496  & 25717/58516   & 7070/22494    & 2680/6569 \\
                   1045: \end{tabular}
                   1046: \end{center}
                   1047: \caption{Compilation Time for C SPEC Benchmarks in Seconds.\label{spec:compilation}}
                   1048: \end{table}
                   1049: 
                   1050: \verb|lcc| is smaller than other compilers.
                   1051: Table~\ref{sizes} lists the sizes of the three compilers in kilobytes.
                   1052: Each entry is the sum of sizes of the program and data segments
                   1053: for the indicated compiler as reported by the UNIX \verb|size| command.
                   1054: 
                   1055: \begin{table}
                   1056: \begin{center}
                   1057: \begin{tabular}{lcccc}
                   1058: \it compiler   & VAX           & 68020         & MIPS          & SPARC \\ \hline
                   1059: \tt lcc                & 181           & 244           & 280           & 276 \\ %v1.3
                   1060: \tt cc         & 256           & 306           & 616           & 402 \\
                   1061: \tt gcc                & 378           & 507           & 777           & 689 \\
                   1062: \end{tabular}
                   1063: \end{center}
                   1064: \caption{Sizes of Compiler Executables in Kilobytes.\label{sizes}}
                   1065: \end{table}
                   1066: 
                   1067: %\bibliography{refs,lib}
                   1068: \begin{thebibliography}{10}
                   1069: 
                   1070: \bibitem{aho:sethi:ullman:86}
                   1071: A.~V. Aho, R.~Sethi, and J.~D. Ullman.
                   1072: \newblock {\em Compilers: Principles, Techniques, and Tools}.
                   1073: \newblock Addison Wesley, Reading, MA, 1986.
                   1074: 
                   1075: \bibitem{ansi:Cstandard}
                   1076: American National Standard Institute, Inc., New York.
                   1077: \newblock {\em American National Standards for Information Systems, Programming
                   1078:   Language C {ANSI} X3.159--1989}, 1990.
                   1079: 
                   1080: \bibitem{baskett78}
                   1081: F.~Baskett.
                   1082: \newblock The best simple code generation technique for while, for, and do
                   1083:   loops.
                   1084: \newblock {\em SIGPLAN Notices}, 13(4):31--32, Apr. 1978.
                   1085: 
                   1086: \bibitem{bentley82}
                   1087: J.~L. Bentley.
                   1088: \newblock {\em Writing Efficient Programs}.
                   1089: \newblock Prentice Hall, Englewood Cliffs, NJ, 1982.
                   1090: 
                   1091: \bibitem{bernstein85}
                   1092: R.~L. Bernstein.
                   1093: \newblock Producing good code for the case statement.
                   1094: \newblock {\em Software---Practice \& Experience}, 15(10):1021--1024, Oct.
                   1095:   1985.
                   1096: 
                   1097: \bibitem{bsd84}
                   1098: Computer Science Division, Department of Electrical Engineering and Computer
                   1099:   Science, University of California, Berkeley, CA.
                   1100: \newblock {\em UNIX User's Manual, Reference Guide}, virtual {VAX}-11 version
                   1101:   edition, Mar. 1984.
                   1102: 
                   1103: \bibitem{fraser:sigplan89}
                   1104: C.~W. Fraser.
                   1105: \newblock A language for writing code generators.
                   1106: \newblock {\em Proceedings of the SIGPLAN'89 Conference on Programming Language
                   1107:   Design and Implementation, SIGPLAN Notices}, 24(7):238--245, July 1989.
                   1108: 
                   1109: \bibitem{fraser:hanson:91a}
                   1110: C.~W. Fraser and D.~R. Hanson.
                   1111: \newblock A code generation interface for {ANSI C}.
                   1112: \newblock {\em Software---Practice \& Experience}, 21(9):963--988, Sept. 1991.
                   1113: 
                   1114: \bibitem{fraser:hanson:92}
                   1115: C.~W. Fraser and D.~R. Hanson.
                   1116: \newblock Simple register spilling in a retargetable compiler.
                   1117: \newblock {\em Software---Practice \& Experience}, 22(1):85--99, Jan. 1992.
                   1118: 
                   1119: \bibitem{graham:kessler:mckusick:83}
                   1120: S.~L. Graham, P.~B. Kessler, and M.~K. McKusick.
                   1121: \newblock An execution profiler for modular programs.
                   1122: \newblock {\em Software---Practice \& Experience}, 13(8):671--685, Aug. 1983.
                   1123: 
                   1124: \bibitem{griswold:griswold:90}
                   1125: R.~E. Griswold and M.~T. Griswold.
                   1126: \newblock {\em The Icon Programming Language}.
                   1127: \newblock Prentice Hall, Englewood Cliffs, NJ, second edition, 1990.
                   1128: 
                   1129: \bibitem{hanson83}
                   1130: D.~R. Hanson.
                   1131: \newblock Simple code optimizations.
                   1132: \newblock {\em Software---Practice \& Experience}, 13(8):745--763, Aug. 1983.
                   1133: 
                   1134: \bibitem{hanson90}
                   1135: D.~R. Hanson.
                   1136: \newblock Fast allocation and deallocation of memory based on object lifetimes.
                   1137: \newblock {\em Software---Practice \& Experience}, 20(1):5--12, Jan. 1990.
                   1138: 
                   1139: \bibitem{kernighan:ritchie:88}
                   1140: B.~W. Kernighan and D.~M. Ritchie.
                   1141: \newblock {\em The C Programming Language}.
                   1142: \newblock Prentice Hall, Englewood Cliffs, NJ, second edition, 1988.
                   1143: 
                   1144: \bibitem{knuth71}
                   1145: D.~E. Knuth.
                   1146: \newblock An empirical study of {FORTRAN} programs.
                   1147: \newblock {\em Software---Practice \& Experience}, 1(2):105--133, Apr. 1971.
                   1148: 
                   1149: \bibitem{sedgewick88}
                   1150: R.~Sedgewick.
                   1151: \newblock {\em Algorithms}.
                   1152: \newblock Addison Wesley, Reading, MA, 1988.
                   1153: 
                   1154: \bibitem{sites78}
                   1155: R.~L. Sites.
                   1156: \newblock Programming tools: Statement counts and procedure timings.
                   1157: \newblock {\em SIGPLAN Notices}, 13(12):98--101, Dec. 1978.
                   1158: 
                   1159: \bibitem{spec89}
                   1160: Standards Performance Evaluation Corp.
                   1161: \newblock {\em {SPEC} Benchmark Suite Release 1.0}, Oct. 1989.
                   1162: 
                   1163: \bibitem{thompson90}
                   1164: K.~Thompson.
                   1165: \newblock A new {C} compiler.
                   1166: \newblock In {\em Proceedings of the Summer 1990 UKUUG Conference}, pages
                   1167:   41--51, London, July 1990.
                   1168: 
                   1169: \bibitem{waite86}
                   1170: W.~M. Waite.
                   1171: \newblock The cost of lexical analysis.
                   1172: \newblock {\em Software---Practice \& Experience}, 16(5):473--488, May 1986.
                   1173: 
                   1174: \bibitem{weinberger84}
                   1175: P.~J. Weinberger.
                   1176: \newblock Cheap dynamic instruction counting.
                   1177: \newblock {\em Bell System Technical Journal}, 63(8):1815--1826, Oct. 1984.
                   1178: 
                   1179: \end{thebibliography}
                   1180: 
                   1181: \end{document}

unix.superglobalmegacorp.com

This archive runs on limited infrastructure. Preserving old code on modern bandwidth. Automated agents are requested to crawl responsibly.