|
|
1.1 ! root 1: \documentstyle{article} ! 2: ! 3: \begin{titlepage} ! 4: \normalsize ! 5: \vspace*{.7in} ! 6: \begin{center} ! 7: \begin{tabular}{c} ! 8: \bf\Large A Retargetable Compiler for ANSI C \\[.5in] ! 9: ! 10: Christopher W. Fraser \\ ! 11: \em AT\&T Bell Laboratories, 600 Mountain Avenue, \\ ! 12: \em Murray Hill, NJ 07974 \\[1ex] ! 13: and \\[1ex] ! 14: David R. Hanson \\ ! 15: \em Department of Computer Science, Princeton University, \\ ! 16: \em Princeton, NJ 08544 \\[.7in] ! 17: ! 18: Research Report CS-TR-303-91 \\[1ex] ! 19: February 1991 \\[.5in] ! 20: ! 21: \end{tabular} ! 22: \end{center} ! 23: ! 24: \begin{abstract} ! 25: \normalsize ! 26: \verb|lcc| is a new retargetable compiler for ANSI C. ! 27: Versions for the VAX, Motorola 68020, SPARC, and MIPS ! 28: are in production use at Princeton University ! 29: and at AT\&T Bell Laboratories. ! 30: With a few exceptions, ! 31: little about \verb|lcc| is unusual --- it integrates ! 32: several well engineered, existing techniques --- ! 33: but it is smaller and faster than most other C compilers, ! 34: and it generates code of comparable quality. ! 35: \verb|lcc|'s target-independent front end performs a ! 36: few simple, but effective, optimizations that contribute ! 37: to good code; examples include simulating register declarations ! 38: and partitioning switch statement cases into dense tables. ! 39: It also implements target-independent function tracing ! 40: and expression-level profiling. ! 41: \end{abstract} ! 42: ! 43: \end{titlepage} ! 44: ! 45: \begin{document} ! 46: %\bibliographystyle{abbrv} ! 47: ! 48: \setcounter{secnumdepth}{0} ! 49: ! 50: \section{Introduction} ! 51: ! 52: \verb|lcc| is a new retargetable compiler for ANSI C~\cite{ansi:Cstandard}. ! 53: It has been ported to the VAX, Motorola 68020, SPARC, and MIPS R3000, ! 54: and it is in production use at Princeton University ! 55: and at AT\&T Bell Laboratories. ! 56: When used with a compliant preprocessor and library, ! 57: \verb|lcc| passes the conformance section of ! 58: Version 2.00 of the Plum-Hall Validation Suite for ANSI~C.% ! 59: \footnote{The {\tt lcc} front end and a sample code generator are available ! 60: for anonymous {\tt ftp} from {\tt princeton.edu}. ! 61: The file {\tt README} in the directory {\tt pub/lcc} gives details. ! 62: It also describes the current availability of {\tt lcc}'s ! 63: production code generators.} ! 64: ! 65: Other reports describe \verb|lcc|'s storage manager~\cite{hanson90}, ! 66: intermediate language~\cite{fraser:hanson:91a}, code ! 67: generator~\cite{fraser:sigplan89}, and register ! 68: manager~\cite{fraser:hanson:92}. This report surveys the ! 69: remaining features of \verb|lcc| that may interest some readers. Chief ! 70: among these are its performance, some aspects of its design that ! 71: support this performance, and some features for debugging and profiling ! 72: user code. ! 73: ! 74: \section{Design} ! 75: ! 76: With a few exceptions, \verb|lcc| uses ! 77: well established compiler techniques. ! 78: The front end performs lexical, syntactic, and semantic analysis, and ! 79: some machine-independent optimizations, which are described below. ! 80: Both the lexical analyzer and the recursive-descent parser are hand-written. ! 81: Theoretically, this approach complicates both future changes and fixing errors, ! 82: but accommodating change is less important for a standardized language like ANSI~C, ! 83: and there have been few lexical or syntactic errors. ! 84: Indeed, less than 15~percent of \verb|lcc|'s code concerns parsing, ! 85: and the error rate in that code is negligible. ! 86: Despite its theoretical prominence, ! 87: parsing is a relatively minor component in \verb|lcc| as in other compilers; ! 88: semantic analysis, optimization, and code generation are the major components ! 89: and account for most of the code and most of the errors. ! 90: ! 91: The target-independent front end and a target-dependent back end are ! 92: packaged as single program, tightly coupled by a ! 93: compact, efficient interface. ! 94: The interface consists of a few shared data structures, ! 95: 17 functions, and a 36-operator dag language. ! 96: The functions emit function prologues, ! 97: define globals, emit data, etc., and most are simple. ! 98: The dag language encodes the executable code from a source program; it ! 99: corresponds to the ``intermediate language'' used in other compilers, but ! 100: it is smaller than typical intermediate languages. ! 101: Reference~\cite{fraser:hanson:91a} describes the interface. ! 102: ! 103: Code generators are generated automatically from compact, rule-based ! 104: specifications~\cite{fraser:sigplan89}. Some rules rewrite ! 105: intermediate code as naive assembly code. Others peephole-optimize the ! 106: result. They are compiled into a monolithic hard-coded program that ! 107: accepts dags annotated with intermediate code, and generates, ! 108: optimizes, and emits code for the target machine. Hard-coding ! 109: contributes significantly to \verb|lcc|'s speed. ! 110: ! 111: Table~\ref{linecounts} shows the number of lines ! 112: in each of \verb|lcc|'s components. The notation $h+c$ indicates ! 113: $h$ lines of definitions in ``header'' files and $c$ lines of C code. ! 114: The ``back-end support'' is back-end code that ! 115: shared by four back ends, e.g., initialization and ! 116: most of the register manager. ! 117: ! 118: \begin{table} ! 119: \begin{center} % for v1.3 ! 120: \newlength{\W}\settowidth{\W}{symbol table emitters\ } ! 121: \begin{tabular}{l|rcc} ! 122: & & & \it generated \\ ! 123: \hfil\it component & \omit\hfil\it code\hfil& \it rules & \it code generator \\ \hline ! 124: front end & 968+7847 \\[2ex] ! 125: back-end support & 114+741 \\ ! 126: VAX & 35+170 & 178 & 5782 \\ ! 127: MIPS & 40+378 & 104 & 2966 \\ ! 128: 68020 & 42+190 & 145 & 8301 \\ ! 129: SPARC & 40+290 & 128 & 3888 \\[2ex] ! 130: \parbox{\W}{VAX+68020+SPARC\\[-2pt] ! 131: symbol table emitters} & 584 \\[2ex] ! 132: \parbox{\W}{naive VAX\\[-2pt] ! 133: code generator} & 67+578 \\[2ex] ! 134: rule compiler & 1285 \\ ! 135: \end{tabular} ! 136: \end{center} ! 137: \caption{Number of Lines in {\tt lcc} Components.\label{linecounts}} ! 138: \end{table} ! 139: ! 140: ! 141: Target-specific files include a ! 142: configuration header file, which defines parameters like ! 143: the widths and alignments of the basic datatypes, ! 144: target-specific interface functions, e.g., those that emit function prologues, ! 145: and code generation rules, from which the code generators are generated ! 146: by the rule compiler, which is written in Icon~\cite{griswold:griswold:90}. ! 147: Retargeting \verb|lcc| requires involves writing these ! 148: three back-end components, which vary from 377 to 522 lines in existing back ends. ! 149: In practice, new back ends are implemented ! 150: by writing new rules and editing copies ! 151: of an existing configuration and set of interface functions. ! 152: ! 153: All of \verb|lcc|'s production back ends use the technology ! 154: summarized above and detailed in Reference~\cite{fraser:sigplan89}. ! 155: The interface between the front and back end does not depend ! 156: on this technology; other back ends that conform to the interface ! 157: specification can be used. For example, Reference~\cite{fraser:hanson:91a} ! 158: details a hand-written code generator that emits naive VAX code. ! 159: ! 160: While \verb|lcc| uses well established techniques, it uses some of their ! 161: more recent incarnations, each of which contributes to \verb|lcc|'s efficiency ! 162: as described below. ! 163: ! 164: \subsection{Lexical Analysis} ! 165: ! 166: The design of the input module and of ! 167: the lexical analyzer and judicious code tuning ! 168: of the lexical analyzer contribute to ! 169: \verb|lcc|'s speed. ! 170: ! 171: Input and lexical analysis use ! 172: variations of the design described in Reference~\cite{waite86}. ! 173: Since the lexical analyzer is the only module that ! 174: inspects every input character, the design ! 175: avoids extraneous per-character processing and ! 176: minimizes character movement by scanning tokens directly out ! 177: of a large input buffer. ! 178: ! 179: Input is read directly from the operating system ! 180: into a 4096-character buffer as depicted ! 181: in Figure~\ref{fig:buffering}a, and \verb|cp| and \verb|limit| ! 182: delimit the unscanned portion of the buffer. ! 183: The next token is scanned by advancing \verb|cp| across ! 184: white space and switching on the first character of ! 185: the token, \verb|*cp|. \verb|cp| is advanced as the token is recognized. ! 186: ! 187: \begin{figure} ! 188: \begin{center} ! 189: \setlength{\unitlength}{10pt} ! 190: \begin{picture}(26.3,21) ! 191: \put(0,2){ ! 192: \begin{picture}(26.3,7) ! 193: \thicklines ! 194: \put( 0, 3){\framebox(26.3, 2)[r]{\tt \char`\\n}} ! 195: \thinlines ! 196: \put( 5, 3){\line(0,1){2}} ! 197: \put(25, 3){\line(0,1){2}} ! 198: \put( 3, 1){\vector(0,1){2}}\put(3, 0){\makebox(0,1){\tt cp}} ! 199: \put(25.7, 1){\vector(0,1){2}}\put(25.7, 0){\makebox(0,1){\tt limit}} ! 200: \put(22, 1){\vector(0,1){2}}\put(22, 0){\makebox(0,1){\tt fence}} ! 201: \put(15, 6){\makebox(0,0){4096 characters}} ! 202: \put( 5, 5.5){\line(0,1){1}} ! 203: \put(25, 5.5){\line(0,1){1}} ! 204: \put(11, 6){\vector(-1,0){6}} ! 205: \put(19, 6){\vector( 1,0){6}} ! 206: \end{picture} ! 207: } ! 208: \put(13.15,0){\makebox(0,1){(b) After a \tt read}} ! 209: ! 210: \put(0,14){ ! 211: \begin{picture}(26.3,7) ! 212: \thicklines ! 213: \put( 0, 3){\framebox(26.3, 2)[r]{\tt \char`\\n}} ! 214: \thinlines ! 215: \put( 5, 3){\line(0,1){2}} ! 216: \put(25, 3){\line(0,1){2}} ! 217: \put(10, 1){\vector(0,1){2}}\put(10, 0){\makebox(0,1){\tt cp}} ! 218: \put(25.7, 1){\vector(0,1){2}}\put(25.7, 0){\makebox(0,1){\tt limit}} ! 219: \put(22, 1){\vector(0,1){2}}\put(22, 0){\makebox(0,1){\tt fence}} ! 220: \put(15, 6){\makebox(0,0){4096 characters}} ! 221: \put( 5, 5.5){\line(0,1){1}} ! 222: \put(25, 5.5){\line(0,1){1}} ! 223: \put(11, 6){\vector(-1,0){6}} ! 224: \put(19, 6){\vector( 1,0){6}} ! 225: \end{picture} ! 226: } ! 227: \put(13.15,12){\makebox(0,1){(a) While ${\tt cp} < {\tt fence}$}} ! 228: ! 229: \end{picture} ! 230: \end{center} ! 231: \caption{Input Buffering.\label{fig:buffering}} ! 232: \end{figure} ! 233: ! 234: Newlines, denoted by \verb|\n|, cannot occur within C tokens, ! 235: which explains the newline at \verb|*limit| shown in Figure~\ref{fig:buffering}. ! 236: This newline terminates a scan for any token ! 237: so a separate, per-character test for the end of the buffer is unnecessary. ! 238: When a newline is encountered, an input module function is called ! 239: to refill the input buffer, ! 240: if necessary, and to increment the line number. ! 241: ! 242: ANSI~C stipulates a maximum line length of no less than 509, ! 243: but few compilers insist on a specific limit. ! 244: Tokens, however, can be limited to 32 characters; ! 245: string literals are an exception, ! 246: but they are handled as a special case. ! 247: ! 248: In general, an input buffer ends with a partial token. ! 249: To insure that an entire token lies between ! 250: \verb|cp| and \verb|limit|, the end of the buffer is moved to the memory ! 251: locations {\em preceding} the buffer whenever ! 252: \verb|cp| passes \verb|fence|. Doing so concatenates ! 253: a partial token with its tail after the next read ! 254: as shown in Figure~\ref{fig:buffering}b. ! 255: Testing if \verb|cp| has passed \verb|fence| is done ! 256: for each token after \verb|cp| is advanced across white space. ! 257: ! 258: The important consequence of this design is that most ! 259: of the input characters are accessed by \verb|*cp| ! 260: and many are never moved. Only identifiers (excluding keywords) and string ! 261: literals that appear in executable code ! 262: are copied out of the buffer into permanent storage. ! 263: ! 264: Reference~\cite{waite86}'s algorithm moves partial {\em lines} instead ! 265: of partial tokens and does so after scanning the {\em first} newline in ! 266: the buffer. But this operation overwrites storage before the buffer ! 267: when a partial line is longer than a fixed maximum. The algorithm ! 268: above avoids this problem, but at the per-token cost of comparing ! 269: \verb|cp| with \verb|fence|. ! 270: ! 271: Instead of actually using \verb|cp| as suggested above, ! 272: \verb|cp| is copied to the register variable \verb|rcp| ! 273: upon entry to the lexical analyzer, and \verb|rcp| is used in ! 274: token recognition. \verb|rcp| is assigned to \verb|cp| ! 275: before the lexical analyzer returns. ! 276: Using \verb|rcp| improves performance ! 277: and makes scanning loops compact and fast, e.g., white space is elided by ! 278: \begin{verbatim} ! 279: while (map[*rcp]&BLANK) ! 280: rcp++; ! 281: \end{verbatim} ! 282: \verb|map[c]| is a mask that classifies character \verb|c| as suggested ! 283: in Reference~\cite{waite86}; e.g., ! 284: \verb|map[c]&BLANK| is non-zero if \verb|c| is a ! 285: white-space character (but not a newline). ! 286: \verb|lcc| generates four VAX instructions for the body of this loop: ! 287: \begin{verbatim} ! 288: jbr L142 ! 289: L141: incl r11 ! 290: L142: cvtbl (r11),r5 ! 291: bicl3 $-2,_map[r5],r5 ! 292: jneq L141 ! 293: \end{verbatim} ! 294: \verb|rcp| is register \verb|r11|. ! 295: Some optimizing compilers can make similar improvements locally, but not across ! 296: potentially aliased assignments and calls to other, irrelevant functions. ! 297: ! 298: Keywords are recognized by a hard-coded decision tree, e.g., ! 299: \begin{verbatim} ! 300: case 'i': ! 301: if (rcp[0] == 'f' ! 302: && !(map[rcp[1]]&(DIGIT|LETTER))) { ! 303: cp = rcp + 1; ! 304: return IF; ! 305: } ! 306: if (rcp[0] == 'n' ! 307: && rcp[1] == 't' ! 308: && !(map[rcp[2]]&(DIGIT|LETTER))) { ! 309: cp = rcp + 2; ! 310: return INT; ! 311: } ! 312: goto id; ! 313: \end{verbatim} ! 314: \verb|IF| and \verb|INT| are defined as the token codes for ! 315: the keywords \verb|if| and \verb|int|, respectively, and ! 316: \verb|id| labels the code that scans identifiers. ! 317: This code is generated automatically by a 50-line C program ! 318: and included in the lexical analyzer during compilation. ! 319: ! 320: The VAX code generated for this fragment follows; ! 321: again, \verb|r11| is \verb|rcp|. ! 322: \begin{verbatim} ! 323: L347: cmpb (r11),$102 ! 324: jneq L348 ! 325: cvtbl 1(r11),r5 ! 326: bicl3 $-13,_map[r5],r5 ! 327: jneq L348 ! 328: addl3 $1,r11,_cp ! 329: movl $77,r0 ! 330: ret ! 331: L348: cmpb (r11),$110 ! 332: jneq L226 ! 333: cmpb 1(r11),$116 ! 334: jneq L226 ! 335: cvtbl 2(r11),r5 ! 336: bicl3 $-13,_map[r5],r5 ! 337: jneq L226 ! 338: addl3 $2,r11,_cp ! 339: movl $5,r0 ! 340: ret ! 341: \end{verbatim} ! 342: Thus, the keyword \verb|int| is recognized by less than ! 343: a dozen instructions, many less than are executed ! 344: when a table is searched for keywords, even if perfect hashing is used. ! 345: ! 346: As in other compilers~\cite{aho:sethi:ullman:86}, ! 347: strings that must be saved (identifiers and string literals) ! 348: are hashed into a table in which a string appears only once, ! 349: which saves space. For performance, there are variants for installing strings of digits ! 350: and strings of known length. ! 351: After installation, strings are known by their addresses ! 352: and the characters are accessed only for output. ! 353: For example, looking a name up in the symbol table is ! 354: done by hashing on the address of the name; string comparison is unnecessary. ! 355: ! 356: \subsection{Symbol Tables} ! 357: ! 358: Fast symbol table manipulation also contributes to ! 359: \verb|lcc|'s speed. It took several versions ! 360: of the symbol table module to arrive at the current one, however. ! 361: ! 362: Symbols are represented with structures defined by ! 363: \begin{verbatim} ! 364: struct symbol { ! 365: char *name; /* symbol name */ ! 366: int scope; /* scope level */ ! 367: ... ! 368: }; ! 369: \end{verbatim} ! 370: The symbol table module uses hash ! 371: tables for symbol tables; the initial version used ! 372: a single table for all scopes, i.e., ! 373: \begin{verbatim} ! 374: struct entry { ! 375: struct symbol sym; /* this symbol */ ! 376: struct entry *link; /* next entry on hash chain */ ! 377: }; ! 378: struct table { ! 379: struct entry *buckets[HASHSIZE]; /* hash buckets */ ! 380: }; ! 381: \end{verbatim} ! 382: Symbols are wrapped in \verb|entry| structures to keep ! 383: the linkage information private to the symbol table module. ! 384: ! 385: Scope entry required no code. ! 386: Each new symbol was added to the head of its hash chain ! 387: and thereby hid symbols with the same names, ! 388: which appeared further down on the same chains. ! 389: At scope exit, however, entries at the current ! 390: scope level, indicated by the value of \verb|level|, ! 391: were removed from the table \verb|*tp| by the code ! 392: \begin{verbatim} ! 393: for (i = 0; i < HASHSIZE; i++) { ! 394: struct entry *p = tp->buckets[i]; ! 395: while (p && p->sym.scope == level) ! 396: p = p->link; ! 397: tp->buckets[i] = p; ! 398: } ! 399: \end{verbatim} ! 400: Measurements revealed that this code accounted for over ! 401: 5~percent of \verb|lcc|'s execution time on typical input. ! 402: This code scanned the hash buckets even for scopes that ! 403: introduce no new symbols, which are common in C. ! 404: ! 405: The second version of the symbol table module used ! 406: a separate hash table for each scope level: ! 407: \begin{verbatim} ! 408: struct table { ! 409: struct table *previous; /* table at lower scope */ ! 410: struct entry *buckets[HASHSIZE]; /* hash buckets */ ! 411: }; ! 412: \end{verbatim} ! 413: Searching for a symbol took the same number ! 414: of comparisons, but also required a traversal ! 415: of the list of separate tables, e.g., ! 416: \begin{verbatim} ! 417: struct symbol *lookup(char *name, struct table *tp) { ! 418: struct entry *p; ! 419: unsigned h = ((unsigned)name)&(HASHSIZE-1); ! 420: ! 421: do ! 422: for (p = tp->buckets[h]; p; p = p->link) ! 423: if (name == p->sym.name) ! 424: return &p->sym; ! 425: while (tp = tp->previous); ! 426: return 0; ! 427: } ! 428: \end{verbatim} ! 429: Notice that symbol names are compared by simply ! 430: comparing addresses as explained in the previous section. ! 431: Despite the conventional wisdom about hashing functions~\cite{sedgewick88}, ! 432: using a power of two for \verb|HASHSIZE| gave ! 433: better performance; using a prime instead and modulus in place ! 434: of masking slowed \verb|lcc|. ! 435: ! 436: This variation reduced the scope exit code to ! 437: \begin{verbatim} ! 438: tp = tp->previous ! 439: \end{verbatim} ! 440: for table \verb|*tp|. Unfortunately, scope entry ! 441: then required allocation and initialization of a table: ! 442: \begin{verbatim} ! 443: struct table *new = (struct table *)alloc(sizeof *new); ! 444: new->previous = tp; ! 445: for (i = 0; i < HASHSIZE; i++) ! 446: new->buckets[i] = 0; ! 447: tp = new; ! 448: \end{verbatim} ! 449: So, the time wasted at scope exit in the first version ! 450: was traded for a similar waste at scope entry in the second version. ! 451: ! 452: The symbol table module in actual use avoids ! 453: this waste by lazy allocation and initialization ! 454: of tables. Tables include their associated scope level: ! 455: \begin{verbatim} ! 456: struct table { ! 457: int level; /* scope level for this table */ ! 458: struct table *previous; /* table at lower scope */ ! 459: struct entry *buckets[HASHSIZE]; /* hash buckets */ ! 460: }; ! 461: \end{verbatim} ! 462: New tables are allocated and initialized only ! 463: when a symbol is installed: ! 464: \begin{verbatim} ! 465: struct symbol *install(char *name, struct table **tpp) { ! 466: unsigned h = ((unsigned)name)&(HASHSIZE-1); ! 467: struct table *tp = *tpp; ! 468: struct entry *p = (struct entry *)alloc(sizeof *p); ! 469: ! 470: if (tp->level < level) { ! 471: int i; ! 472: struct table *new = (struct table *)alloc(sizeof *new); ! 473: new->previous = tp; ! 474: new->level = level; ! 475: for (i = 0; i < HASHSIZE; i++) ! 476: new->buckets[i] = 0; ! 477: *tpp = tp = new; ! 478: } ! 479: p->sym.name = name; ! 480: p->sym.scope = tp->level; ! 481: ... ! 482: p->link = tp->buckets[h]; ! 483: tp->buckets[h] = p; ! 484: return &p->sym; ! 485: } ! 486: \end{verbatim} ! 487: Since few scopes in C, which are delimited by compound statements, ! 488: declare new symbols, the lazy allocation code above is rarely ! 489: executed and entry to most scopes is nearly free. ! 490: The scope exit code must check before ! 491: discarding a table, but remains simple: ! 492: \begin{verbatim} ! 493: if (tp->level == level) ! 494: tp = tp->previous; ! 495: \end{verbatim} ! 496: ! 497: This design also simplifies access to separate tables. ! 498: For example, the table that holds globals is at the end ! 499: of the list of identifier tables; by making it ! 500: the value of \verb|globals|, symbols can be ! 501: installed into it directly. ! 502: In the initial implementation, a global declared at a nested ! 503: scope had to be inserted in the middle of its hash chain. ! 504: ! 505: \subsection{Storage Management} ! 506: ! 507: Allocation and deallocation in early versions of \verb|lcc| ! 508: accounted for a significant portion of the total execution time. ! 509: Replacing the naive use of \verb|malloc| and \verb|free| ! 510: reduced total execution time by about 8--10~percent. ! 511: As detailed in Reference~\cite{hanson90}, ! 512: allocation is based on the lifetime of the objects allocated, ! 513: and all objects with the same lifetime are freed at once. ! 514: ! 515: This approach to storage management simplified \verb|lcc|'s code. ! 516: Initially, each object type had explicit deallocation code, perhaps ! 517: replicated at several points. Some of this code was intricate, e.g., ! 518: involving complex loops or recursive data structure traversals. ! 519: Allocation incurred an obligation to provide the necessary ! 520: deallocation code, so there was a tendency to use ! 521: algorithms that avoided allocation, ! 522: perhaps at the expense of time, complexity, and flexibility. ! 523: And it was easy to forget deallocation, resulting in storage leaks. ! 524: ! 525: The current scheme eliminated nearly all explicit deallocation code, ! 526: which simplified the compiler and eliminated storage leaks. ! 527: More importantly, it encouraged the use of simple applicative algorithms, ! 528: e.g., in rewriting trees. ! 529: The replacements cost space, but not time, since ! 530: allocation and deallocation are nearly free. ! 531: Besides contributing to fast compilation, the other visible ! 532: benefit of this approach is that \verb|lcc| imposes few ! 533: arbitrary limits on its input; e.g., it permits ! 534: any number of cases in switch statements, ! 535: any number of parameters and locals, block nesting to any depth, ! 536: expressions of arbitrary complexity, initializations of arbitrary size, etc. ! 537: These quantities are limited only by the memory available. ! 538: ! 539: ! 540: \section{Optimization} ! 541: ! 542: \verb|lcc| is not properly called an ``optimizing'' compiler ! 543: because it does no global optimization, {\em per se}. ! 544: Its front end does, however, perform some simple, target-independent ! 545: transformations that help its back ends generate good local code. ! 546: ! 547: The front end eliminates local common subexpressions, ! 548: folds constant expressions, and makes numerous simple ! 549: transformations that improve the quality of local code~\cite{hanson83}. ! 550: Many of these improvements are simple tree transformations ! 551: that lead to better addressing code. ! 552: ! 553: The front end lays out loops so as to reduce the number ! 554: of unconstructive branches~\cite{baskett78}, e.g., the code for ! 555: \begin{flushleft} ! 556: \tt for ($e_1$; $e_2$; $e_3$) $S$ ! 557: \end{flushleft} ! 558: has the form ! 559: \begin{flushleft}\tt ! 560: \begin{tabular}{ll} ! 561: & $e_1$ \\ ! 562: & goto L1 \\ ! 563: L2: & $S$ \\ ! 564: L3: & $e_3$ \\ ! 565: L1: & if ($e_2$) goto L2 \\ ! 566: \end{tabular} ! 567: \end{flushleft} ! 568: The \verb|goto L1| is omitted if $e_2$ is initially non-zero. ! 569: In addition, the front end eliminates branch chains and dead branches. ! 570: ! 571: The selection code for switch statements is generated entirely by the front end. ! 572: It generates a binary search of dense branch tables~\cite{bernstein85}, ! 573: where the density is the percentage of non-default branch table entries. ! 574: For example, with the default density of 0.5, a switch statement ! 575: with the case values 1, 2, 6--8, 1001--1004, and 2001--2002 has the ! 576: following VAX selection code. ! 577: Register \verb|r4| holds the value of the switch expression, ! 578: \verb|L3|--\verb|15| label the statements for the case values above, ! 579: and \verb|L1| is the default label. ! 580: \begin{verbatim} ! 581: cmpl r4,$1001 ! 582: jlss L17 ! 583: cmpl r4,$1004 ! 584: jgtr L16 ! 585: movl _18-4004[r4],r5 ! 586: jmp (r5) ! 587: _18: .long L8 ! 588: .long L9 ! 589: .long L10 ! 590: .long L11 ! 591: L17: cmpl r4,$1 ! 592: jlss L1 ! 593: cmpl r4,$8 ! 594: jgtr L1 ! 595: movl _21-4[r4],r5 ! 596: jmp (r5) ! 597: _21: .long L3 ! 598: .long L4 ! 599: .long L1 ! 600: .long L1 ! 601: .long L1 ! 602: .long L5 ! 603: .long L6 ! 604: .long L7 ! 605: L16: cmpl r4,$2001 ! 606: jlss L1 ! 607: cmpl r4,$2004 ! 608: jgtr L1 ! 609: movl _24-8004[r4],r5 ! 610: jmp (r5) ! 611: _24: .long L12 ! 612: .long L13 ! 613: .long L14 ! 614: .long L15 ! 615: \end{verbatim} ! 616: The density can be changed by a command-line option; ! 617: e.g., \verb|-d0| yields a single branch ! 618: table for each switch statement, and \verb|-d1| requires that all ! 619: branch tables be fully populated. ! 620: ! 621: Finally, the front end simulates register declarations for ! 622: all scalar parameters and locals that are referenced at least ! 623: 3 times and do not have their addresses taken explicitly. ! 624: Locals are announced to the back ends with ! 625: explicitly declared \verb|register| locals followed by ! 626: the remaining locals in the order of decreasing frequency of use. ! 627: Each top-level occurrence of an identifier ! 628: counts as 1 reference. Occurrences in a loop, ! 629: either of the then/else arms of an if statement, or a case ! 630: in a switch statement each count, respectively, as 10, $1/2$, or $1/10$ references. ! 631: These values are adjusted to account for nested control structures. ! 632: The next section describes how these estimated counts ! 633: may be replaced with counts from an actual profile. ! 634: ! 635: This scheme simplifies register assignment in the back ends, ! 636: and explicit \verb|register| declarations are rarely necessary. ! 637: For example, ! 638: \begin{verbatim} ! 639: strcpy(char *s1, char *s2) { while (*s1++ = *s2++); } ! 640: \end{verbatim} ! 641: yields the VAX code ! 642: \begin{verbatim} ! 643: _strcpy:.word 0x0 ! 644: movl 4(ap),r4 ! 645: movl 8(ap),r5 ! 646: L26: movb (r5)+,(r4)+ ! 647: jneq L26 ! 648: ret ! 649: \end{verbatim} ! 650: ! 651: ! 652: \section{Features} ! 653: ! 654: \verb|lcc| provides a few noteworthy features that help users develop, ! 655: debug, and profile ANSI~C programs. ! 656: For example, an option causes ! 657: \verb|lcc| to print ANSI-style C declarations for all defined globals and functions. ! 658: For instance, the code (adapted from Section~6.2 of Reference~\cite{kernighan:ritchie:88}) ! 659: \begin{verbatim} ! 660: typedef struct point { int x,y; } point; ! 661: typedef struct rect { point pt1, pt2; } rect; ! 662: ! 663: point addpoint(p1, p2) point p1, p2; { ! 664: p1.x += p2.x; ! 665: p1.y += p2.y; ! 666: return p1; ! 667: } ! 668: int ptinrect(p, r) point p; rect r; { ! 669: return p.x >= r.pt1.x && p.x < r.pt2.x ! 670: && p.y >= r.pt1.y && p.y < r.pt2.y; ! 671: } ! 672: \end{verbatim} ! 673: generates the declarations ! 674: \begin{verbatim} ! 675: extern point addpoint(point, point); ! 676: extern int ptinrect(point, rect); ! 677: \end{verbatim} ! 678: Editing such output can simplify conversion to ANSI~C. ! 679: ! 680: Another option causes \verb|lcc| to ! 681: issue warnings for declarations and casts of function types without prototypes. ! 682: These include pointers to functions, which are easy to overlook ! 683: when updating pre-ANSI code. For example, it is likely ! 684: that \verb|char *(_alloc)()| should ! 685: be updated to be \verb|char *(_alloc)(size_t)|. ! 686: ! 687: \subsection{Debugging} ! 688: ! 689: \verb|lcc| supports the standard debugger symbol tables on VAXes and Suns. ! 690: It also has two options of its own to assist in program debugging. ! 691: ! 692: Dereferencing zero pointers is a frequent C programming error. ! 693: On some systems, execution continues until the consequences ! 694: cause a fault somewhere unrelated to the actual point of error. ! 695: To help catch such errors, an option causes \verb|lcc| ! 696: to generate code to test for dereferencing zero pointers. If a zero pointer ! 697: is detected, the offending file name and line number are reported on the standard error, e.g., ! 698: \begin{verbatim} ! 699: null pointer dereferenced @foo.c:36 ! 700: \end{verbatim} ! 701: and the program terminates by calling the standard library function \verb|abort|. ! 702: ! 703: Some languages provide built-in facilities for tracing function calls ! 704: and returns~\cite{griswold:griswold:90}. ! 705: An option instructs \verb|lcc| to generate calls to \verb|printf| (or a ! 706: user-specified equivalent) just after entry to each function ! 707: and just before each return. ! 708: The entry code prints the arguments and the return code prints the value ! 709: returned. For example, calling the ! 710: functions shown above would elicit messages like ! 711: \begin{verbatim} ! 712: addpoint#2(p1=(point){x=0,y=0},p2=(point){x=10,y=10}) called ! 713: addpoint#2 returned (point){x=10,y=10} ! 714: ... ! 715: ptinrect#1(p=(point){x=-1,y=-1}, ! 716: r=(rect){pt1=(point){x=10,y=10},pt2=(point){x=310,y=310}}) called ! 717: ptinrect#1 returned 0 ! 718: \end{verbatim} ! 719: (Long lines have been folded to fit this page.) ! 720: As illustrated by this output, ! 721: the messages show the full details of the arguments, including structure contents. ! 722: The numbers that follow function names, e.g., \verb|#2|, ! 723: are activation numbers and can help locate a specific call and its return. ! 724: ! 725: These debugging options are implemented entirely in the front end and ! 726: thus are available on all of \verb|lcc|'s targets. ! 727: ! 728: \subsection{Profiling} ! 729: ! 730: \verb|lcc| supports \verb|prof|-style (viz.~\cite[\verb|prof| command]{bsd84}) ! 731: and \verb|gprof|-style~\cite{graham:kessler:mckusick:83} execution profiling ! 732: on VAXes and Suns. ! 733: These profilers sample the location counter periodically ! 734: to obtain an estimate of the percentage of total execution time ! 735: spent in each function, and they report the number of calls to each function. ! 736: ! 737: Heeding long-standing advice~\cite{knuth71,sites78}, \verb|lcc| also supports ! 738: frequency-based profiling. ! 739: An option causes \verb|lcc| to emit counters that record the number ! 740: of times each {\em expression} is executed, and the values of these counters are ! 741: written to the file \verb|prof.out| when the program terminates. ! 742: A companion program, \verb|bprint|, reads \verb|prof.out| and prints ! 743: the source code annotated with execution counts, e.g., ! 744: \begin{verbatim} ! 745: ... ! 746: 4 main() ! 747: 5 <1>{ ! 748: ... ! 749: 12 <1>queens(0); ! 750: 13 return <1>0; ! 751: 14 <1>} ! 752: 15 ! 753: 16 queens(c) ! 754: 17 <1965>{ ! 755: 18 int r; ! 756: 19 ! 757: 20 for (<1965>r = 0; <15720>r < 8; <15720>r++) ! 758: 21 if (<15720>rows[r] && <5508>up[r-c+7] && <3420>down[r+c]){ ! 759: 22 <2056>rows[r] = up[r-c+7] = down[r+c] = 0; ! 760: 23 <2056>x[c] = r; ! 761: 24 if (<2056>c == 7) ! 762: 25 <92>print(); ! 763: 26 else ! 764: 27 <1964>queens(c + 1); ! 765: 28 <2056>rows[r] = up[r-c+7] = down[r+c] = 1; ! 766: 29 } ! 767: 30 <1965>} ! 768: ... ! 769: \end{verbatim} ! 770: Execution counts are enclosed in angle brackets. ! 771: The counts on the outermost braces for \verb|queens| ! 772: give the number of calls. ! 773: Line 21 shows the benefit of associating a count ! 774: with each expression instead of each line; ! 775: the counts reveal that \verb|up[r-c+7]| was tested only slightly more than one-third ! 776: of the number of times the if statement was executed. ! 777: Conditional expressions are annotated similarly. ! 778: ! 779: Users sometimes report an ``off-by-one'' bug when they see that ! 780: \verb|r < 8| in line 20 was executed ! 781: the same number of times as \verb|r++|. ! 782: These counts are a consequence of the way \verb|lcc| lays out for loops ! 783: and eliminates the test before the first iteration, as described above. ! 784: ! 785: Data in \verb|prof.out| accumulates, so it is possible to execute ! 786: a program repeatedly and then have \verb|bprint| display the ! 787: cumulative frequencies. ! 788: This method is particularly useful for developing ! 789: test data that exercises all parts of a program: ! 790: \verb|<0>| highlights untested code. ! 791: ! 792: Another option causes \verb|lcc| to read \verb|prof.out| ! 793: and use the counts therein to compute the frequency of use ! 794: of each identifier instead of using the estimates ! 795: described in the previous section. Doing so may reduce the ! 796: number of uses for identifiers that appear in ! 797: loops that rarely executed more than once, ! 798: and increase the number of uses for those that appear ! 799: in then/else arms that are executed most of the time. ! 800: ! 801: Complex preprocessor macros can obscure \verb|bprint|'s presentation. ! 802: It necessarily uses post-expansion source coordinates ! 803: to annotate pre-expansion source files. ! 804: ! 805: Profiling code also records the number of calls made ! 806: from each call site, which can be used to reconstruct the dynamic ! 807: call graph. \verb|bprint| prints a line for each edge, e.g., ! 808: \begin{verbatim} ! 809: 1 queens from main in 8q.c:12.8 ! 810: 1964 queens from queens in 8q.c:27.11 ! 811: 92 print from queens in 8q.c:25.10 ! 812: \end{verbatim} ! 813: This output shows that all but one of the calls to \verb|queens| was from ! 814: the call at character 11 in line 27. ! 815: This kind of data is particularly helpful in identifying ! 816: hot spots that are caused by inappropriate calls to a function ! 817: instead of inefficiencies within the function itself. ! 818: Such data can also help identify functions that might profitably ! 819: be replaced with two functions so that one can handle ! 820: the common case more efficiently~\cite[Sec.~5.3]{bentley82}. ! 821: ! 822: Expression execution frequency profiling is implemented entirely by the ! 823: front end. The only machine dependency is the name of the ultimate ! 824: termination function in the revised \verb|exit| function that writes ! 825: \verb|prof.out| at program termination. ! 826: ! 827: The implementation is a machine-independent variation of the method ! 828: described in Reference~\cite{weinberger84}. The front end generates an ! 829: array of counters for each file and starts each expression with code to ! 830: increment the appropriate counter. In also builds a parallel array that ! 831: holds the source coordinates corresponding to each counter. At the ! 832: entry point of each function, the front end generates the equivalent of ! 833: \begin{verbatim} ! 834: if (!_yylink.link) { ! 835: extern struct _bbdata *_bblist; ! 836: _yylink.link = _bblist; ! 837: _bblist = &yylink; ! 838: } ! 839: _prologue(&callee); ! 840: \end{verbatim} ! 841: A \verb|_bbdata| structure is generated for each file: ! 842: \begin{verbatim} ! 843: static struct _bbdata { ! 844: struct _bbdata *link; ! 845: unsigned npoints; ! 846: unsigned *counts; ! 847: unsigned *coords; ! 848: struct func *funcs; ! 849: } _yylink; ! 850: \end{verbatim} ! 851: The \verb|counts| and \verb|coords| fields point the arrays mentioned above, ! 852: which each have \verb|npoints| entries. ! 853: The entry point code uses the \verb|link| field to ! 854: add each file's \verb|_bbdata| structure ! 855: to the list headed by \verb|_bblist|, which the revised \verb|exit| function walks ! 856: to emit \verb|prof.out|. ! 857: ! 858: \verb|_prologue| accumulates the dynamic call graph. ! 859: It is passed one of the \verb|func| structures --- one for each function --- that appear ! 860: on the list emanating from \verb|_yylink.funcs|: ! 861: \begin{verbatim} ! 862: struct func { ! 863: struct func *link; ! 864: struct caller { ! 865: struct caller *link; ! 866: struct callsite *caller; ! 867: unsigned count; ! 868: } *callers; ! 869: char *name; ! 870: unsigned coord; ! 871: }; ! 872: \end{verbatim} ! 873: The \verb|name| and \verb|coord| fields give the function's name and beginning source ! 874: coordinate, respectively. ! 875: \verb|callers| points to a list of \verb|caller| structures, one for each ! 876: call site. Each \verb|caller| structure records the number of calls ! 877: from the caller's \verb|callsite|: ! 878: \begin{verbatim} ! 879: struct callsite { ! 880: char *file; ! 881: char *name; ! 882: unsigned coord; ! 883: }; ! 884: \end{verbatim} ! 885: \verb|caller| structures are allocated at execution time ! 886: and point to \verb|callsite|s, which are generated by the front end ! 887: at compile time. ! 888: ! 889: Just before each call, the front end generates an assignment ! 890: of a pointer to a \verb|callsite| structure to the global variable \verb|_caller|. ! 891: \verb|_prologue| uses \verb|_caller| to record an edge in the dynamic call graph. ! 892: If a record of the caller already exists, its count is simply incremented. ! 893: Otherwise, a \verb|caller| structure is allocated and prefixed to ! 894: the callee's list of callers. ! 895: \begin{verbatim} ! 896: _prologue(struct func *callee) { ! 897: static struct caller callers[4096]; ! 898: static int next; ! 899: struct caller *p; ! 900: ! 901: for (p = callee->callers; p; p = p->link) ! 902: if (p->caller == _caller) { ! 903: p->count++; ! 904: break; ! 905: } ! 906: if (!p && next < sizeof callers/sizeof callers[0]) { ! 907: p = &callers[next++]; ! 908: p->caller = _caller; ! 909: p->count = 1; ! 910: p->link = callee->callers; ! 911: callee->callers = p; ! 912: } ! 913: _caller = 0; ! 914: } ! 915: \end{verbatim} ! 916: ! 917: Profiling can be restricted to only those files of interest. ! 918: The counts printed by \verb|bprint| will be correct, ! 919: but some edges may be omitted from the call graph. ! 920: For example, if \verb|f| calls \verb|g| calls \verb|h| ! 921: and \verb|f| and \verb|h| are compiled with profiling, but \verb|g| is not, ! 922: \verb|bprint| will report that \verb|f| called \verb|h|. ! 923: The total number of calls to each function is correct, however. ! 924: ! 925: \section{Performance} ! 926: ! 927: \verb|lcc| emits local code that is comparable ! 928: to that emitted by the generally available alternatives. ! 929: Table~\ref{spec:execution} summarizes the results of compiling ! 930: and executing the C programs in the SPEC benchmarks~\cite{spec89} ! 931: with three compilers on the four machines listed above. ! 932: Configuration details are listed with each machine. ! 933: \verb|cc| and \verb|gcc| denote, respectively, ! 934: the manufacturer's C compiler and the ! 935: GNU~C compiler from the Free Software Foundation. ! 936: The times are elapsed time in seconds and are the lowest ! 937: elapsed times over several runs on lightly loaded machines. ! 938: All reported runs achieved at least 97 percent utilization (i.e., the ratio of times ! 939: $({\it user} + {\it system})/{\it elapsed} \ge 0.97$). ! 940: ! 941: The entries with \verb|-O| indicate compilation with the ! 942: ``default'' optimization, which often includes some global optimizations. ! 943: \verb|lcc| performs no global optimizations. ! 944: The \verb|gcc| and \verb|gcc -O| figures for {\tt gcc1.35} on the MIPS ! 945: are missing because this benchmark did not execute correctly ! 946: when compiled with \verb|gcc|. ! 947: ! 948: \newcommand{\0}{\hspace{0.5em}} ! 949: \begin{table} ! 950: \begin{center} ! 951: \begin{tabular}{lcccc} ! 952: &\multicolumn{4}{c}{\it benchmark}\\ ! 953: \it compiler & 1. \tt gcc1.35& 8. \tt espresso& 22. \tt li & 23. \tt eqntott \\ \hline ! 954: %\makebox[0pt][l]{VAX: VAX 8650 w/36MB running 4.3BSD UNIX} \\% % megaron ! 955: % v1.3 ! 956: %\tt lcc & 345 & 588 & 1315 & 296 \\ ! 957: %\tt cc & 350 & 504 & 1471 & 334 \\ ! 958: %\tt gcc & 304 & 501 & 1316 & 327 \\ %v1.36 ! 959: %\tt cc -O & 320 & 525 & 1486 & 325 \\ ! 960: %\tt gcc -O & 283 & 466 & 1272 & 201 \\[2ex] ! 961: \\[-.5ex] ! 962: \makebox[0pt][l]{VAX: \small MicroVAX II w/16MB running Ultrix 3.1} \\% v1.3; ffserver ! 963: \tt lcc & 1734 & 2708 & 7015 & 3532 \\ ! 964: \tt cc & 1824 & 2782 & 7765 & 3569 \\ ! 965: \tt gcc & 1439 & 2757 & 7353 & 3263 \\ %v1.36 ! 966: \tt cc -O & 1661 & 2653 & 7086 & 3757 \\ ! 967: \tt gcc -O & 1274 & 2291 & 6397 & 1131 \\[2ex] ! 968: ! 969: \makebox[0pt][l]{68020: \small Sun 3/60 w/24MB running SunOS 4.0.3} \\% v1.3; hemlock ! 970: \tt lcc & 544 & 1070 & 2591 & 567 \\ ! 971: \tt cc & 514 & 1005 & 3308 & 619 \\ ! 972: \tt gcc & 426 & 1048 & 2498 & 591 \\ %v1.35 ! 973: \tt cc -O & 428 &\0882 & 2237 & 571 \\ ! 974: \tt gcc -O & 337 &\0834 & 1951 & 326 \\[2ex] ! 975: ! 976: \makebox[0pt][l]{MIPS: \small IRIS 4D/220GTX w/32MB running IRIX 3.3.1} \\% v1.3; oyoy ! 977: \tt lcc & 116 & 150 & 352 & 111 \\ ! 978: \tt cc & 107 & 153 & 338 & 100 \\ ! 979: \tt gcc & & 188 & 502 & 132 \\ %v1.36 ! 980: \tt cc -O &\092 & 130 & 299 &\070 \\ ! 981: \tt gcc -O & & 145 & 411 & 112 \\[2ex] ! 982: ! 983: %\makebox[0pt][l]{SPARC:\hspace{1em}}% phoenix: Sun 4/490 w/128MB, SunOS 4.1 ! 984: %\tt lcc & 106 & 218 & 436 & 129 \\ ! 985: %\tt cc &\099 & 189 & 589 & 133 \\ ! 986: %\tt gcc &\094 & 205 & 589 & 131 \\ %v1.36 ! 987: %\tt cc -O &\078 & 154 & 351 &\089 \\ ! 988: %\tt gcc -O &\064 & 153 & 384 &\092 \\ ! 989: % ! 990: \makebox[0pt][l]{SPARC: \small Sun 4/260 w/32MB running SunOS 4.0.3}\\% v1.3; python ! 991: \tt lcc & 196 & 370 &\0790 & 209 \\ ! 992: \tt cc & 203 & 381 & 1094 & 275 \\ ! 993: \tt gcc & 186 & 411 & 1139 & 256 \\ %v1.34 ! 994: \tt cc -O & 150 & 296 &\0707 & 183 \\ ! 995: \tt gcc -O & 127 & 309 &\0788 & 179 \\ ! 996: \end{tabular} ! 997: \end{center} ! 998: \caption{Execution Time for C SPEC Benchmarks in Seconds.\label{spec:execution}} ! 999: \end{table} ! 1000: ! 1001: \verb|lcc| is faster than many (but not all~\cite{thompson90}) other C compilers. ! 1002: Table~\ref{spec:compilation} parallels Table~\ref{spec:execution}, ! 1003: but shows compilation time instead of execution time. ! 1004: Except for the MIPS, the times are for running only the compiler proper; ! 1005: preprocessing, assembly, and linking time are not included. ! 1006: Two times are given for the MIPS because the manufacturer's \verb|cc| front end ! 1007: consists of two programs; the first translates C to ``u-code'' and the ! 1008: second generates object code. Generating assembly language ! 1009: costs {\em more} than generating object code, so Table~\ref{spec:compilation} gives ! 1010: both times for all compilers. ! 1011: The last row in Table~\ref{spec:compilation} lists the number of non-blank ! 1012: lines and the total number of lines in each benchmark {\em after} preprocessing. ! 1013: ! 1014: \begin{table} ! 1015: \begin{center} ! 1016: \begin{tabular}{lcccc} ! 1017: &\multicolumn{4}{c}{\it benchmark}\\ ! 1018: \it compiler & 1. \tt gcc1.35& 8. \tt espresso& 22. \tt li & 23. \tt eqntott \\ \hline ! 1019: %\makebox[0pt][r]{VAX:\hspace{1em}}% megaron ! 1020: %\tt lcc & 189 &\059 & 31 &\09 \\ ! 1021: %\tt cc & 405 & 114 & 40 & 26 \\ ! 1022: %\tt gcc & 505 & 222 & 74 & 45 ! 1023: \\[-.5ex] ! 1024: \makebox[0pt][l]{VAX:}\\% v1.3; ffserver ! 1025: \tt lcc &\0792 & 237 &\069 & 36 \\ ! 1026: \tt cc & 1878 & 576 & 174 & 79 \\ ! 1027: \tt gcc & 1910 & 637 & 192 & 86 \\[2ex] ! 1028: ! 1029: \makebox[0pt][l]{68020:}\\% v1.3; hemlock ! 1030: \tt lcc & 302 &\090 & 28 & 15 \\ ! 1031: \tt cc & 507 & 168 & 52 & 29 \\ ! 1032: \tt gcc & 599 & 196 & 56 & 27 \\[2ex] ! 1033: ! 1034: \makebox[0pt][l]{MIPS:}\\% v1.3; oyoy ! 1035: \tt lcc &\097 195 &\035 \063 & 10 24 &\06 16\\ ! 1036: \tt cc & 318 177 & 104 \068 & 40 26 & 24 19 \\ ! 1037: \tt gcc & 320 391 &\088 118 & 28 42 & 13 24 \\[2ex] ! 1038: ! 1039: \makebox[0pt][l]{SPARC:}\\% v1.3; python ! 1040: \tt lcc & 103 &\038 & 12 & \08 \\ ! 1041: \tt cc & 175 &\060 & 18 & 11 \\ ! 1042: \tt gcc & 313 & 100 & 31 & 16 \\[2ex] ! 1043: ! 1044: line counts & 79102/250496 & 25717/58516 & 7070/22494 & 2680/6569 \\ ! 1045: \end{tabular} ! 1046: \end{center} ! 1047: \caption{Compilation Time for C SPEC Benchmarks in Seconds.\label{spec:compilation}} ! 1048: \end{table} ! 1049: ! 1050: \verb|lcc| is smaller than other compilers. ! 1051: Table~\ref{sizes} lists the sizes of the three compilers in kilobytes. ! 1052: Each entry is the sum of sizes of the program and data segments ! 1053: for the indicated compiler as reported by the UNIX \verb|size| command. ! 1054: ! 1055: \begin{table} ! 1056: \begin{center} ! 1057: \begin{tabular}{lcccc} ! 1058: \it compiler & VAX & 68020 & MIPS & SPARC \\ \hline ! 1059: \tt lcc & 181 & 244 & 280 & 276 \\ %v1.3 ! 1060: \tt cc & 256 & 306 & 616 & 402 \\ ! 1061: \tt gcc & 378 & 507 & 777 & 689 \\ ! 1062: \end{tabular} ! 1063: \end{center} ! 1064: \caption{Sizes of Compiler Executables in Kilobytes.\label{sizes}} ! 1065: \end{table} ! 1066: ! 1067: %\bibliography{refs,lib} ! 1068: \begin{thebibliography}{10} ! 1069: ! 1070: \bibitem{aho:sethi:ullman:86} ! 1071: A.~V. Aho, R.~Sethi, and J.~D. Ullman. ! 1072: \newblock {\em Compilers: Principles, Techniques, and Tools}. ! 1073: \newblock Addison Wesley, Reading, MA, 1986. ! 1074: ! 1075: \bibitem{ansi:Cstandard} ! 1076: American National Standard Institute, Inc., New York. ! 1077: \newblock {\em American National Standards for Information Systems, Programming ! 1078: Language C {ANSI} X3.159--1989}, 1990. ! 1079: ! 1080: \bibitem{baskett78} ! 1081: F.~Baskett. ! 1082: \newblock The best simple code generation technique for while, for, and do ! 1083: loops. ! 1084: \newblock {\em SIGPLAN Notices}, 13(4):31--32, Apr. 1978. ! 1085: ! 1086: \bibitem{bentley82} ! 1087: J.~L. Bentley. ! 1088: \newblock {\em Writing Efficient Programs}. ! 1089: \newblock Prentice Hall, Englewood Cliffs, NJ, 1982. ! 1090: ! 1091: \bibitem{bernstein85} ! 1092: R.~L. Bernstein. ! 1093: \newblock Producing good code for the case statement. ! 1094: \newblock {\em Software---Practice \& Experience}, 15(10):1021--1024, Oct. ! 1095: 1985. ! 1096: ! 1097: \bibitem{bsd84} ! 1098: Computer Science Division, Department of Electrical Engineering and Computer ! 1099: Science, University of California, Berkeley, CA. ! 1100: \newblock {\em UNIX User's Manual, Reference Guide}, virtual {VAX}-11 version ! 1101: edition, Mar. 1984. ! 1102: ! 1103: \bibitem{fraser:sigplan89} ! 1104: C.~W. Fraser. ! 1105: \newblock A language for writing code generators. ! 1106: \newblock {\em Proceedings of the SIGPLAN'89 Conference on Programming Language ! 1107: Design and Implementation, SIGPLAN Notices}, 24(7):238--245, July 1989. ! 1108: ! 1109: \bibitem{fraser:hanson:91a} ! 1110: C.~W. Fraser and D.~R. Hanson. ! 1111: \newblock A code generation interface for {ANSI C}. ! 1112: \newblock {\em Software---Practice \& Experience}, 21(9):963--988, Sept. 1991. ! 1113: ! 1114: \bibitem{fraser:hanson:92} ! 1115: C.~W. Fraser and D.~R. Hanson. ! 1116: \newblock Simple register spilling in a retargetable compiler. ! 1117: \newblock {\em Software---Practice \& Experience}, 22(1):85--99, Jan. 1992. ! 1118: ! 1119: \bibitem{graham:kessler:mckusick:83} ! 1120: S.~L. Graham, P.~B. Kessler, and M.~K. McKusick. ! 1121: \newblock An execution profiler for modular programs. ! 1122: \newblock {\em Software---Practice \& Experience}, 13(8):671--685, Aug. 1983. ! 1123: ! 1124: \bibitem{griswold:griswold:90} ! 1125: R.~E. Griswold and M.~T. Griswold. ! 1126: \newblock {\em The Icon Programming Language}. ! 1127: \newblock Prentice Hall, Englewood Cliffs, NJ, second edition, 1990. ! 1128: ! 1129: \bibitem{hanson83} ! 1130: D.~R. Hanson. ! 1131: \newblock Simple code optimizations. ! 1132: \newblock {\em Software---Practice \& Experience}, 13(8):745--763, Aug. 1983. ! 1133: ! 1134: \bibitem{hanson90} ! 1135: D.~R. Hanson. ! 1136: \newblock Fast allocation and deallocation of memory based on object lifetimes. ! 1137: \newblock {\em Software---Practice \& Experience}, 20(1):5--12, Jan. 1990. ! 1138: ! 1139: \bibitem{kernighan:ritchie:88} ! 1140: B.~W. Kernighan and D.~M. Ritchie. ! 1141: \newblock {\em The C Programming Language}. ! 1142: \newblock Prentice Hall, Englewood Cliffs, NJ, second edition, 1988. ! 1143: ! 1144: \bibitem{knuth71} ! 1145: D.~E. Knuth. ! 1146: \newblock An empirical study of {FORTRAN} programs. ! 1147: \newblock {\em Software---Practice \& Experience}, 1(2):105--133, Apr. 1971. ! 1148: ! 1149: \bibitem{sedgewick88} ! 1150: R.~Sedgewick. ! 1151: \newblock {\em Algorithms}. ! 1152: \newblock Addison Wesley, Reading, MA, 1988. ! 1153: ! 1154: \bibitem{sites78} ! 1155: R.~L. Sites. ! 1156: \newblock Programming tools: Statement counts and procedure timings. ! 1157: \newblock {\em SIGPLAN Notices}, 13(12):98--101, Dec. 1978. ! 1158: ! 1159: \bibitem{spec89} ! 1160: Standards Performance Evaluation Corp. ! 1161: \newblock {\em {SPEC} Benchmark Suite Release 1.0}, Oct. 1989. ! 1162: ! 1163: \bibitem{thompson90} ! 1164: K.~Thompson. ! 1165: \newblock A new {C} compiler. ! 1166: \newblock In {\em Proceedings of the Summer 1990 UKUUG Conference}, pages ! 1167: 41--51, London, July 1990. ! 1168: ! 1169: \bibitem{waite86} ! 1170: W.~M. Waite. ! 1171: \newblock The cost of lexical analysis. ! 1172: \newblock {\em Software---Practice \& Experience}, 16(5):473--488, May 1986. ! 1173: ! 1174: \bibitem{weinberger84} ! 1175: P.~J. Weinberger. ! 1176: \newblock Cheap dynamic instruction counting. ! 1177: \newblock {\em Bell System Technical Journal}, 63(8):1815--1826, Oct. 1984. ! 1178: ! 1179: \end{thebibliography} ! 1180: ! 1181: \end{document}
This archive runs on limited infrastructure. Preserving old code on modern bandwidth. Automated agents are requested to crawl responsibly.