|
|
1.1 ! root 1: \documentstyle[titlepage,fullpage,12pt,clbiba]{article} ! 2: \begin{document} ! 3: \newcommand{\twigcomp}{{\tt twig}} ! 4: \newcommand{\Twigcomp}{{\tt Twig}} ! 5: \newcommand{\twiglang}{{\it twig}} ! 6: \newcommand{\Twiglang}{{\it Twig}} ! 7: \title{Twig Reference Manual} ! 8: ! 9: \date{ ! 10: \begin{minipage}{.8\textwidth} ! 11: \normalsize ! 12: \noindent ! 13: \Twiglang{} is a language for manipulating trees. A \twiglang{} ! 14: program consists of a set ! 15: of pattern-action rules together with associated declarations. Patterns ! 16: describe trees to be matched. Actions calculate costs, perform tree ! 17: manipulations and other functions such as emitting code. A \twiglang{} program is ! 18: translated by the \twigcomp{} compiler into subroutines and tables in a host ! 19: language. In the current implementation, the host language is C.\\[6pt] ! 20: A \twiglang{} program manipulates trees by first finding a minimum cost covering of ! 21: the input tree. The actions of the rules whose pattern parts composes the ! 22: covering is then executed. The minimum cost covering is determined using ! 23: dynamic programming. This technique naturally resolves many ambiguities that ! 24: may be in the specifications.\\[6pt] ! 25: The prime purpose of \twiglang{} is to create tree manipulation programs. ! 26: One interesting application of tree manipulation is code generation and ! 27: \twiglang{} has been used to implement a code generator for the ! 28: pcc2 compiler on the VAX. ! 29: \end{minipage}} ! 30: ! 31: \author{Steven W.K. Tjiang} ! 32: ! 33: \maketitle ! 34: \tableofcontents\newpage ! 35: \section{Introduction} ! 36: ! 37: \Twiglang{} is a language for writing tree manipulation programs. ! 38: A \twiglang{} program does this by using a matcher to find a minimal ! 39: cost covering of a tree. ! 40: The covering is composed of ! 41: user supplied templates. ! 42: Trees are manipulated by ! 43: executing the actions associated with the templates of their coverings. ! 44: These actions may rewrite the tree, update other data structures, or ! 45: generate output. ! 46: ! 47: Figure~\ref{overview} gives an overview of the \twiglang{} system. ! 48: The round boxes in the figure indicates files and the sharp cornered boxes ! 49: are processing programs. ! 50: The user writes the \twiglang{} specification file. ! 51: This file is then compiled by the \twigcomp{} compiler into ! 52: host language files. ! 53: In the current version of \twiglang{}, the host language is C. ! 54: The host language files are then compiled and linked with ! 55: other user supplied files. ! 56: The resulting program represented by the large box at the bottom ! 57: of figure~\ref{overview} is the tree manipulation program. ! 58: \begin{figure}[hp] ! 59: \input{pic} ! 60: \caption{The {\bf Twig } system} ! 61: \label{overview} ! 62: \end{figure} ! 63: The details of this box will be discussed later. ! 64: ! 65: \section{Motivation} ! 66: ! 67: Although \Twiglang{} is a language for manipulating trees ! 68: it was originally motivated by ! 69: pattern directed code generation. In many compilers, the typical ! 70: structure is a {\sl front end} and a {\sl back end}. The front end reads ! 71: the input source program, checks its syntax and contextual conditions, ! 72: and builds an intermediate representation of the program. ! 73: The language used to do this representation is often called the ! 74: {\it intermediate representation} and will be abbreviated as IR. ! 75: The back end ! 76: takes the IR and translates it into the target machine code. ! 77: ! 78: That isn't the whole story but it's close. Both ends may be ! 79: sets of programs rather than one ! 80: program. There may also be a ! 81: {\it middle} that performs transformations on the ! 82: IR. The intention of these ! 83: transformations is to improve the target code that comes out of the compiler. ! 84: For this discussion, we can ignore the middle. ! 85: ! 86: IRs can vary greatly from compiler to compiler. ! 87: However the IRs are implemented, they are encodings of graphs. ! 88: Usually three types of graphs are evident\cite{dragon}. ! 89: There is a flowgraph whose nodes are {\sl basic blocks} and ! 90: the edges corresponds to transfer of control between the basic blocks. ! 91: Basic blocks are not featureless. They are graphs themselves. ! 92: Some compilers will represent them as a forest of ! 93: {\it directed acyclic graphs} or DAGs. ! 94: The nodes are data values and operations. ! 95: Since operations yields values, one can think of nodes as ! 96: corresponding to values. An edge exists from node $a$ to ! 97: node $b$ if and only if the value corresponding to $a$ depends on the ! 98: value corresponding to $b$. Other compilers will use trees instead of DAGs. ! 99: Trees are not ! 100: as general as DAGs and these compilers overcome that by using some ! 101: type of labelling or copying. ! 102: ! 103: Writing a front end usually ! 104: involves writing a recognizer ! 105: for the source language, designing a symbol table, ! 106: and determining an error recovery strategy. What has made this process easy ! 107: are parser generators like YACC\cite{yacc}. ! 108: They separate out the task of recognizing the ! 109: source language from the other details. This makes the task ! 110: intellectually more manageable. ! 111: The compiler implementor now specifies the source language via grammar ! 112: productions. Symbol table manipulations, ! 113: and IR construction rules are attached to productions and activated ! 114: when a source language construct is recognized. ! 115: The front end is then {\it automatically} generated by processing the ! 116: specification with a parser generator. ! 117: The result is ! 118: a high quality front end whose performance is close to that of a ! 119: hand-crafted front end. ! 120: ! 121: Providing the same type of facilities for automatically building back ! 122: ends has been much more difficult. Extending current front end ! 123: parsers to do back ends ! 124: has been workable but not entirely successful. The basic idea is that ! 125: the back end implementor writes a {\it machine grammar} ! 126: for the IR and code generation ! 127: occurs as actions that are executed when patterns are found ! 128: in the IR\cite{gana,henry}. ! 129: Unfortunately, these grammars are large and ambiguous. Using ! 130: parsers to handle them have usually met with many combinatorial problems ! 131: \cite{henry}. Furthermore, these code generators are slow ! 132: although their output is often of high quality. ! 133: The software engineering advantages one see with parsers in the front end ! 134: are not so evident in this circumstance. Instead of having to deal with ! 135: the complexities of code generation, the implementor now has to deal with ! 136: the complexities of the very large machine grammar. ! 137: ! 138: Another approach at automatically building back ends is to start with ! 139: known good algorithms for generating code. From these algorithms, we ! 140: can build an abstract interpreter and then specifications can be written ! 141: to drive them. We hope that by doing this we can derive a much more natural ! 142: code generation specification technique. ! 143: \Twiglang{} is one possible result of this approach. The algorithms ! 144: that form the basis of \twiglang{} are presented in \cite{optimalcode} and ! 145: \cite{treematch}. \Twiglang{} ! 146: specifications are much less complex than parser based ! 147: specifications. Ambiguities are handled naturally and do not complicate ! 148: the specifications. ! 149: ! 150: In \cite{optimalcode}, it is shown how optimal code can be generated ! 151: for an expression tree ! 152: given a description of a machine's instruction set. ! 153: This description is in term of tree templates. ! 154: The algorithm first determines a minimal cost covering of ! 155: an expression tree using dynamic programming. ! 156: The covering is composed from the templates in the ! 157: description. The code generated is then the instructions ! 158: represented by the templates in the covering. ! 159: This algorithm was shown to generate the shortest code sequences for an ! 160: abstract machine and ! 161: its running time was shown to be linear in the number of nodes in the ! 162: expression tree in \cite{optimalcode}. ! 163: ! 164: This is all very well in theory but what about in practice. ! 165: Code length is not always the quantity that one wishes to minimize. ! 166: One might wish to minimize the number of memory accesses or machine cycles ! 167: executed by the generated code. ! 168: Fortunately, the algorithm can also minimize these quantities. ! 169: In fact, it can minimize any strictly increasing cost function. ! 170: Let $I$ denote a machine instruction and ! 171: $I_0I_1I_2...I_n$ denote a sequence of instructions. ! 172: Then a cost function $f$ is strictly increasing if and only if ! 173: $f(I_0I_1I_2...I_n) < f(I_0I_1I_2...I_nI_{n+1})$ for all ! 174: instruction sequences $I_0I_1I_2...I_n$ and instruction ! 175: $I_{n+1}$. Another shortcoming in \cite{optimalcode} ! 176: is that no efficient algorithm is given for matching the templates ! 177: in the tree. This is where \cite{treematch} comes in. In that paper, ! 178: two algorithms are presented for tree matching. \Twiglang{} combines ! 179: the work presented in the two papers. ! 180: ! 181: \Twiglang{} differs from the algorithm as presented in \cite{optimalcode} in ! 182: the following ways: ! 183: \begin{itemize} ! 184: \item ! 185: It provides a convenient specification language for the templates. ! 186: \item ! 187: Tree templates in \cite{optimalcode} were labelled with either {\sl Register} or ! 188: {\sl Memory}. This label being the storage class that the instruction corresponding ! 189: to the template would put a result. ! 190: \Twiglang{} generalize tree labels so that ! 191: they can be any symbol. ! 192: \item ! 193: Instead of associating an instruction with each template, \twiglang{} associates ! 194: an action. ! 195: \item ! 196: In \cite{optimalcode}, code is generated by a bottom up traversal of the ! 197: minimum cost covering. \Twiglang{} allows a top down traversal of the cover ! 198: and also the ability for the actions to rewrite parts of the ! 199: tree. ! 200: \item ! 201: The algorithm in \cite{optimalcode} determines the evaluation order ! 202: which gives the optimal code. ! 203: \Twiglang{} does not do this. ! 204: The evaluation order determination was omitted based on the conjecture that ! 205: most of the time, simple left to right evaluation of the leaves will yield ! 206: near optimal results. ! 207: In situations where a nonstandard evaluation order is required, a new ! 208: \twiglang{} pattern could be added that codes the order explicitly ! 209: (see Section~\ref{exec:phase}). ! 210: \end{itemize} ! 211: ! 212: In this manual, a brief discussion will be given of the \twiglang{} language ! 213: and how \twiglang{} programs can be incorporated with C. ! 214: ! 215: Many statements have been made in this manual without proof. ! 216: If the reader wishes to pursue it further, more details about the algorithms ! 217: will be presented in an upcoming paper, ! 218: \cite{aho:gana:tjiang}. ! 219: ! 220: \section{The \Twiglang{} Language} ! 221: \subsection{Lexical Issues and Conventions} ! 222: Currently the ! 223: following identifiers are reserved keywords of \twiglang{}: ! 224: ! 225: \begin{tabbing} ! 226: \hspace{1in}\=\hspace{1in}\=\hspace{1in}\=\kill ! 227: action \> cost \> insert \\ ! 228: label \> node \> prologue ! 229: \end{tabbing} ! 230: ! 231: \noindent ;:(),= are special characters. All blanks, tabs, formfeeds and ! 232: newlines are ignored by \twiglang{} but they may not appear inside ! 233: identifiers and numbers. Identifiers are nonempty strings ! 234: made of letters, digits or underscores and starting with a letter. ! 235: Numbers are nonempty string of digits. Code fragments or action parts ! 236: are enclosed in braces. Inside code fragments, C lexical rules must ! 237: be obeyed except that strings of the form $\$...\$$ that are not inside C ! 238: strings have special meaning to \twiglang{}. In the following sections, $id$ ! 239: denotes an identifier; $int_1$, $int_2$ denotes numbers; $ccode$ ! 240: denotes C code fragments; $...$ indicates repetition of the previous ! 241: item; $[...]$ indicates that $...$ is optional. ! 242: ! 243: The input to the \twiglang{} program will be referred to as the subject tree. ! 244: ! 245: \subsection{Rules} ! 246: ! 247: The fundamental unit of a \twiglang{} program is a rule. ! 248: ! 249: \begin{verse} ! 250: $label\_id$ {\bf :} $tree\_pattern$\quad {\bf [} $cost$ {\bf ]} ! 251: {\bf [ =} $action$ {\bf ]} ! 252: \end{verse} ! 253: ! 254: \noindent Intuitively, the pattern is used to match a subtree. ! 255: The $cost$ code ! 256: fragment is then evaluated. The resulting cost is recorded by ! 257: the matcher for use in dynamic programming. The $action$ is executed ! 258: if the rule is part of the least cost cover of the tree. ! 259: The details of pattern matching will be discussed in Section~\ref{pattern}. ! 260: ! 261: If the $cost$ part is missing, \twigcomp{} will insert default code ! 262: that returns the special value, {\tt DEFAULT\_COST}. ! 263: A missing $action$ part indicates that nothing will be done when a ! 264: match is found. ! 265: ! 266: \subsection{Tree Patterns} ! 267: \label{TreePatterns} ! 268: Tree patterns are specified ! 269: in prefix parenthesized form and can be described by the ! 270: following BNF: ! 271: ! 272: \begin{verse} ! 273: $tree\_pattern\to node\_id\;\vert\;label\_id$\\ ! 274: $tree\_pattern\to node\_id \;{\bf (}\; tree\_list \;{\bf )}$\\ ! 275: $tree\_list \to tree \;{\bf ,}\; tree\_list \;\vert\; tree$\\ ! 276: \end{verse} ! 277: ! 278: \begin{figure} ! 279: \unitlength 0.1in ! 280: \begin{center} ! 281: \begin{picture}(25,16) ! 282: \put(7,0){\framebox(4,3){\tt id}} ! 283: \put(20,0){\framebox(5,3){\tt const}} ! 284: \put(13,6){\framebox(5,3){\tt plus}} ! 285: \put(6,12){\framebox(5,3){\tt plus}} ! 286: \put(0,6){\framebox(4,3){\tt id}} ! 287: \put(11,3){\line(2,3){2}} ! 288: \put(20,3){\line(-2,3){2}} ! 289: \put(13,9){\line(-2,3){2}} ! 290: \put(4,9){\line(2,3){2}} ! 291: \end{picture} ! 292: \end{center} ! 293: \vspace{0.5in} ! 294: \begin{verse} ! 295: \begin{center} ! 296: \tt plus(id, plus(id, const)) ! 297: \end{center} ! 298: \end{verse} ! 299: \caption{A Tree and its prefixed form} ! 300: \label{treepattern} ! 301: \end{figure} ! 302: \noindent Figure~\ref{treepattern} gives an example of a tree pattern ! 303: and its prefix parenthesized form. ! 304: ! 305: There are two types of symbols: $node\_id$\/s and $label\_id$\/s. ! 306: $Node\_id$\/s are used to denote internal nodes and leaves. ! 307: $Label\_id$\/s label tree patterns and are analogous to ! 308: nonterminals in context free grammars. ! 309: For example, the following \twiglang{} rules without their action parts describe ! 310: simple expression trees with the {\tt plus} operator. ! 311: ! 312: \begin{verse} ! 313: {\tt\begin{tabbing} ! 314: expr: \= plus(expr, expr)\\ ! 315: expr: \> identifier\\ ! 316: expr: \> constant ! 317: \end{tabbing}} ! 318: \end{verse} ! 319: ! 320: \noindent Here, {\tt identifier} and {\tt constant} are $node\_id$\/s ! 321: representing leaves, and ! 322: {\tt plus} is a $node\_id$ representing an internal node whereas ! 323: {\tt expr} is a ! 324: $label\_id$. Pattern leaves that are $label\_id$\/s are called ! 325: {\sl labelled leaves}. ! 326: In the first rule, both leaves of the pattern are labelled. ! 327: ! 328: \Twigcomp{} associates an integer with each node symbol and label symbol. These ! 329: integers are used by the \twiglang{} pattern matcher as encodings for the ! 330: symbols. ! 331: As the matcher traverses the ! 332: tree, a user supplied subroutine is called to provide an integer ! 333: corresponding to each node. ! 334: ! 335: \subsection{Costs} ! 336: ! 337: To increase the flexibility of representing costs, the tree matcher ! 338: views costs as an {\sl abstract data type} or ADT. ! 339: For example, costs may be represented as an integer or ! 340: as a vector of integers with each element representing the cost of a specific ! 341: resource. ! 342: A cost ADT suitable for \twiglang{} must have the following four definitions: ! 343: ! 344: \begin{itemize} ! 345: \item {\tt COST} is a C type. ! 346: Although the proper functioning of the tree matcher ! 347: does not depend on the internal details of the {\tt COST} type, it must have ! 348: the type information for storage allocation purposes. ! 349: \item {\tt INFINITY} is the maximum attainable cost value. ! 350: The matcher uses {\tt INFINITY} to initialize its data structures. ! 351: \item {\tt DEFAULT\_COST} is the cost ! 352: value returned by rules without a cost part. ! 353: \item ! 354: {\tt COSTLESS} is a function of ! 355: two cost values. It must return true if and only if the first argument is ! 356: less than the second. ! 357: \end{itemize} ! 358: ! 359: \subsection{Trees} ! 360: ! 361: As with costs, trees are treated as an ADT. Here, using an ADT is ! 362: even more important because trees come in a variety of shapes and ! 363: representations. ! 364: \Twiglang{} would be overly complicated if it had to ! 365: know any more than the fundamental properties of trees. Thus, ! 366: \twiglang{} treats trees as an acyclic directed ! 367: graph of almost featureless nodes with one ! 368: distinguished node, the root. Each node has only one feature and that ! 369: is an integer corresponding to the $node\_id$\/s discussed above. ! 370: To provide this ! 371: view to \twiglang{}, the user must provide the following definitions and ! 372: subroutines. ! 373: ! 374: \begin{itemize} ! 375: \item {\tt NODEPTR} is the type of a node. The actual ! 376: details of ! 377: the {\tt NODEPTR} ! 378: are irrelevant and \twigcomp{} uses this definition only for storage ! 379: allocation and declaration purposes. ! 380: \item ! 381: {\tt NULL} is used to denote a null {\tt NODEPTR} value. ! 382: In the current implementation this is the same {\tt NULL} used by the ! 383: C standard I/O package and need not be defined by the user. ! 384: This restricts the {\tt NODEPTR} representation to be a pointer. ! 385: \item ! 386: {\tt mtGetNodes$(r,n)$} returns the $n$th child of $r$ where $r$ ! 387: is a {\tt NODEPTR}. The ! 388: leftmost child is {\tt mtGetNodes}$(r,1)$. If $n > \hbox{degree}(r)$ ! 389: then the function should return {\tt NULL}. \Twigcomp{} expects the ! 390: expression {\tt mtGetNodes}$({\tt NULL}, 0)$ ! 391: to denote the root node of ! 392: the subject tree. ! 393: \item ! 394: {\tt mtValue$(r)$} returns the integer associated with $r$ ! 395: (See section~\ref{TreePatterns}). ! 396: \item ! 397: {\tt mtSetNodes$(r,n,c)$} replaces the $n$th child of $r$ with ! 398: $c$. This routine may be used to replace whole subtrees with others. ! 399: {\tt mtSetNodes}$({\tt NULL},0,c)$ is used by \twigcomp{} to set the ! 400: root of the subject tree to $c$. ! 401: \end{itemize} ! 402: ! 403: \subsection{Prologue and Inserts} ! 404: ! 405: \begin{verse} ! 406: {\tt prologue} $ccode$; ! 407: \end{verse} ! 408: ! 409: \noindent signals to \twigcomp{} that $ccode$ should be inserted at the ! 410: beginning of the output source file. $Ccode$ should contain ! 411: definitions relevant to the C code in ! 412: rules that are defined later in the \twiglang{} source file. ! 413: ! 414: \begin{verse} ! 415: {\tt insert} $ccode$; ! 416: \end{verse} ! 417: ! 418: \noindent causes $ccode$ to be placed into the source file. ! 419: There can be multiple inserts and they will be placed into the source ! 420: file ! 421: in the order that they appear. ! 422: ! 423: \subsection{Declarations} ! 424: ! 425: All \twiglang{} identifiers are declared before they are used. ! 426: ! 427: \begin{verse} ! 428: {\tt node $id[(int_1)] [= int_2] ...$;\hfil} ! 429: \end{verse} ! 430: ! 431: \noindent A node declaration declares one or more identifier to be ! 432: associated with nodes of the subject tree. The identifiers are ! 433: assigned numbers by \twigcomp{} but the user can overide the assigned number ! 434: by specifying $int_2$. The degree of the node identifer, the ! 435: number of children, is assumed to be fixed. Normally, \twigcomp{} will ! 436: deduce the degree when $id$ is used in a rule. However, the user may ! 437: explicitly give the degree by specifying $int_1$. ! 438: ! 439: \begin{verse} ! 440: {\tt label $id ...$;} ! 441: \end{verse} ! 442: ! 443: A label declaration indicates that the following $id$'s are to be used ! 444: as labels of rules. ! 445: ! 446: \subsection{Costs and Action code} ! 447: ! 448: Code fragments such as the $cost$ and $action$ clauses of a rule are ! 449: specfied by enclosing C code in braces. All legal C constructs are ! 450: permitted in code fragments. In addition, the following are provided ! 451: for convenient access to the subject tree and internal data ! 452: structures of the matcher. ! 453: ! 454: \begin{itemize} ! 455: \item $\$\%n\$$ denotes a pointer value to the matcher data ! 456: structure for the $n$th labelled leaf. Thus to access the cost value ! 457: associated with that leaf, the notation $\$\%n\$\to${\tt cost} may be used. ! 458: ! 459: \item $\$\$$ denotes a pointer value to the root of the subject ! 460: tree. ! 461: ! 462: \item $\$n_1.n_2.n_3.\, \hbox{...}\,.n_{k-1}.n_k\$$ denotes ! 463: a pointer value to the $n_k$th child of the $n_{k-1}th$ child of the ! 464: $n_{k-2}$th child of \hbox{...} of the $n_1$th child of the root of ! 465: the subject tree. Each $n_i$ is a positive non-zero integer. ! 466: \end{itemize} ! 467: ! 468: Some special constructs are available to code fragments in the cost ! 469: part of the specification. The ! 470: statement ``{\tt ABORT;}'' when encountered during the execution of ! 471: the cost code, signals to the matcher that this pattern is to be ! 472: rejected. When a ``{\tt REWRITE;}'' statement is encountered, control ! 473: is returned to the matcher and the rule will become a {\sl rewrite} ! 474: mode match. When the end of the cost code fragment is reached, ! 475: control is returned to the matcher and the rule becomes a {\sl defer} ! 476: mode match. ! 477: The statement ``{\tt TOPDOWN;}'' is like ``{\tt REWRITE;}'' except that ! 478: the mode of the match becomes {\sl topdown}. The meanings of these modes ! 479: will be explained in Section~\ref{pattern}. ! 480: ! 481: Cost values are returned to the matcher by assigning to ! 482: the ``{\tt cost}'' variable in the cost clause. If no assignment is ! 483: made to the {\tt cost} variable then the returned cost will be {\tt ! 484: DEFAULT\_COST}. ! 485: ! 486: Action clauses are expected to return a new tree which will replace ! 487: the subject tree. This is done by returning using the ! 488: ``{\tt return($new\_tree$);}'' statement. ! 489: where {\tt $new\_tree$} is of type ! 490: {\tt NODEPTR}. ! 491: If execution reaches the end of the action clause, the ! 492: matcher resumes execution and the subject tree is not modified. ! 493: ! 494: \section{Pattern Matcher Operation} ! 495: \label{pattern} ! 496: ! 497: The ! 498: pattern matcher operates in two phases: the costing phase and the execution ! 499: phase. ! 500: During ! 501: the costing phase, a minimal cost covering of the subject tree is found. ! 502: The execution phase invokes the actions that are associated with the patterns ! 503: making up the covering. Execution phases may begin during the costing phase ! 504: to execute the covering of a subtree as described in Section~\ref{exec:phase}. ! 505: ! 506: \subsection{The Costing Phase} ! 507: \label{cost:phase} ! 508: ! 509: Let the patterns of the \twiglang{} specifications be interpreted as grammar ! 510: productions with {\sl label\_id}s as non-terminals and {\sl node\_id}s ! 511: as terminals. ! 512: If we add a production ! 513: $S\,\to\,\hbox{\sl label\_id}$ for all {\sl label\_id}s where $S$ is the ! 514: start symbol, ! 515: then a covering for a tree is analogous to a derivation ! 516: for the prefixed form of the tree. ! 517: Figure~\ref{gram:interp} shows some patterns and its corresponding grammar. ! 518: \begin{figure} ! 519: \begin{center} ! 520: \begin{verse} ! 521: \tt ! 522: e: plus(e,e)\\ ! 523: e: plus(e, plus(e,e))\\ ! 524: e: id\\ ! 525: e: const\\ ! 526: \end{verse} ! 527: \end{center} ! 528: \vspace{0.3in} ! 529: \begin{center}\begin{verse} ! 530: $S\to e$\\ ! 531: $e\to plus(e,e)$\\ ! 532: $e\to plus(e,plus(e,e)$\\ ! 533: $e\to id$\\ ! 534: $e\to const$\\ ! 535: \end{verse}\end{center} ! 536: \caption{Pattern and Grammar} ! 537: \label{gram:interp} ! 538: \end{figure} ! 539: The productions of the form $S\to\hbox{\sl label\_id}$ reflect the fact that ! 540: any {\sl label\_id} may start a cover. ! 541: ! 542: The matcher finds the least cost cover by doing a preorder traversal of ! 543: the subject tree. ! 544: At the same time, it ! 545: builds a skeleton tree that is structurally isomorphic to the subject tree. ! 546: When a match is discovered the cost clause of the pattern is invoked to ! 547: calculate the cost. Many patterns with different labels could match at any ! 548: given node but only the lowest cost pattern for each label is recorded in ! 549: the skeleton. ! 550: ! 551: When a pattern is matched, its label is then used as input to the pattern ! 552: matcher so that matching of patterns with labelled leaves can begin. ! 553: This process ! 554: is analogous to a reduce action in ! 555: bottom up parsers. ! 556: ! 557: \subsection{The Execution Phase} ! 558: \label{exec:phase} ! 559: ! 560: The execution phase starts when the costing phase is complete or when ! 561: a sufficiently low cost {\sl rewrite} mode rule is encountered. ! 562: Let $M$ be a matching ! 563: rule and $L_1, L_2, L_3, ... ,L_n$ be ! 564: the labelled leaves of the matching pattern. ! 565: The following procedure is used to execute the action parts of $M$. ! 566: ! 567: \vbox{ ! 568: \vspace{2ex} ! 569: \noindent {\bf Procedure \it Execute} ! 570: ! 571: \begin{itemize} ! 572: \item If $M$ is a {\sl deferred} mode match then ! 573: execution occurs by first applying {\it Execute} to the $L_i$s ! 574: from left to right. That is, in the order that they appear in ! 575: the prefix form of the tree. ! 576: When all the $L_i$s are executed, the action part of $M$ is ! 577: invoked. ! 578: ! 579: \item If $M$ is a {\sl rewrite } mode match then ! 580: just the action part of $M$ is executed. When the action part returns ! 581: the matcher deletes the skeleton corresponding to the subtree that ! 582: $M$ matched and rescans the new subtree that may have been substituted ! 583: by executing $M$. ! 584: ! 585: \item If $M$ is a {\sl topdown} mode match then only ! 586: the action part of $M$ is executed. To execute a ! 587: labelled leaves, $L_i$, $M$'s action part may do so explicitly ! 588: by using the {\tt tDO($\$\%i\$$)} macro. This allows ! 589: the user to arbitrary order the execution of the labelled leaves. ! 590: \end{itemize} ! 591: } ! 592: ! 593: \noindent When the costing phase is over, the execution phase is started ! 594: by picking the lowest cost match at the root of the subject tree. ! 595: This will be the root of the minimum cost cover. ! 596: The match is executed as described above. ! 597: ! 598: The execution phase may begin before the costing phase is over. ! 599: This occurs when a rewrite rule matches and its cost is lower than ! 600: all other matches at that subtree. In this situation the rewrite rule ! 601: is executed immediately and the new rewritten subtree is scanned once more. ! 602: The prescence of rewrite rules throws a wrench in the theoretical niceties ! 603: of the matching algorithm. For example, there is now no guarantee ! 604: that the algorithm will terminate because the tree can ! 605: be repeatedly enlarged. ! 606: Rewrite rules can be dangerous. They can ! 607: be triggered unintentionally if the user is not careful to abort them ! 608: in situations where they are not wanted. ! 609: However, if used with care they can help to reduce the size of the \twiglang{} ! 610: specification. ! 611: ! 612: \section{Some Examples} ! 613: ! 614: \subsection{Expression Trees to Prefix} ! 615: A \twiglang{} program to print out the prefix form of expression ! 616: parse trees ! 617: is shown in Figure~\ref{trees:prefix}. ! 618: ! 619: \begin{itemize} ! 620: ! 621: \item The rules do not have cost calculations. ! 622: Since there are no ! 623: ambiguities costs are not necessary. ! 624: ! 625: \item The second rule is a topdown mode rule. This is essential in printing ! 626: the prefix form. If it was omitted the postfix form would be printed. ! 627: ! 628: \item Before any matching can begin {\tt \_matchinit} must be called. ! 629: Trees can then be processed by calling {\tt \_match} after arranging that ! 630: the call ! 631: {\tt mtGetNodes}$({\tt NULL},0)$ returns the root of the subject tree. ! 632: \end{itemize} ! 633: ! 634: \begin{figure}[tp] ! 635: \begin{tabbing} ! 636: \hspace{1in}\=\hspace{1in}\=\hspace{1in}\=\hspace{1in}\= ! 637: \hspace{1in}\=\hspace{1in}\=\hspace{1in}\=\hspace{1in}\=\kill ! 638: prologue \> $\{$\\[4pt] ! 639: typedef \> struct node $*$\>\>{\tt NODEPTR};\\ ! 640: \#define \> {\tt COST} \>\> int\\ ! 641: \#define \> {\tt INFINITY}\>\> 100000\\ ! 642: \#define \> {\tt DEFAULT\_COST} \>\> 0\\[4pt] ! 643: {\tt NODEPTR} Root;\\ ! 644: struct code \> $\{$\\ ! 645: \> char op; \> \> $/*$ null if node is a leaf $*/$\\ ! 646: \> NODEPTR left, right;\\ ! 647: \> char $*$id;\\ ! 648: $\}$;\\[4pt] ! 649: $\}$;\\[6pt] ! 650: node \> nOp nIdent;\\ ! 651: label \> lExpr;\\[6pt] ! 652: lExpr:\> nIdent\\ ! 653: \>= $\{$ printf(``$\%$s'', id); $\}$;\\[4pt] ! 654: lExpr:\>nOp(lExpr,lExpr)\\ ! 655: \>$\{$ {\tt TOPDOWN};$\}$\\ ! 656: \>= $\{$\\ ! 657: \>\>putchar($\$\$\to$op);\\ ! 658: \>\>tDO($\$\%1\$$);\\ ! 659: \>\>tDO($\$\%2\$$);\\ ! 660: \>$\}$;\\[6pt] ! 661: insert\>$\{$\\[4pt] ! 662: mtValue(root)\\ ! 663: \>{\tt NODEPTR} root;\\ ! 664: $\{$\\ ! 665: \>if(root$\to$op==0)\\ ! 666: \>\>return(nIdent);\\ ! 667: \>else\\ ! 668: \>\>return(nOp);\\ ! 669: $\}$\\[4pt] ! 670: \end{tabbing} ! 671: \caption{Printing Expression trees to Prefix form} ! 672: \end{figure} ! 673: \addtocounter{figure}{-1} ! 674: \begin{figure}[tp] ! 675: \begin{tabbing} ! 676: \hspace{1in}\=\hspace{1in}\=\hspace{1in}\=\hspace{1in}\= ! 677: \hspace{1in}\=\hspace{1in}\=\hspace{1in}\=\hspace{1in}\=\kill ! 678: mtGetNodes(r,n)\\ ! 679: \>{\tt NODEPTR} r;\\ ! 680: \>int n;\\ ! 681: $\{$\\ ! 682: \>if(n==1)\\ ! 683: \>\>return(r$\to$left);\\ ! 684: \>else if(n==1)\\ ! 685: \>\>return(r$\to$right);\\ ! 686: \>else if(n==0 $\&\&$ r=={\tt NULL})\\ ! 687: \>\>return(Root);\\ ! 688: \>else return({\tt NULL});\\ ! 689: $\}$\\[4pt] ! 690: mtSetNodes(r,n, c)\\ ! 691: \>{\tt NODEPTR} r, c;\\ ! 692: \>int n;\\ ! 693: $\{$\\ ! 694: \>if(n==1)\\ ! 695: \>\>r$\to$left = c;\\ ! 696: \>else if(n==2)\\ ! 697: \>\>r$\to$right = c;\\ ! 698: \>else if(n==0)\\ ! 699: \>\>Root = c;\\ ! 700: $\}$\\[4pt] ! 701: main(argc,argv)\>\>\>$/*$ called by user only $*/$\\ ! 702: \>char $**$argv;\\ ! 703: $\{$\\ ! 704: \>\_matchinit();\>\>$/*$ initialize the matcher $*/$\\ ! 705: \>$...$ {\sl get a tree and set Root to it}\\ ! 706: \>\_match();\>\>$/*$ do the match $*/$\\ ! 707: $\}$\\ ! 708: $\}$\\ ! 709: \end{tabbing} ! 710: \caption{(cont.) Printing Expression trees to Prefix form} ! 711: \label{trees:prefix} ! 712: \end{figure} ! 713: ! 714: \subsection{Short Circuited Boolean Evaluation} ! 715: ! 716: Short circuited boolean evaluation is naturally topdown. ! 717: A fragment of \twiglang{} program that implements this ! 718: is shown in Figure~\ref{short:circuit}. ! 719: ! 720: \begin{itemize} ! 721: ! 722: \item Labels for true and false branches are passed down to ! 723: rules below by putting them into the nodes. Another way of ! 724: accomplishing this is to use an auxillary argument stack. ! 725: ! 726: \item The rules assume that every test generates a branch to both ! 727: the false and true label. On conventional machines, extra branches ! 728: can be eliminated by recognizing that one could generate code that ! 729: falls through to its true or false labels. This can be done by ! 730: separating {\sl lTest} into {\sl lTrueTest} and {\sl lFalseTest} which ! 731: branches only when true and false respectively. Rules can then be ! 732: written to take advantage of this. ! 733: \end{itemize} ! 734: ! 735: \begin{figure} ! 736: \begin{tabbing} ! 737: \hspace{1in}\=\hspace{1in}\=\hspace{1in}\=\hspace{1in}\= ! 738: \hspace{1in}\=\hspace{1in}\=\hspace{1in}\=\hspace{1in}\=\kill ! 739: prologue \> $\{$\\[4pt] ! 740: \#include\>``otherdefs.h''\\ ! 741: union node $\{$\\ ! 742: \> $...$ \>\>$/*$ definition of other node types $*/$\\ ! 743: \> union $\{$\\ ! 744: \>\> int operation;\\ ! 745: \>\> {\tt NODEPTR} left, right;\\ ! 746: \>\> {\tt LABEL} falselab, truelab;\>\>$/*$ {\tt LABEL} will defined ! 747: elsewhere $*/$\\ ! 748: \>\> $...$ \>\> $/*$ other definitions relevant to {\sl test nodes} $*/$\\ ! 749: \> $\}$ test; ! 750: \> $...$ \>\>$/*$ definition of other node types $*/$\\ ! 751: $\}$;\\[4pt] ! 752: $\}$;\\[6pt] ! 753: node\>nAnd nOr nNot nLess;\\ ! 754: label\>lTest lExpr;\\[6pt] ! 755: lTest:\>nAnd(lTest,lTest)\\ ! 756: \>$\{$ {\tt TOPDOWN}; $\}$\\ ! 757: \>=$\{$\\ ! 758: \>\>$\$1\$\to$test.falselab = $\$\$\to$test.falselab;\\ ! 759: \>\>$\$1\$\to$test.truelab = getlabel();\\ ! 760: \>\>tDO($\$\%1\$$);\\ ! 761: \>\>printlab($\$1\$\to$test.truelab);\\ ! 762: \>\>$\$2\$\to$test.falselab = $\$\$\to$test.falselab;\\ ! 763: \>\>$\$2\$\to$test.truelab = $\$\$\to$test.truelab;\\ ! 764: \>\>tDO($\$\%2\$$);\\ ! 765: \>$\}$;\\[4pt] ! 766: lTest:\>nOr(lTest,lTest)\\ ! 767: \>$\{$ {\tt TOPDOWN}; $\}$\\ ! 768: \>=$\{$\\ ! 769: \>\>$\$1\$\to$test.falselab = getlabel();\\ ! 770: \>\>$\$1\$\to$test.truelab = $\$\$\to$test.truelab;\\ ! 771: \>\>tDO($\$\%1\$$);\\ ! 772: \>\>$\$2\$\to$test.truelab = $\$\$\to$test.truelab;\\ ! 773: \>\>$\$2\$\to$test.falselab = $\$\$\to$test.falselab;\\ ! 774: \>\>tDO($\$\%2\$$);\\ ! 775: \>$\}$;\\ ! 776: \end{tabbing} ! 777: \caption{Short circuited Boolean Evaluation} ! 778: \label{short:circuit} ! 779: \end{figure} ! 780: \addtocounter{figure}{-1} ! 781: \begin{figure}[tp] ! 782: \begin{tabbing} ! 783: \hspace{1in}\=\hspace{1in}\=\hspace{1in}\=\hspace{1in}\= ! 784: \hspace{1in}\=\hspace{1in}\=\hspace{1in}\=\hspace{1in}\=\kill ! 785: lTest:\>nNot(lTest)\\ ! 786: \>$\{$ {\tt TOPDOWN}; $\}$\\ ! 787: \>=$\{$\\ ! 788: \>\>$\$1\$\to$test.truelab = $\$\$\to$test.falselab;\\ ! 789: \>\>$\$1\$\to$test.falselab = $\$\$\to$test.truelab;\\ ! 790: \>\>tDO($\$\%1\$$);\\ ! 791: \>$\}$;\\[4pt] ! 792: lTest:\>nLess(lExpr,lExpr)\>\>$/*$ {\sl not topdown anymore } $*/$\\ ! 793: \>=$\{$ $...$ {\sl generate code for comparison } $...$ $\}$;\\ ! 794: \end{tabbing} ! 795: \caption{(cont.) Short circuited Boolean Evaluation} ! 796: \end{figure} ! 797: ! 798: \subsection{Code Generation} ! 799: ! 800: Figure~\ref{example} is an ! 801: example of \twiglang{} program that ! 802: can be used to generate VAX code for ! 803: the subtract instruction: ! 804: ! 805: \begin{figure} ! 806: \begin{tabbing} ! 807: \hspace{1in}\=\hspace{1in}\=\hspace{1in}\=\hspace{1in}\= ! 808: \hspace{1in}\=\hspace{1in}\=\hspace{1in}\=\hspace{1in}\=\kill ! 809: prologue \> $\{$\\ ! 810: $/*$ otherdefs.h will have a type definition for {\tt NODEPTR} $*/$\\ ! 811: \#include\>``otherdefs.h''\\ ! 812: \#define\>{\tt TEMP\_COST}\>\>5\\ ! 813: \#define\>{\tt SUB\_COST}\>\>30\\ ! 814: \#define\>{\tt DEC\_COST}\>\>10\\ ! 815: NODEPTR \> gettemp();\\ ! 816: $\}$;\\[4pt] ! 817: node \> long constant sub;\\ ! 818: label \> operand temp;\\[4pt] ! 819: operand: \> long;\>\>\>$/*$ rule 1 $*/$\\ ! 820: operand:\>constant;\>\>\>$/*$ rule 2 $*/$\\ ! 821: operand:\>temp;\>\>\>$/*$ rule 3 $*/$\\[4pt] ! 822: temp:\>operand\>\>\>$/*$ rule 4 $*/$\\ ! 823: \>$\{$ cost = {\tt TEMP\_COST}+$\$\%1\$\to$cost;$\}$\\ ! 824: \>$=\{$\\ ! 825: \>\>NODEPTR t = gettemp();\\ ! 826: \>\> emit(``mov'', $\$\$$, t, 0);\\ ! 827: \>\>return(t);\\ ! 828: \>$\}$;\\[4pt] ! 829: operand:\>sub(operand,operand)\>\>\>$/*$ rule 5 $*/$\\ ! 830: \>$\{$cost = $\$\%1\$\to$cost + $\$\%2\$\to$cost+{\tt SUB\_COST};$\}$\\ ! 831: \>=$\{$\\ ! 832: \>\>NODEPTR t = gettemp();\\ ! 833: \>\>emit(``sub'', $\$1\$$, $\$2\$$, t, 0);\\ ! 834: \>\>return(t);\\ ! 835: \>$\}$;\\[4pt] ! 836: temp:\>sub(temp,constant)\>\>\>$/*$ rule 6 $*/$\\ ! 837: \>$\{$\\ ! 838: \>\>if(value($\$2\$$)==1)\\ ! 839: \>\>\>cost = $\$\%1\$\to$cost+{\tt DEC\_COST};\\ ! 840: \>\>else ABORT;\\ ! 841: \>$\}$=$\{$\\ ! 842: \>\>emit(``dec'', $\$1\$$, 0);\\ ! 843: \>\>return($\$1\$$);\\ ! 844: \>$\}$; ! 845: \end{tabbing} ! 846: \caption{Small Code Generation Example} ! 847: \label{example} ! 848: \end{figure} ! 849: ! 850: \begin{itemize} ! 851: \item Rules 3 and 4 form a loop. ! 852: The potential loop: temp$\to$operand$\to$temp$\to$operand$...$ is ! 853: broken ! 854: by the matcher recognizing that the cost of the second match of temp ! 855: does not cost less than the first match of temp. ! 856: ! 857: \item In the cost clause of rule 5, the cost is the sum of the ! 858: leaves plus the cost of the subtract instruction. ! 859: The action clause emits code to add the two operands ! 860: and leave the result in a temporary location. The temporary is ! 861: returned as a substitution for the subject tree. ! 862: ! 863: \item Rule 6 handles a special case where the left operand is ! 864: already in a temporary and the constant is one. In this case, the ! 865: temporary is directly decremented and returned as the new tree. ! 866: \end{itemize} ! 867: ! 868: \section{How to run \Twigcomp{}} ! 869: ! 870: Once a user has written the specification, \twigcomp{} is used to compile it ! 871: into a subroutine. ! 872: Figure~\ref{overview} gives an overview of how \twigcomp{} does this. ! 873: In Figure~\ref{overview}, rounded boxes indicate files and sharp ! 874: cornered boxes ! 875: are processing programs. The user creates the \twiglang{} specification which ! 876: is represented by the top box. The specification file must have the ! 877: suffix ``{\tt .mt}''. Let's say we have one called {\tt arbor.mt}. ! 878: To invoke the \twigcomp{} compiler we type: ! 879: ! 880: \begin{verse} ! 881: {\bf twig} [--{\bf w$xx$}] arbor.mt ! 882: \end{verse} ! 883: ! 884: The proprocessor reads {\tt arbor.mt} and if there are no errors ! 885: will create two files: {\tt symbols.h} and {\tt walker.c}. ! 886: To build {\tt walker.c}, \twigcomp{} uses a template file called {\tt walker.$xx$} ! 887: where $xx$ is the argument of the optional --{\bf w} flag. ! 888: If the flag is omitted then $xx$ defaults to ``{\tt c1}''. ! 889: ! 890: {\tt Symbols.h} and {\tt walker.c} ! 891: can then compiled and linked with the other modules of the ! 892: user's program. These other modules should provide the ! 893: Tree and Cost ADTs described above. ! 894: {\tt Walker.h} is an auxilliary include file that is referenced by { ! 895: \tt walker.c}. ! 896: A typical compile command is: ! 897: ! 898: \begin{verse} ! 899: {\bf cc} -- {\bf I\sl include\_dir} {\bf -c} walker.c ! 900: \end{verse} ! 901: ! 902: The --{\bf I\sl include\_dir} argument causes {\bf cc} to look for ! 903: walker.h in {\sl include\_dir}. The exact value of {\sl include\_dir} ! 904: will depend on where the files are stored at your site. ! 905: The tree matcher is initialized by calling {\tt \_matchinit} and matching ! 906: can begin by calling {\tt \_match} from ! 907: the user program. ! 908: ! 909: \section{Organization of the \Twigcomp{} Compiler} ! 910: The preprocessor is written completely in C and YACC. ! 911: It can be roughly broken up into four modules: ! 912: \begin{itemize} ! 913: \item Lexer --- The lexer performs lexical analysis. ! 914: The input is tokenized by this module and passed to the parser ! 915: as required. Identifiers are looked up in the symbol table ! 916: and space is reserved for them if they have never been encountered. ! 917: \item Parser --- This part is written in YACC and recognizes the ! 918: \twiglang{} language. When a language construct is recognized actions ! 919: are invoked. Declarations, prologues and inserts will invoke symbol ! 920: table operations when recognized. Rules are broken up and stored ! 921: in the symbol table too. The tree patterns in the rules are unflattened ! 922: and passed to the machine builder. ! 923: \item Machine Builder --- This module takes tree patterns and incrementally ! 924: builds an internal representation of the matching automaton. ! 925: The builder takes this ! 926: tree and then enumerates each of the possible paths from the root to the ! 927: leaves. These paths are treated as strings and a string matching automaton ! 928: is built as described in \cite{fgrep} and \cite{treematch}. ! 929: ! 930: Inside the builder the partially built machine is kept in a linked list ! 931: form. Each machine state is a linked list and each node represents ! 932: a state transition. When the tables for the machine are written out into { ! 933: \tt walker.c}, the linked list structure is ! 934: translated into a table of sixteen bit integers. ! 935: Each transistion is stored as two integers. The first is ! 936: the integer corresponding to the symbol that causes the transistion and the ! 937: second is the index of the next state in the table. ! 938: ! 939: \item Symbol Table --- The central data structure in this symbol table is ! 940: the hash table. ! 941: Each entry in the hash table is a bucket --- a linked list of symbol table ! 942: entries. There is no rehashing. All colliding items are placed ! 943: in the linked list. ! 944: This is a simple ! 945: and adequately efficient arrangement. Depending on the type of symbols, the ! 946: entry may point to other data structures. Entries for $label\_id$s point to ! 947: lists of trees. $Node\_id$ entries record the integers that have been ! 948: associates with them. Other entries may point to data structures holding a ! 949: textual representation of C code that forms action and cost parts. ! 950: \end{itemize} ! 951: ! 952: Here is a list of the files that form the \twigcomp{} compiler's source ! 953: and their approximate function. ! 954: \begin{itemize} ! 955: \item {\tt common.h} --- This file contains data definitions for how ! 956: trees are represented in the parser. It also contains type definitions for ! 957: external functions. ! 958: \item {\tt code.h} --- This file contains definitions for how code fragments ! 959: are stored. ! 960: \item {\tt sym.h} --- This file defines the major data structures in the ! 961: symbol table. ! 962: \item {\tt machine.h} --- This file defines the how the matching automaton ! 963: is represented internally. ! 964: \item {\tt twig.y} --- This is the parser. ! 965: \item {\tt sym.c} --- This is the symbol table manager. ! 966: \item {\tt path.c} --- The path string enumerator. ! 967: \item {\tt machine.c} --- The machine builder. ! 968: \item {\tt lex.c } --- The lexer. ! 969: \item {\tt tree.c}, {\tt io.c}, and {\tt code.c} --- Miscellaneous ! 970: routines for manipulating trees, performing I/O and manipulating ! 971: code fragments. ! 972: \end{itemize} ! 973: ! 974: \section{Performance} ! 975: ! 976: So far, the only experience we have with \twiglang{} ! 977: is a VAX code generator written ! 978: for the pcc2 compiler. The \twiglang{} code generator is 25\% faster than ! 979: the original code generator. ! 980: The maintainability and modifiability of the code generator has ! 981: improved. For example adding the indexed addressing mode ! 982: into the code generator took only a few hours. The target code quality ! 983: is just as good as the original code generator. ! 984: ! 985: The \twigcomp{} table ! 986: generation algorithm is fast. For the VAX machine description which ! 987: consist of 109 rules, the generation time was 4.2 seconds on a 780. ! 988: The generation time could be increased ! 989: by two orders of magnitude ! 990: before other table driven system can compete with \twigcomp{} on this basis. ! 991: Furthermore, the tables are small. The VAX description is only 7.5K ! 992: and the text space for the walker is 30K. ! 993: Again this is much smaller than those of other table driven systems. ! 994: ! 995: \Twiglang{} is currently a research tool. Several things can be done to improve ! 996: \twiglang{}'s performance. ! 997: \begin{itemize} ! 998: \item \Twiglang{} does a lot of procedure calls. Every machine transition ! 999: require at least one procedure call. Contrast this to YACC where a machine ! 1000: transition is done in line. ! 1001: \item \Twiglang{} performs a very expensive closure operation with respect to ! 1002: unit rules. A unit rule is the analogue of a unit production and has the ! 1003: form: ! 1004: ! 1005: \begin{verse} ! 1006: $label_1$: $label_2$; ! 1007: \end{verse} ! 1008: ! 1009: \noindent This closure is done at run time. ! 1010: It requires many procedure calls and manipulates complex data structures. ! 1011: We are looking at ways of doing this ! 1012: at table generation time or to hash the results of the closures so that ! 1013: redundant calculations can be avoided. The problem with doing them ! 1014: in the table generator is the possible explosion in its ! 1015: running time. ! 1016: ! 1017: \item Many of the cost parts for the rules are similar and hence some ! 1018: tests are performed and recalculated many times ! 1019: on a node. \Twigcomp{} should be clever ! 1020: enough to factor out some of these tests and do them only once. ! 1021: However, the information required for \twigcomp{} to do this is not easily ! 1022: available. It is hidden in the C code fragments. ! 1023: ! 1024: \item The current compact ! 1025: representation of the matching automaton ! 1026: require linear searching except at the ! 1027: start state where the large branching factor compelled us to use an ! 1028: array scheme. Using another representation would speed up the ! 1029: matcher. For example, implementing the VAX description as an array is ! 1030: estimated to use about 50K bytes but may provide ! 1031: performance improvement. ! 1032: ! 1033: \end{itemize} ! 1034: \bibliography{mybib} ! 1035: \bibliographystyle{plain} ! 1036: \end{document}
This archive runs on limited infrastructure. Preserving old code on modern bandwidth. Automated agents are requested to crawl responsibly.