|
|
1.1 ! root 1: \documentstyle[11pt,titlepage]{article} ! 2: % style ! 3: \setlength{\textwidth}{15cm} ! 4: \setlength{\oddsidemargin}{4.6mm} % 30 mm vasen marginaali ! 5: \setlength{\evensidemargin}{\oddsidemargin} ! 6: \setlength{\textheight}{22cm} ! 7: \setlength{\topmargin}{-8mm} ! 8: \newcommand{\mltwig}{ML-Twig } ! 9: ! 10: \title{Preliminary Version of \mltwig User Manual} ! 11: \author{Jussi Rintanen \\ Helsinki University of Technology} ! 12: \date{June 10, 1989} ! 13: ! 14: \begin{document} ! 15: \maketitle ! 16: \tableofcontents ! 17: \newpage ! 18: \section{Introduction} ! 19: ! 20: \mltwig is the Standard ML counterpart of Twig ! 21: \cite{ahoganapathitjiang,tjiang}, a general tree manipulation program ! 22: implemented in C by S. W. K. Tjiang, which can be used for various purposes, ! 23: especially for writing code generators \cite{appel}. ! 24: \mltwig is not a clone of Twig, but a similar tool adjusted to fit ! 25: the Standard ML environment. ! 26: The user writes a specification of a tree manipulation program, ! 27: which is then compiled by \mltwig to an ordinary Standard ML program. ! 28: ! 29: A specification defines a representation for the trees to be manipulated, ! 30: and a set of tree translation rules called a tree translation scheme. ! 31: The rules of the tree translation scheme can be seen as tree rewriting ! 32: rules, by means of which a tree can be reduced bottom-up to a single node. ! 33: Often, a tree translation scheme is redundant, and the reduction can ! 34: proceed using different rules. This redundancy can be taken ! 35: advantage of by associating a cost with each rule. The reduction chosen ! 36: by \mltwig is the cheapest. In the determination of the cheapest ! 37: reduction, a tree pattern matching algorithm and a dynamic programming ! 38: algorithm are used \cite{ahojohnson,hoffmannodonnell}. ! 39: ! 40: During the tree reduction process, the program produced by \mltwig maps ! 41: the user tree to a result value. The result value may be a new tree of ! 42: the same type as the original tree, or more generally, any Standard ML value. ! 43: A behaviour facilitating side-effects is also supported by means of ! 44: a top-down execution mechanism, which is similar to that of Twig. ! 45: The efficiency of \mltwig compiler is acceptable, but the run-time system ! 46: could be more efficient. Therefore we currently regard \mltwig as ! 47: an experimental tool. ! 48: ! 49: \section{\mltwig Specifications} ! 50: \subsection{Syntax} ! 51: ! 52: The syntax of \mltwig specifications is presented below. ! 53: ! 54: \newenvironment{grammar}{\goodbreak\medskip\obeylines\parskip=1pt}{\medskip\goodbreak} ! 55: \newcommand{\NT}[1]{ {\em #1} } ! 56: \newcommand{\TE}[1]{ {\tt #1} } ! 57: \newcommand{\RW}[1]{ {\tt\bf #1} } ! 58: \newcommand{\IS}{ $\Rightarrow$ } ! 59: \newcommand{\OR}{ $|$ } ! 60: \newcommand{\EMPTY}{ $\epsilon$ } ! 61: \newcommand{\TAB}{ \hskip8mm } ! 62: \newcommand{\VTAB}{ \vskip3mm } ! 63: \newcommand{\LPA}{ {\bf (} } ! 64: \newcommand{\RPA}{ {\bf )} } ! 65: \newcommand{\COM}{ {\bf ,} } ! 66: \newcommand{\RBR}{ ${\bf ]}$ } ! 67: \newcommand{\LBR}{ ${\bf [}$ } ! 68: \newcommand{\SEC}{ {\bf ;} } ! 69: \newcommand{\COL}{ {\bf :} } ! 70: \newcommand{\EQ} { {\bf =} } ! 71: ! 72: \begin{grammar} ! 73: \VTAB ! 74: \NT{specification}\IS\NT{head}\NT{body} ! 75: ! 76: \NT{head}\IS\NT{head-declaration}\OR\NT{head-declaration}\NT{head} ! 77: ! 78: \NT{head-declaration}\IS\RW{prologue}\LPA\NT{ml-code}\RPA\SEC ! 79: \TAB\OR\RW{default\_cost}\LPA \NT{ml-code}\RPA \SEC ! 80: \TAB\OR\RW{structure\_name}\TE{id} \SEC ! 81: \TAB\OR\RW{label}\NT{label-list}\SEC ! 82: \TAB\OR\RW{node}\NT{node-list}\SEC ! 83: ! 84: \NT{body}\IS\NT{bodyd}\OR\NT{bodyd}\NT{body} ! 85: ! 86: \NT{bodyd}\IS\NT{rule}\OR\RW{REWRITE}\NT{rule}\OR\RW{TOPDOWN}\NT{rule}\OR\RW{insert}\LPA\NT{ml-code}\RPA\SEC ! 87: ! 88: \VTAB ! 89: ! 90: \NT{node-list}\IS\NT{node}\OR\NT{node-list}\COM\NT{node} ! 91: ! 92: \NT{node}\IS\TE{nodeid}(\TE{int}) ! 93: ! 94: \NT{label-list}\IS\TE{label}\OR\NT{label-list}\TE{label} ! 95: ! 96: \NT{label}\IS\TE{labelid}\RW{of}\NT{ml-code} ! 97: ! 98: \VTAB ! 99: ! 100: \NT{rule}\IS\TE{labelid}\NT{tree-pattern}\NT{cost}\NT{action}\SEC ! 101: ! 102: \VTAB ! 103: ! 104: \NT{cost}\IS\COL(\NT{ml-code})\OR\EMPTY ! 105: ! 106: \NT{action}\IS\EQ(\NT{ml-code}) ! 107: ! 108: \NT{ml-code}\IS\TE{mltoken}\OR\NT{ml-code}\TE{mltoken} ! 109: ! 110: \NT{tree-pattern}\IS\TE{labelid}\OR\TE{nodeid}\OR\TE{nodeid}\LPA\NT{tree-list} \RPA ! 111: ! 112: \NT{tree-list}\IS\NT{tree-pattern}\OR\NT{tree-list}\COM\NT{tree-pattern} ! 113: ! 114: \VTAB ! 115: \end{grammar} ! 116: ! 117: The prologue, the node and label declarations, and the rules ! 118: are obligatory. The prologue must precede the inserts. ! 119: The node and label declarations must precede the rules. ! 120: ! 121: White space, new lines and comments can be used in all parts of a ! 122: specification. The lexical analyzer of \mltwig is basically the same ! 123: as that of Standard ML. Symbolic identifiers beginning with a \$ ! 124: should be used with caution because of presence of the special tree reference ! 125: tokens, see Section \ref{rules} for details. ! 126: The following words are reserved, and may not be used as label or node symbols. ! 127: ! 128: \begin{verbatim} ! 129: default_cost insert label node prologue ! 130: REWRITE TOPDOWN structure_name ! 131: \end{verbatim} ! 132: ! 133: \subsection{Label and Node Declarations} ! 134: ! 135: \mltwig sees the user tree abstractly as a simple tree, where each node ! 136: and leaf is annotated with a symbol. ! 137: The symbols are declared in the node declaration. ! 138: The declaration begins with the word {\tt node}, which is followed ! 139: by a comma-separated list of node symbols with the arity of the symbol ! 140: in parentheses. The arity of a leaf node is 0. ! 141: The user may not use reserved words of Standard ML as node symbols. ! 142: The type {\tt symbol} is generated by \mltwig from the {\tt node} declaration. ! 143: It contains nullary data constructors named after the node symbols. ! 144: E.g. for the node declaration ! 145: \begin{verbatim} ! 146: node PLUS(2), MINUS(2), MUL(2), DIV(2), CONST(0); ! 147: \end{verbatim} ! 148: the following nullary constructors are available for the user to be ! 149: used by the function {\tt node\_value}, see Section \ref{prologue}. ! 150: \begin{verbatim} ! 151: PLUS MINUS MUL DIV CONST ! 152: \end{verbatim} ! 153: ! 154: During the construction of the optimal cover, matched subtrees are labeled with ! 155: the replacement labels of the matching rules. The replacement ! 156: labels are declared in the label declaration. For each label, there ! 157: is an associated type. The declaration is of the form ! 158: \begin{verbatim} ! 159: label lab1 of ty1 | lab2 of ty2 | ... ; ! 160: \end{verbatim} ! 161: An example label declaration is ! 162: \begin{verbatim} ! 163: label Name of string | Const of int | Fb of foo -> bar list; ! 164: \end{verbatim} ! 165: The expression in the action part of a rule is of the same type as ! 166: the replacement label of the rule. The value of the expression may ! 167: be used in the rules matching at the parent nodes of the user tree. ! 168: ! 169: \subsection{Prologue}\label{prologue} ! 170: In the prologue part the user writes the values for interfacing the ! 171: user tree and cost representations to the \mltwig run time system: ! 172: \begin{verbatim} ! 173: type cost ! 174: val cost_less : cost * cost -> bool ! 175: ! 176: type tree ! 177: val get_subtrees : tree -> tree list ! 178: val node_value : tree -> symbol ! 179: \end{verbatim} ! 180: ! 181: The costs of the rules are represented by the type {\tt cost}. ! 182: The only operation for this type used by \mltwig is ! 183: the comparison function {\tt cost\_less}, which returns true, if ! 184: the first argument cost is smaller than the second. ! 185: ! 186: The type {\tt tree} is the type of the trees to be fed to the tree manipulator ! 187: program produced by \mltwig. The tree processor accesses the tree type ! 188: only with the functions {\tt get\_subtrees} and {\tt node\_value}. ! 189: {\tt Get\_subtrees} returns the list of subtrees of a tree. If there are ! 190: no subtrees, it returns the empty list. {\tt Node\_value} returns the ! 191: symbol associated with the root node of a tree. ! 192: ! 193: \subsection{Default Values} ! 194: A default cost may be defined to be used for the rules without an ! 195: explicit cost expression. ! 196: The syntax of the default cost declaration is ! 197: \begin{verbatim} ! 198: default_cost ( ... ); ! 199: \end{verbatim} ! 200: ! 201: The default cost is represented as a function of type {\tt cost list -> cost}, ! 202: where the argument {\tt cost list} contains the costs of the children. ! 203: The values defined in the prologue and the inserts may be used ! 204: in defining the default cost. ! 205: ! 206: The user may name the structure produced my \mltwig by providing ! 207: a structure name declaration. The syntax is simply ! 208: \begin{quote} ! 209: {\tt structure\_name} {\em id} {\tt ;} ! 210: \end{quote} ! 211: ! 212: \subsection{Inserts and Rules}\label{rules} ! 213: The body of an \mltwig specification consists of a set of rules, ! 214: possibly interleaved with inserts. An insert is a sequence of ! 215: Standard ML code in the same as the prologue. The syntax is similarly ! 216: \begin{verbatim} ! 217: insert ( ... ); ! 218: \end{verbatim} ! 219: There are three kinds of rules in \mltwig: ordinary, rewrite and ! 220: topdown. Rewrite rules are used for replacing a subtree ! 221: with another during the selection of an optimal cover. Topdown ! 222: rules are used in conjunction with side-effects, when it is ! 223: essential to execute the action parts of the parent and the children ! 224: in a certain order. ! 225: ! 226: The three kinds of rules correspond the different match modes of Twig. ! 227: Our scheme is more restricted, but we think it suits better ! 228: to the implementation and parent language Standard ML. ! 229: ! 230: The same syntax is used for all kinds of rules, except that rewrite and ! 231: topdown rules are prefixed with the identifiers {\tt REWRITE} and {\tt TOPDOWN}, ! 232: respectively. The syntax of rules is ! 233: \begin{verbatim} ! 234: replacement tree-pattern { : ( cost-expr ) } = ( action-expr ); ! 235: \end{verbatim} ! 236: ! 237: A rule matches at a node of the user tree, if the root symbol of the ! 238: tree pattern is the same as the symbol in the node of the user tree, ! 239: and the subtrees of the tree pattern match the subtrees of the user tree. ! 240: If the tree pattern is a label, it matches the user tree at a node, ! 241: if a rule with the same label as replacement has matched at the node. ! 242: ! 243: The cost expression is the price of matching a rule. The action expression ! 244: is primarily evaluated for a value, but it could also have side-effects. ! 245: A cover of a user tree is a set of rules, using which the tree can be ! 246: reduced to a single node. ! 247: The most inexpensive cover of a user tree is the cover with the least ! 248: cost rule matching at the root node. ! 249: ! 250: \mltwig selectes the most inexpensive cover, and executes the ! 251: actions associated with the rules constituting the cover. During ! 252: the selection phase, if the least cost rule matching at a node is ! 253: a rewrite rule, the action expression of the rule is evaluated for a new ! 254: tree to be used in place of the original subtree. ! 255: ! 256: \subsubsection{Tree Patterns} ! 257: ! 258: The tree patterns are in prefix notation, i.e. the root symbol ! 259: of a tree pattern is followed by parenthesized subpatterns. ! 260: A leaf node is not followed by parentheses. ! 261: An example of a tree pattern: ! 262: \begin{verbatim} ! 263: OP(PLUS,exp,OP(binop,exp,exp)) ! 264: \end{verbatim} ! 265: Obviously, the {\tt OP} symbol is a node symbol of arity 3. ! 266: The symbols {\tt PLUS}, {\tt binop} and {\tt exp} could be either ! 267: node symbols of arity 0 or labels, that depends on the node and ! 268: label declarations of the specification. ! 269: ! 270: \subsubsection{Costs} ! 271: The cost expression of a rule must be of type {\tt cost}. The costs ! 272: of the rules which matched at the nodes corresponding the labeled ! 273: leafs of the tree pattern may be accessed using identifiers named ! 274: after the following convention. If a label occurs only once ! 275: in the tree pattern, the identifier used to access the cost is the ! 276: label itself. If a label occurs two or more times, the different ! 277: occurences are indexed with a number. For example, in the tree ! 278: pattern (labels are in lower case) ! 279: \begin{verbatim} ! 280: MOVE(reg,OP(binop,exp,exp)) ! 281: \end{verbatim} ! 282: the cost of the rule matching at the node corresponding the {\tt reg} ! 283: node could be accessed with the identifier {\tt reg}, and the rest of ! 284: the labels with the identifiers {\tt binop}, {\tt exp1} and {\tt exp2}, ! 285: in that order. If a name is ambiguous, the user is responsible to rename ! 286: the labels. An example of an ambiguity can be seen in the following ! 287: tree pattern. ! 288: \begin{verbatim} ! 289: N(a,a1,a) ! 290: \end{verbatim} ! 291: The names of the labels would be {\tt a1}, {\tt a1} and {\tt a2}. ! 292: ! 293: All values defined in the prologue and the inserts can also be used. ! 294: A subtree of the current node can be accessed by a special token ! 295: by writing a period-separated list of subtree numbers between a ! 296: pair of \$'s. For example, the third subtree of the second subtree ! 297: of the current node can be accessed using \$2.3\$. As a special case, ! 298: the current node can be accessed using \$\$. The default cost, ! 299: as specified in the default cost declaration, can be referenced ! 300: with the identifier DC. A rule can be rejected by calling ! 301: the function {\tt ABORT} with the argument {\tt ()}. ! 302: ! 303: \subsubsection{Actions} ! 304: There are some values available for all three kinds of rules: ! 305: the values defined in the prologue and in the inserts, ! 306: as well as the tree reference tokens described in the previous section. ! 307: ! 308: The ordinary and topdown rules return a value of the same type as the ! 309: replacement label of the rule. Rewrite rules return a value of type {\tt tree}. ! 310: ! 311: In ordinary rules, the values of the action expressions of ! 312: the rules matching at the nodes corresponding the labeled leafs of ! 313: the tree pattern can be accessed in the same way as costs in the ! 314: cost expression. The types of the values are the types of the labels ! 315: as described in the label declaration. ! 316: ! 317: When evaluating the action expression of a topdown rule, the action ! 318: expression of the children rules have not yet been evaluated. Instead, ! 319: there are available funtions for evaluating the children expressions, ! 320: which the user can apply in any order she wishes. The functions are ! 321: named the same in way as the cost expressions of the children and the ! 322: action expression of the children in ordinary rules, except that they ! 323: are prefixed with the string ''{\tt DO}''. For example, if the ! 324: label {\tt stm} is of type {\tt string list}, then in the action ! 325: expression of a rule with the tree pattern ! 326: \begin{verbatim} ! 327: SEQ(stm,stm) ! 328: \end{verbatim} ! 329: the following functions can be used: ! 330: \begin{verbatim} ! 331: DOstm1 : unit -> string list ! 332: DOstm2 : unit -> string list ! 333: \end{verbatim} ! 334: For rewrite rules, only the common values are available. ! 335: ! 336: \subsubsection{Unit Rules} ! 337: An \mltwig specification often contains rules with a tree pattern ! 338: consisting of just one label, i.e. unit rules. The problem with ! 339: unit rules is that they can form cycles. For example, if rules {\em a b}, ! 340: {\em b c} and {\em c a} were present, and a rule with the replacement ! 341: {\em b} had matched, the tree pattern matcher would match the first ! 342: rule, then the third, the second, the first and so on. ! 343: ! 344: In \mltwig, the unit rule cycles are cut, when the initial label is ! 345: encountered the second time. Another possibility would be to cut, when ! 346: the first matched unit rule is encountered the second time. ! 347: ! 348: \section{How to Run \mltwig and a Program Produced by It} ! 349: The current implementation of \mltwig has been written with ! 350: Standard ML of New Jersey 0.33 in mind. However, it should ! 351: be fairly easy to port it to other implementations of Standard ML. ! 352: ! 353: There are two ways to use \mltwig. The first way is to compile it ! 354: with the batch compiler, and use it ! 355: independently of the interactive system. For this purpose, there is ! 356: a simple ''user interface'', which prints a title and asks the user ! 357: for the name of the specification file. ! 358: The second way is to load \mltwig into the interactive system, and call ! 359: the function {\tt Main.main} with a file name as the argument. ! 360: ! 361: \mltwig reads the specification file, and produces a new file containing ! 362: the tree processor program. The name of the new file is the name of ! 363: the specification file suffixed with ''.sml''. ! 364: The resulting file contains a structure of the form ! 365: \begin{verbatim} ! 366: structure NNN = ! 367: struct ! 368: structure User = ! 369: struct ! 370: ... ! 371: end ! 372: exception NoCover ! 373: exception InternalError of string ! 374: val translate : User.tree -> User.result = ... ! 375: end; ! 376: \end{verbatim} ! 377: The name of the structure ({\tt NNN}) is the name specified in the ! 378: structure\_name declaration, or the default name {\tt TreeProcessor}. ! 379: The substructure {\tt User} contains all the code of the prologue and ! 380: the inserts, and the definition of the type {\tt symbol}. ! 381: The function {\tt translate} is the actual function, which ! 382: is applied to a user tree. If no cover was found, the exception ! 383: {\tt NoCover} is raised. If an internal error occurs in the function ! 384: {\tt translate}, the exception {\tt InternalError} is raised. ! 385: The value returned by {\tt translate} is the value returned by ! 386: the rule matched at the root node of the user tree. It is tagged ! 387: with a data constructor, which is the replacement label of the ! 388: rule at the root node of the least cost cover prefixed with ''\_''. ! 389: The structure described above is not self-contained, and a functor ! 390: located in a separate file must be loaded before it. ! 391: ! 392: %\section{A Short Implementation Description} ! 393: \section{An Example \mltwig Specification} ! 394: ! 395: Our example is a specification of a simple ''code generator'', which ! 396: compiles arithmetic expressions to instructions for a simple stack machine. ! 397: The instruction set of the machine is redundant: in addition to the simple ! 398: arithmetic instructions, it has a combined addition--multiplication ! 399: instruction, which is cheaper than the separate instructions ! 400: performing the same calculation. ! 401: \newpage ! 402: \begin{verbatim} ! 403: (* Example of a complete ML-Twig specification: *) ! 404: (* Evaluation of expressions *) ! 405: ! 406: node Plus(2), Minus(2), Mul(2); ! 407: node Const(0); ! 408: ! 409: prologue ( ! 410: ! 411: (* The type and function definitions *) ! 412: ! 413: datatype tree = Tree of (tree * symbol * tree) | Leaf of int ! 414: type cost = int ! 415: ! 416: fun get_subtrees(Leaf _) = [] ! 417: | get_subtrees(Tree (t1,_,t2)) = [t1,t2] ! 418: fun node_value(Tree(_,ope,_)) = ope ! 419: | node_value (Leaf _) = Const ! 420: val cost_less : int * int -> bool = (op <) ! 421: ! 422: fun constValue (Leaf i) = i ! 423: datatype instr = PUSH of int | PLUS | MINUS | MUL | PLUSMUL ! 424: ); ! 425: ! 426: label Expr of instr list; ! 427: ! 428: default_cost( fn subexprcost => fold (op +) subexprcost 0); ! 429: ! 430: (* Rules *) ! 431: ! 432: Expr Const =([PUSH (constValue $$)]); ! 433: ! 434: Expr Plus(Expr,Expr) : (2+Expr1+Expr2) =(Expr1@Expr2@[PLUS]); ! 435: ! 436: Expr Minus(Expr,Expr) : (2+Expr1+Expr2) =(Expr1@Expr2@[MINUS]); ! 437: ! 438: Expr Mul(Expr,Expr) : (2+DC) =(Expr1@Expr2@[MUL]); ! 439: ! 440: Expr Mul(Expr,Plus(Expr,Expr)) : (3+DC) =(Expr1@Expr2@Expr3@[PLUSMUL]); ! 441: ! 442: REWRITE Expr Mul(Plus(Expr,Expr),Expr) : (0) =(Tree($2$,Mul,$1$)); ! 443: ! 444: (* End *) ! 445: \end{verbatim} ! 446: \newpage ! 447: \begin{thebibliography}{99} ! 448: ! 449: \bibitem{ahocorasick} {\sc A. V. Aho and M. J. Corasick}, ! 450: {\em Efficient String Pattern Matching: An Aid to Bibliographic Search}. ! 451: Communications of The ACM, 18(June 1975), pp. 333-340. ! 452: ! 453: \bibitem{ahoganapathitjiang} {\sc A. V. Aho, M. Ganapathi and S. W. K. ! 454: Tjiang}, {\em Code Generation Using Tree Matching and Dynamic Programming}, ! 455: AT\&T Bell Laboratories, 600 Mountain Avenue, NJ 07974, 1986. ! 456: ! 457: \bibitem{ahojohnson} {\sc A. V. Aho and S. C. Johnson}, {\em Optimal Code ! 458: Generation for Expression Trees}, Journal of The ACM, 23(July 1975), ! 459: pp. 488-501. ! 460: ! 461: \bibitem{appel} {\sc A. W. Appel}, {\em Concise Specifications of Locally ! 462: Optimal Code Generators}, Technical Report CS-TR-080-87, Princeton ! 463: University, Department of Computer Science, February 1987. ! 464: ! 465: \bibitem{harpermacqueenmilner} {\sc R. Harper, D. MacQueen and R. Milner}, ! 466: {\em Standard ML}, Laboratory of Foundations of Computer Science, ! 467: University of Edinburgh, Technical Report ECS-LFCS-82-2 (1986). ! 468: ! 469: \bibitem{hoffmannodonnell} {\sc C. M. Hoffmann and M. J. O'Donnell}, ! 470: {\em Pattern Matching in Trees}, Journal of The ACM, 29(January 1982), ! 471: pp. 68-95. ! 472: ! 473: \bibitem{tjiang} {\sc Tjiang, S. W. K.}, {\em Twig Reference Manual}, ! 474: Technical Report, AT\&T Bell Laboratories, 1986. ! 475: ! 476: \end{thebibliography} ! 477: ! 478: \end{document}
This archive runs on limited infrastructure. Preserving old code on modern bandwidth. Automated agents are requested to crawl responsibly.