.\" @(#)ss1 6.1 (Berkeley) 5/8/86
.\"
.tr *\(**
.tr |\(or
.SH
1: Basic Specifications
.PP
Names refer to either tokens or nonterminal symbols.
Yacc requires
token names to be declared as such.
In addition, for reasons discussed in Section 3, it is often desirable
to include the lexical analyzer as part of the specification file;
it may be useful to include other programs as well.
Thus, every specification file consists of three sections:
the
.I declarations ,
.I "(grammar) rules" ,
and
.I programs .
The sections are separated by double percent ``%%'' marks.
(The percent ``%'' is generally used in Yacc specifications as an escape character.)
.PP
In other words, a full specification file looks like
.DS
declarations
%%
rules
%%
programs
.DE
.PP
The declaration section may be empty.
Moreover, if the programs section is omitted, the second %% mark may be omitted also;
thus, the smallest legal Yacc specification is
.DS
%%
rules
.DE
.PP
Blanks, tabs, and newlines are ignored except
that they may not appear in names or multi-character reserved symbols.
Comments may appear wherever a name is legal; they are enclosed
in /* . . . */, as in C and PL/I.
.PP
The rules section is made up of one or more grammar rules.
A grammar rule has the form:
.DS
A : BODY ;
.DE
A represents a nonterminal name, and BODY represents a sequence of zero or more names and literals.
The colon and the semicolon are Yacc punctuation.
.PP
Names may be of arbitrary length, and may be made up of letters, dot ``.'', underscore ``\_'', and
non-initial digits.
Upper and lower case letters are distinct.
The names used in the body of a grammar rule may represent tokens or nonterminal symbols.
.PP
A literal consists of a character enclosed in single quotes ``\'''.
As in C, the backslash ``\e'' is an escape character within literals, and all the C escapes
are recognized.
Thus
.DS
\'\en\'	newline
\'\er\'	return
\'\e\'\'	single quote ``\'''
\'\e\e\'	backslash ``\e''
\'\et\'	tab
\'\eb\'	backspace
\'\ef\'	form feed
\'\exxx\'	``xxx'' in octal
.DE
For a number of technical reasons, the
\s-2NUL\s0
character (\'\e0\' or 0) should never
be used in grammar rules.
.PP
If there are several grammar rules with the same left hand side, the vertical bar ``|''
can be used to avoid rewriting the left hand side.
In addition,
the semicolon at the end of a rule can be dropped before a vertical bar.
Thus the grammar rules
.DS
A : B C D ;
A : E F ;
A : G ;
.DE
can be given to Yacc as
.DS
A : B C D
  | E F
  | G
  ;
.DE
It is not necessary that all grammar rules with the same left side appear together in the grammar rules section,
although it makes the input much more readable, and easier to change.
.PP
If a nonterminal symbol matches the empty string, this can be indicated in the obvious way:
.DS
empty : ;
.DE
.PP
Names representing tokens must be declared; this is most simply done by writing
.DS
%token name1 name2 . . .
.DE
in the declarations section.
(See Sections 3, 5, and 6 for much more discussion).
Every name not defined in the declarations section is assumed to represent a nonterminal symbol.
Every nonterminal symbol must appear on the left side of at least one rule.
.PP
Of all the nonterminal symbols, one, called the
.I "start symbol" ,
has particular importance.
The parser is designed to recognize the start symbol; thus,
this symbol represents the largest,
most general structure described by the grammar rules.
By default,
the start symbol is taken to be the left hand side of the first
grammar rule in the rules section.
It is possible, and in fact desirable, to declare the start
symbol explicitly in the declarations section using the %start keyword:
.DS
%start symbol
.DE
.PP
The end of the input to the parser is signaled by a special token, called the
.I endmarker .
If the tokens up to, but not including, the endmarker form a structure
which matches the start symbol, the parser function returns to its caller
after the endmarker is seen; it
.I accepts
the input.
If the endmarker is seen in any other context, it is an error.
.PP
It is the job of the user-supplied lexical analyzer
to return the endmarker when appropriate; see section 3, below.
Usually the endmarker represents some reasonably obvious
I/O status, such as ``end-of-file'' or ``end-of-record''.
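.PP
As an illustrative sketch only, the conventions of this section can be combined into one small specification.
The token name NUMBER and the nonterminal names
.I list
and
.I expr
are hypothetical, chosen for this example.
The lexical analyzer (Section 3) is assumed to be supplied separately,
so the programs section and the second %% mark are omitted.
.DS
/* declarations section */
%token NUMBER
%start list
%%
/* rules section */
list : /* empty */
     | list expr \'\en\'
     ;
expr : NUMBER
     | expr \'+\' NUMBER
     ;
.DE
Here NUMBER is declared as a token, so
.I list
and
.I expr
are taken to be nonterminal symbols; each appears on the left side of at least one rule.
The declared start symbol,
.I list ,
matches either the empty string or any sequence of sums, each terminated by a newline.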