|
|
1.1 ! root 1: .\" @(#)ss9 6.1 (Berkeley) 5/8/86 ! 2: .\" ! 3: .SH ! 4: 9: Hints for Preparing Specifications ! 5: .PP ! 6: This section contains miscellaneous hints on preparing efficient, easy to change, ! 7: and clear specifications. ! 8: The individual subsections are more or less ! 9: independent. ! 10: .SH ! 11: Input Style ! 12: .PP ! 13: It is difficult to ! 14: provide rules with substantial actions ! 15: and still have a readable specification file. ! 16: The following style hints owe much to Brian Kernighan. ! 17: .IP a. ! 18: Use all capital letters for token names, all lower case letters for ! 19: nonterminal names. ! 20: This rule comes under the heading of ``knowing who to blame when ! 21: things go wrong.'' ! 22: .IP b. ! 23: Put grammar rules and actions on separate lines. ! 24: This allows either to be changed without ! 25: an automatic need to change the other. ! 26: .IP c. ! 27: Put all rules with the same left hand side together. ! 28: Put the left hand side in only once, and let all ! 29: following rules begin with a vertical bar. ! 30: .IP d. ! 31: Put a semicolon only after the last rule with a given left hand side, ! 32: and put the semicolon on a separate line. ! 33: This allows new rules to be easily added. ! 34: .IP e. ! 35: Indent rule bodies by two tab stops, and action bodies by three ! 36: tab stops. ! 37: .PP ! 38: The example in Appendix A is written following this style, as are ! 39: the examples in the text of this paper (where space permits). ! 40: The user must make up his own mind about these stylistic questions; ! 41: the central problem, however, is to make the rules visible through ! 42: the morass of action code. ! 43: .SH ! 44: Left Recursion ! 45: .PP ! 46: The algorithm used by the Yacc parser encourages so called ``left recursive'' ! 47: grammar rules: rules of the form ! 48: .DS ! 49: name : name rest_of_rule ; ! 50: .DE ! 51: These rules frequently arise when ! 52: writing specifications of sequences and lists: ! 53: .DS ! 54: list : item ! 55: | list \',\' item ! 56: ; ! 57: .DE ! 58: and ! 59: .DS ! 60: seq : item ! 61: | seq item ! 62: ; ! 63: .DE ! 64: In each of these cases, the first rule ! 65: will be reduced for the first item only, and the second rule ! 66: will be reduced for the second and all succeeding items. ! 67: .PP ! 68: With right recursive rules, such as ! 69: .DS ! 70: seq : item ! 71: | item seq ! 72: ; ! 73: .DE ! 74: the parser would be a bit bigger, and the items would be seen, and reduced, ! 75: from right to left. ! 76: More seriously, an internal stack in the parser ! 77: would be in danger of overflowing if a very long sequence were read. ! 78: Thus, the user should use left recursion wherever reasonable. ! 79: .PP ! 80: It is worth considering whether a sequence with zero ! 81: elements has any meaning, and if so, consider writing ! 82: the sequence specification with an empty rule: ! 83: .DS ! 84: seq : /* empty */ ! 85: | seq item ! 86: ; ! 87: .DE ! 88: Once again, the first rule would always be reduced exactly once, before the ! 89: first item was read, ! 90: and then the second rule would be reduced once for each item read. ! 91: Permitting empty sequences ! 92: often leads to increased generality. ! 93: However, conflicts might arise if Yacc is asked to decide ! 94: which empty sequence it has seen, when it hasn't seen enough to ! 95: know! ! 96: .SH ! 97: Lexical Tie-ins ! 98: .PP ! 99: Some lexical decisions depend on context. ! 100: For example, the lexical analyzer might want to ! 101: delete blanks normally, but not within quoted strings. ! 102: Or names might be entered into a symbol table in declarations, ! 103: but not in expressions. ! 104: .PP ! 105: One way of handling this situation is ! 106: to create a global flag that is ! 107: examined by the lexical analyzer, and set by actions. ! 108: For example, suppose a program ! 109: consists of 0 or more declarations, followed by 0 or more statements. ! 110: Consider: ! 111: .DS ! 112: %{ ! 113: int dflag; ! 114: %} ! 115: ... other declarations ... ! 116: ! 117: %% ! 118: ! 119: prog : decls stats ! 120: ; ! 121: ! 122: decls : /* empty */ ! 123: { dflag = 1; } ! 124: | decls declaration ! 125: ; ! 126: ! 127: stats : /* empty */ ! 128: { dflag = 0; } ! 129: | stats statement ! 130: ; ! 131: ! 132: ... other rules ... ! 133: .DE ! 134: The flag ! 135: .I dflag ! 136: is now 0 when reading statements, and 1 when reading declarations, ! 137: .ul ! 138: except for the first token in the first statement. ! 139: This token must be seen by the parser before it can tell that ! 140: the declaration section has ended and the statements have ! 141: begun. ! 142: In many cases, this single token exception does not ! 143: affect the lexical scan. ! 144: .PP ! 145: This kind of ``backdoor'' approach can be elaborated ! 146: to a noxious degree. ! 147: Nevertheless, it represents a way of doing some things ! 148: that are difficult, if not impossible, to ! 149: do otherwise. ! 150: .SH ! 151: Reserved Words ! 152: .PP ! 153: Some programming languages ! 154: permit the user to ! 155: use words like ``if'', which are normally reserved, ! 156: as label or variable names, provided that such use does not ! 157: conflict with the legal use of these names in the programming language. ! 158: This is extremely hard to do in the framework of Yacc; ! 159: it is difficult to pass information to the lexical analyzer ! 160: telling it ``this instance of `if' is a keyword, and that instance is a variable''. ! 161: The user can make a stab at it, using the ! 162: mechanism described in the last subsection, ! 163: but it is difficult. ! 164: .PP ! 165: A number of ways of making this easier are under advisement. ! 166: Until then, it is better that the keywords be ! 167: .I reserved \|; ! 168: that is, be forbidden for use as variable names. ! 169: There are powerful stylistic reasons for preferring this, anyway.
This archive runs on limited infrastructure. Preserving old code on modern bandwidth. Automated agents are requested to crawl responsibly.