Annotation of 43BSDReno/share/doc/ps1/15.yacc/ss9, revision 1.1.1.1

1.1       root        1: .\"    @(#)ss9 6.1 (Berkeley) 5/8/86
                      2: .\"
                      3: .SH
                      4: 9: Hints for Preparing Specifications
                      5: .PP
                      6: This section contains miscellaneous hints on preparing efficient, easy-to-change,
                      7: and clear specifications.
                      8: The individual subsections are more or less
                      9: independent.
                     10: .SH
                     11: Input Style
                     12: .PP
                     13: It is difficult to
                     14: provide rules with substantial actions
                     15: and still have a readable specification file.
                     16: The following style hints owe much to Brian Kernighan.
                     17: .IP a.
                     18: Use all capital letters for token names, all lower case letters for
                     19: nonterminal names.
                     20: This rule comes under the heading of ``knowing who to blame when
                     21: things go wrong.''
                     22: .IP b.
                     23: Put grammar rules and actions on separate lines.
                     24: This allows either to be changed without
                     25: an automatic need to change the other.
                     26: .IP c.
                     27: Put all rules with the same left-hand side together.
                     28: Put the left-hand side in only once, and let all
                     29: following rules begin with a vertical bar.
                     30: .IP d.
                     31: Put a semicolon only after the last rule with a given left-hand side,
                     32: and put the semicolon on a separate line.
                     33: This allows new rules to be easily added.
                     34: .IP e.
                     35: Indent rule bodies by two tab stops, and action bodies by three
                     36: tab stops.
                     37: .PP
                     38: The example in Appendix A is written following this style, as are
                     39: the examples in the text of this paper (where space permits).
                     40: Users must make up their own minds about these stylistic questions;
                     41: the central problem, however, is to make the rules visible through
                     42: the morass of action code.
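                         .PP
                         As a minimal sketch, hints a through e applied to a small rule set
                         might look like the following fragment; the token NUMBER and the
                         nonterminal expr are hypothetical names chosen only for illustration.
                         .DS
                         %token NUMBER

                         %%

                         expr   :       NUMBER
                                                {       $$ = $1;  }
                                |       expr  \'+\'  NUMBER
                                                {       $$ = $1 + $3;  }
                                ;
                         .DE
                         Adding another alternative then means inserting one line beginning
                         with a vertical bar ahead of the semicolon, leaving the existing
                         rules untouched.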
                     43: .SH
                     44: Left Recursion
                     45: .PP
                     46: The algorithm used by the Yacc parser encourages so-called ``left recursive''
                     47: grammar rules: rules of the form
                     48: .DS
                     49: name   :       name  rest_of_rule  ;
                     50: .DE
                     51: These rules frequently arise when
                     52: writing specifications of sequences and lists:
                     53: .DS
                     54: list   :       item
                     55:        |       list  \',\'  item
                     56:        ;
                     57: .DE
                     58: and
                     59: .DS
                     60: seq    :       item
                     61:        |       seq  item
                     62:        ;
                     63: .DE
                     64: In each of these cases, the first rule
                     65: will be reduced for the first item only, and the second rule
                     66: will be reduced for the second and all succeeding items.
                     67: .PP
                     68: With right recursive rules, such as
                     69: .DS
                     70: seq    :       item
                     71:        |       item  seq
                     72:        ;
                     73: .DE
                     74: the parser would be a bit bigger, and the items would be seen, and reduced,
                     75: from right to left.
                     76: More seriously, an internal stack in the parser
                     77: would be in danger of overflowing if a very long sequence were read.
                     78: Thus, the user should use left recursion wherever reasonable.
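                         .PP
                         The difference can be seen by tracing the parse stack on an input of
                         three items, under the simplifying assumption that item is a token.
                         With the left recursive rules, each item is reduced as soon as it is
                         read, and the stack never holds more than two symbols:
                         .DS
                         action                         stack after action
                         shift                          item
                         reduce  seq : item             seq
                         shift                          seq item
                         reduce  seq : seq item         seq
                         shift                          seq item
                         reduce  seq : seq item         seq
                         .DE
                         With the right recursive rules, nothing can be reduced until the
                         last item has been read, so the stack grows with the length of the
                         input:
                         .DS
                         action                         stack after action
                         shift                          item
                         shift                          item item
                         shift                          item item item
                         reduce  seq : item             item item seq
                         reduce  seq : item seq         item seq
                         reduce  seq : item seq         seq
                         .DE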
                     79: .PP
                     80: It is worth considering whether a sequence with zero
                     81: elements has any meaning and, if so, writing
                     82: the sequence specification with an empty rule:
                     83: .DS
                     84: seq    :       /* empty */
                     85:        |       seq  item
                     86:        ;
                     87: .DE
                     88: Once again, the first rule would always be reduced exactly once, before the
                     89: first item was read,
                     90: and then the second rule would be reduced once for each item read.
                     91: Permitting empty sequences
                     92: often leads to increased generality.
                     93: However, conflicts might arise if Yacc is asked to decide
                     94: which empty sequence it has seen, when it hasn't seen enough to
                     95: know!
                     96: .SH
                     97: Lexical Tie-ins
                     98: .PP
                     99: Some lexical decisions depend on context.
                    100: For example, the lexical analyzer might want to
                    101: delete blanks normally, but not within quoted strings.
                    102: Or names might be entered into a symbol table in declarations,
                    103: but not in expressions.
                    104: .PP
                    105: One way of handling this situation is
                    106: to create a global flag that is
                    107: examined by the lexical analyzer, and set by actions.
                    108: For example, suppose a program
                    109: consists of 0 or more declarations, followed by 0 or more statements.
                    110: Consider:
                    111: .DS
                    112: %{
                    113:        int dflag;
                    114: %}
                    115:   ...  other declarations ...
                    116: 
                    117: %%
                    118: 
                    119: prog   :       decls  stats
                    120:        ;
                    121: 
                    122: decls  :       /* empty */
                    123:                        {       dflag = 1;  }
                    124:        |       decls  declaration
                    125:        ;
                    126: 
                    127: stats  :       /* empty */
                    128:                        {       dflag = 0;  }
                    129:        |       stats  statement
                    130:        ;
                    131: 
                    132:     ...  other rules ...
                    133: .DE
                    134: The flag
                    135: .I dflag
                    136: is now 0 when reading statements, and 1 when reading declarations,
                    137: .ul
                    138: except for the first token in the first statement.
                    139: This token must be seen by the parser before it can tell that
                    140: the declaration section has ended and the statements have
                    141: begun.
                    142: In many cases, this single-token exception does not
                    143: affect the lexical scan.
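                         .PP
                         A lexical analyzer that consults the flag might be sketched as
                         follows, placed in the programs section of the same specification so
                         that dflag, yylval, and the token names are visible.
                         The token NAME and the symbol table routines install and lookup are
                         assumptions made for illustration; they are not part of Yacc.
                         .DS
                         #include <stdio.h>
                         #include <ctype.h>

                         /* hypothetical symbol table routines; each returns a table index */
                         extern int install(char *name);
                         extern int lookup(char *name);

                         int
                         yylex(void)
                         {
                                 static char buf[64];
                                 int c, n = 0;

                                 while (isspace(c = getchar()))
                                         ;                       /* skip white space */
                                 if (c == EOF)
                                         return 0;               /* endmarker */
                                 if (!isalpha(c))
                                         return c;               /* single character token */
                                 do {                            /* collect a name */
                                         if (n < (int)sizeof buf - 1)
                                                 buf[n++] = c;
                                         c = getchar();
                                 } while (c != EOF && isalnum(c));
                                 if (c != EOF)
                                         ungetc(c, stdin);
                                 buf[n] = 0;
                                 /* the tie-in: enter names only while reading declarations */
                                 yylval = dflag ? install(buf) : lookup(buf);
                                 return NAME;
                         }
                         .DE
                         Because of the single token exception, a name that begins the first
                         statement is still scanned while dflag is 1, so in this sketch it
                         would be installed rather than looked up.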
                    144: .PP
                    145: This kind of ``backdoor'' approach can be elaborated
                    146: to a noxious degree.
                    147: Nevertheless, it represents a way of doing some things
                    148: that are difficult, if not impossible, to
                    149: do otherwise.
                    150: .SH
                    151: Reserved Words
                    152: .PP
                    153: Some programming languages
                    154: permit the user to
                    155: use words like ``if'', which are normally reserved,
                    156: as label or variable names, provided that such use does not
                    157: conflict with the legal use of these names in the programming language.
                    158: This is extremely hard to do in the framework of Yacc;
                    159: it is difficult to pass information to the lexical analyzer
                    160: telling it ``this instance of `if' is a keyword, and that instance is a variable''.
                    161: The user can make a stab at it, using the
                    162: mechanism described in the previous subsection,
                    163: but it is difficult.
                    164: .PP
                    165: A number of ways of making this easier are under advisement.
                    166: Until then, it is better that the keywords be
                    167: .I reserved \|;
                    168: that is, be forbidden for use as variable names.
                    169: There are powerful stylistic reasons for preferring this, anyway.
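                         .PP
                         Reserving the keywords is also simple to arrange in the lexical
                         analyzer.
                         A minimal sketch, assuming hypothetical tokens IF, ELSE, WHILE, and
                         NAME declared with %token, is a table that is searched after each
                         word has been collected; the routine name kwlookup is likewise an
                         assumption.
                         .DS
                         #include <string.h>

                         static struct {
                                 char    *word;
                                 int     token;
                         } reserved[] = {
                                 { "if",         IF },
                                 { "else",       ELSE },
                                 { "while",      WHILE },
                         };

                         /* return the keyword token for s, or NAME if s is not reserved */
                         static int
                         kwlookup(char *s)
                         {
                                 int i;

                                 for (i = 0; i < (int)(sizeof reserved / sizeof reserved[0]); i++)
                                         if (strcmp(s, reserved[i].word) == 0)
                                                 return reserved[i].token;
                                 return NAME;
                         }
                         .DE
                         The lexical analyzer then returns the value of kwlookup for every
                         word, so a reserved word can never reach the parser as a variable
                         name.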
