Annotation of 43BSDReno/share/doc/ps1/15.yacc/ss5, revision 1.1

1.1     ! root        1: .\"    @(#)ss5 6.1 (Berkeley) 5/8/86
        !             2: .\"
        !             3: .SH
        !             4: 5: Ambiguity and Conflicts
        !             5: .PP
        !             6: A set of grammar rules is
        !             7: .I ambiguous
        !             8: if there is some input string that can be structured in two or more different ways.
        !             9: For example, the grammar rule
        !            10: .DS
        !            11: expr   :       expr  \'\-\'  expr
        !            12: .DE
        !            13: is a natural way of expressing the fact that one way of forming an arithmetic expression
        !            14: is to put two other expressions together with a minus sign between them.
        !            15: Unfortunately, this grammar rule does not
        !            16: completely specify the way that all complex inputs
        !            17: should be structured.
        !            18: For example, if the input is
        !            19: .DS
        !            20: expr  \-  expr  \-  expr
        !            21: .DE
        !            22: the rule allows this input to be structured as either
        !            23: .DS
        !            24: (  expr  \-  expr  )  \-  expr
        !            25: .DE
        !            26: or as
        !            27: .DS
        !            28: expr  \-  (  expr  \-  expr  )
        !            29: .DE
        !            30: (The first is called
        !            31: .I "left association" ,
        !            32: the second
        !            33: .I "right association" ).
        !            34: .PP
        !            35: Yacc detects such ambiguities when it is attempting to build the parser.
        !            36: It is instructive to consider the problem that confronts the parser when it is
        !            37: given an input such as
        !            38: .DS
        !            39: expr  \-  expr  \-  expr
        !            40: .DE
        !            41: When the parser has read the second expr, the input that it has seen:
        !            42: .DS
        !            43: expr  \-  expr
        !            44: .DE
        !            45: matches the right side of the grammar rule above.
        !            46: The parser could
        !            47: .I reduce
        !            48: the input by applying this rule;
        !            49: after applying the rule;
        !            50: the input is reduced to
        !            51: .I expr (the
        !            52: left side of the rule).
        !            53: The parser would then read the final part of the input:
        !            54: .DS
        !            55: \-  expr
        !            56: .DE
        !            57: and again reduce.
        !            58: The effect of this is to take the left associative interpretation.
        !            59: .PP
        !            60: Alternatively, when the parser has seen
        !            61: .DS
        !            62: expr  \-  expr
        !            63: .DE
        !            64: it could defer the immediate application of the rule, and continue reading
        !            65: the input until it had seen
        !            66: .DS
        !            67: expr  \-  expr  \-  expr
        !            68: .DE
        !            69: It could then apply the rule to the rightmost three symbols, reducing them to
        !            70: .I expr
        !            71: and leaving
        !            72: .DS
        !            73: expr  \-  expr
        !            74: .DE
        !            75: Now the rule can be reduced once more; the effect is to
        !            76: take the right associative interpretation.
        !            77: Thus, having read
        !            78: .DS
        !            79: expr  \-  expr
        !            80: .DE
        !            81: the parser can do two legal things, a shift or a reduction, and has no way of
        !            82: deciding between them.
        !            83: This is called a
        !            84: .I "shift / reduce conflict" .
        !            85: It may also happen that the parser has a choice of two legal reductions;
        !            86: this is called a
        !            87: .I "reduce / reduce conflict" .
        !            88: Note that there are never any ``Shift/shift'' conflicts.
        !            89: .PP
        !            90: When there are shift/reduce or reduce/reduce conflicts, Yacc still produces a parser.
        !            91: It does this by selecting one of the valid steps wherever it has a choice.
        !            92: A rule describing which choice to make in a given situation is called
        !            93: a
        !            94: .I "disambiguating rule" .
        !            95: .PP
        !            96: Yacc invokes two disambiguating rules by default:
        !            97: .IP 1.
        !            98: In a shift/reduce conflict, the default is to do the shift.
        !            99: .IP 2.
        !           100: In a reduce/reduce conflict, the default is to reduce by the
        !           101: .I earlier
        !           102: grammar rule (in the input sequence).
        !           103: .PP
        !           104: Rule 1 implies that reductions are deferred whenever there is a choice,
        !           105: in favor of shifts.
        !           106: Rule 2 gives the user rather crude control over the behavior of the parser
        !           107: in this situation, but reduce/reduce conflicts should be avoided whenever possible.
        !           108: .PP
        !           109: Conflicts may arise because of mistakes in input or logic, or because the grammar rules, while consistent,
        !           110: require a more complex parser than Yacc can construct.
        !           111: The use of actions within rules can also cause conflicts, if the action must
        !           112: be done before the parser can be sure which rule is being recognized.
        !           113: In these cases, the application of disambiguating rules is inappropriate,
        !           114: and leads to an incorrect parser.
        !           115: For this reason, Yacc
        !           116: always reports the number of shift/reduce and reduce/reduce conflicts resolved by Rule 1 and Rule 2.
        !           117: .PP
        !           118: In general, whenever it is possible to apply disambiguating rules to produce a correct parser, it is also
        !           119: possible to rewrite the grammar rules so that the same inputs are read but there are no
        !           120: conflicts.
        !           121: For this reason, most previous parser generators
        !           122: have considered conflicts to be fatal errors.
        !           123: Our experience has suggested that this rewriting is somewhat unnatural,
        !           124: and produces slower parsers; thus, Yacc will produce parsers even in the presence of conflicts.
        !           125: .PP
        !           126: As an example of the power of disambiguating rules, consider a fragment from a programming
        !           127: language involving an ``if-then-else'' construction:
        !           128: .DS
        !           129: stat   :       IF  \'(\'  cond  \')\'  stat
        !           130:        |       IF  \'(\'  cond  \')\'  stat  ELSE  stat
        !           131:        ;
        !           132: .DE
        !           133: In these rules,
        !           134: .I IF
        !           135: and
        !           136: .I ELSE
        !           137: are tokens,
        !           138: .I cond
        !           139: is a nonterminal symbol describing
        !           140: conditional (logical) expressions, and
        !           141: .I stat
        !           142: is a nonterminal symbol describing statements.
        !           143: The first rule will be called the
        !           144: .ul
        !           145: simple-if
        !           146: rule, and the
        !           147: second the
        !           148: .ul
        !           149: if-else
        !           150: rule.
        !           151: .PP
        !           152: These two rules form an ambiguous construction, since input of the form
        !           153: .DS
        !           154: IF  (  C1  )  IF  (  C2  )  S1  ELSE  S2
        !           155: .DE
        !           156: can be structured according to these rules in two ways:
        !           157: .DS
        !           158: IF  (  C1  )  {
        !           159:        IF  (  C2  )  S1
        !           160:        }
        !           161: ELSE  S2
        !           162: .DE
        !           163: or
        !           164: .DS
        !           165: IF  (  C1  )  {
        !           166:        IF  (  C2  )  S1
        !           167:        ELSE  S2
        !           168:        }
        !           169: .DE
        !           170: The second interpretation is the one given in most programming languages
        !           171: having this construct.
        !           172: Each
        !           173: .I ELSE
        !           174: is associated with the last preceding
        !           175: ``un-\fIELSE'\fRd''
        !           176: .I IF .
        !           177: In this example, consider the situation where the parser has seen
        !           178: .DS
        !           179: IF  (  C1  )  IF  (  C2  )  S1
        !           180: .DE
        !           181: and is looking at the
        !           182: .I ELSE .
        !           183: It can immediately
        !           184: reduce
        !           185: by the simple-if rule to get
        !           186: .DS
        !           187: IF  (  C1  )  stat
        !           188: .DE
        !           189: and then read the remaining input,
        !           190: .DS
        !           191: ELSE  S2
        !           192: .DE
        !           193: and reduce
        !           194: .DS
        !           195: IF  (  C1  )  stat  ELSE  S2
        !           196: .DE
        !           197: by the if-else rule.
        !           198: This leads to the first of the above groupings of the input.
        !           199: .PP
        !           200: On the other hand, the
        !           201: .I ELSE
        !           202: may be shifted,
        !           203: .I S2
        !           204: read, and then the right hand portion of
        !           205: .DS
        !           206: IF  (  C1  )  IF  (  C2  )  S1  ELSE  S2
        !           207: .DE
        !           208: can be reduced by the if-else rule to get
        !           209: .DS
        !           210: IF  (  C1  )  stat
        !           211: .DE
        !           212: which can be reduced by the simple-if rule.
        !           213: This leads to the second of the above groupings of the input, which
        !           214: is usually desired.
        !           215: .PP
        !           216: Once again the parser can do two valid things \- there is a shift/reduce conflict.
        !           217: The application of disambiguating rule 1 tells the parser to shift in this case,
        !           218: which leads to the desired grouping.
        !           219: .PP
        !           220: This shift/reduce conflict arises only when there is a particular current input symbol,
        !           221: .I ELSE ,
        !           222: and particular inputs already seen, such as
        !           223: .DS
        !           224: IF  (  C1  )  IF  (  C2  )  S1
        !           225: .DE
        !           226: In general, there may be many conflicts, and each one
        !           227: will be associated with an input symbol and
        !           228: a set of previously read inputs.
        !           229: The previously read inputs are characterized by the
        !           230: state
        !           231: of the parser.
        !           232: .PP
        !           233: The conflict messages of Yacc are best understood
        !           234: by examining the verbose (\fB\-v\fR) option output file.
        !           235: For example, the output corresponding to the above
        !           236: conflict state might be:
        !           237: .DS L
        !           238: 23: shift/reduce conflict (shift 45, reduce 18) on ELSE
        !           239: 
        !           240: state 23
        !           241: 
        !           242:          stat  :  IF  (  cond  )  stat\_         (18)
        !           243:          stat  :  IF  (  cond  )  stat\_ELSE  stat
        !           244: 
        !           245:         ELSE     shift 45
        !           246:         .        reduce 18
        !           247: 
        !           248: .DE
        !           249: The first line describes the conflict, giving the state and the input symbol.
        !           250: The ordinary state description follows, giving
        !           251: the grammar rules active in the state, and the parser actions.
        !           252: Recall that the underline marks the
        !           253: portion of the grammar rules which has been seen.
        !           254: Thus in the example, in state 23 the parser has seen input corresponding
        !           255: to
        !           256: .DS
        !           257: IF  (  cond  )  stat
        !           258: .DE
        !           259: and the two grammar rules shown are active at this time.
        !           260: The parser can do two possible things.
        !           261: If the input symbol is
        !           262: .I ELSE ,
        !           263: it is possible to shift into state
        !           264: 45.
        !           265: State 45 will have, as part of its description, the line
        !           266: .DS
        !           267: stat  :  IF  (  cond  )  stat  ELSE\_stat
        !           268: .DE
        !           269: since the
        !           270: .I ELSE
        !           271: will have been shifted in this state.
        !           272: Back in state 23, the alternative action, described by ``\fB.\fR'',
        !           273: is to be done if the input symbol is not mentioned explicitly in the above actions; thus,
        !           274: in this case, if the input symbol is not
        !           275: .I ELSE ,
        !           276: the parser reduces by grammar rule 18:
        !           277: .DS
        !           278: stat  :  IF  \'(\'  cond  \')\'  stat
        !           279: .DE
        !           280: Once again, notice that the numbers following ``shift'' commands refer to other states,
        !           281: while the numbers following ``reduce'' commands refer to grammar
        !           282: rule numbers.
        !           283: In the
        !           284: .I y.output
        !           285: file, the rule numbers are printed after those rules which can be reduced.
        !           286: In most one states, there will be at most reduce action possible in the
        !           287: state, and this will be the default command.
        !           288: The user who encounters unexpected shift/reduce conflicts will probably want to
        !           289: look at the verbose output to decide whether the default actions are appropriate.
        !           290: In really tough cases, the user might need to know more about
        !           291: the behavior and construction of the parser than can be covered here.
        !           292: In this case, one of the theoretical references
        !           293: .[
        !           294: Aho Johnson Surveys Parsing
        !           295: .]
        !           296: .[
        !           297: Aho Johnson Ullman Deterministic Ambiguous
        !           298: .]
        !           299: .[
        !           300: Aho Ullman Principles Design
        !           301: .]
        !           302: might be consulted; the services of a local guru might also be appropriate.

unix.superglobalmegacorp.com

This archive runs on limited infrastructure. Preserving old code on modern bandwidth. Automated agents are requested to crawl responsibly.