Annotation of 43BSDReno/share/doc/ps1/15.yacc/ss7, revision 1.1.1.1

1.1       root        1: .\"    @(#)ss7 6.1 (Berkeley) 5/8/86
                      2: .\"
                      3: .SH
                      4: 7: Error Handling
                      5: .PP
                      6: Error handling is an extremely difficult area, and many of the problems are semantic ones.
                      7: When an error is found, for example, it may be necessary to reclaim parse tree storage,
                      8: delete or alter symbol table entries, and, typically, set switches to avoid generating any further output.
                      9: .PP
                     10: It is seldom acceptable to stop all processing when an error is found; it is more useful to continue
                     11: scanning the input to find further syntax errors.
                     12: This leads to the problem of getting the parser ``restarted'' after an error.
                     13: A general class of algorithms to do this involves discarding a number of tokens
                     14: from the input string, and attempting to adjust the parser so that input can continue.
                     15: .PP
                     16: To allow the user some control over this process,
                     17: Yacc provides a simple, but reasonably general, feature.
                     18: The token name ``error'' is reserved for error handling.
                     19: This name can be used in grammar rules;
                     20: in effect, it suggests places where errors are expected, and recovery might take place.
                     21: The parser pops its stack until it enters a state where the token ``error'' is legal.
                     22: It then behaves as if the token ``error'' were the current lookahead token,
                     23: and performs the action encountered.
                     24: The lookahead token is then reset to the token that caused the error.
                     25: If no special error rules have been specified, the processing halts when an error is detected.
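The stack-popping step can be pictured with a small C sketch. It is a model only, not the code Yacc generates: the function name pop_to_error_state, the integer state numbers, and the accepts_error table are assumptions introduced for illustration.

```c
/* Hypothetical model of the recovery step described above.
 * stack[0..depth-1] holds parser state numbers, with the top of the
 * stack at stack[depth-1].  accepts_error[s] is nonzero when the
 * token "error" is legal in state s (an assumed lookup table). */
int pop_to_error_state(const int *stack, int depth,
                       const unsigned char *accepts_error)
{
    /* Discard states from the top until one accepts "error". */
    while (depth > 0 && !accepts_error[stack[depth - 1]])
        depth--;

    /* A result of 0 means no state on the stack accepts "error",
     * which corresponds to the case where processing halts. */
    return depth;
}
```

With a stack of states 0, 3, 5, 7 in which only state 3 accepts ``error'', the sketch keeps states 0 and 3 and returns a depth of 2; a grammar with no error rules would yield 0.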
                     26: .PP
                     27: In order to prevent a cascade of error messages, the parser, after
                     28: detecting an error, remains in error state until three tokens have been successfully
                     29: read and shifted.
                     30: If an error is detected when the parser is already in error state,
                     31: no message is given, and the input token is quietly deleted.
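The three-token rule amounts to a small counter. The following C sketch models it under stated assumptions: the struct and function names are hypothetical, and restarting the count when a further error occurs in error state is an assumption rather than something the text above specifies.

```c
/* Hypothetical bookkeeping for the error state; these names are
 * illustrative and are not part of Yacc's interface. */
struct recovery {
    int shifts_needed;  /* greater than 0 while in error state */
    int messages;       /* error messages issued so far */
    int deleted;        /* tokens quietly deleted in error state */
};

/* Called whenever a syntax error is detected. */
void on_error(struct recovery *r)
{
    if (r->shifts_needed > 0)
        r->deleted++;       /* already in error state: stay silent */
    else
        r->messages++;      /* fresh error: report it */

    /* Assumption: the count of required shifts starts over. */
    r->shifts_needed = 3;
}

/* Called whenever a token is successfully read and shifted. */
void on_shift(struct recovery *r)
{
    if (r->shifts_needed > 0)
        r->shifts_needed--; /* the third clean shift ends error state */
}
```

In this model an error followed by only two shifts and then another error yields one message, not two; after three clean shifts the next error is reported again.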
                     32: .PP
                     33: As an example, a rule of the form
                     34: .DS
                     35: stat   :       error
                     36: .DE
                     37: would, in effect, mean that on a syntax error the parser would attempt to skip over the statement
                     38: in which the error was seen.
                     39: More precisely, the parser will
                     40: scan ahead, looking for three tokens that might legally follow
                     41: a statement, and start processing at the first of these; if
                     42: the beginnings of statements are not sufficiently distinctive, it may make a
                     43: false start in the middle of a statement, and end up reporting a
                     44: second error where there is in fact no error.
                     45: .PP
                     46: Actions may be used with these special error rules.
                     47: These actions might attempt to reinitialize tables, reclaim symbol table space, etc.
                     48: .PP
                     49: Error rules such as the above are very general, but difficult to control.
                     50: Somewhat easier are rules such as
                     51: .DS
                     52: stat   :       error  \';\'
                     53: .DE
                     54: Here, when there is an error, the parser attempts to skip over the statement, but
                     55: will do so by skipping to the next \';\'.
                     56: All tokens after the error and before the next \';\' cannot be shifted, and are discarded.
                     57: When the \';\' is seen, this rule will be reduced, and any ``cleanup''
                     58: action associated with it performed.
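The effect of this rule on the input can be sketched in C. This is an illustration of the skipping behavior only, not of the parser itself; the function skip_to_semicolon and the representation of tokens as an array of integers are assumptions.

```c
#include <stddef.h>

/* Hypothetical model of "stat : error ';'".
 * tokens[0..n-1] is the input; pos is the index of the token at which
 * the error was detected.  Tokens are discarded up to the next ';'.
 * The return value is the index just past that ';' (where parsing
 * would resume), or n if the input ends before a ';' is found.
 * *discarded, if non-null, receives the number of tokens thrown away. */
size_t skip_to_semicolon(const int *tokens, size_t n, size_t pos,
                         size_t *discarded)
{
    size_t dropped = 0;

    while (pos < n && tokens[pos] != ';') {
        pos++;          /* this token cannot be shifted */
        dropped++;
    }
    if (discarded != NULL)
        *discarded = dropped;

    /* The ';' itself is consumed, which lets the rule be reduced. */
    return pos < n ? pos + 1 : n;
}
```

For the token sequence a = + b ; c ; with the error detected at the +, the sketch discards + and b and resumes at c.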
                     59: .PP
                     60: Another form of error rule arises in interactive applications, where
                     61: it may be desirable to permit a line to be reentered after an error.
                     62: A possible error rule might be
                     63: .DS
                     64: input  :       error  \'\en\'  {  printf( "Reenter last line: " );  }  input
                     65:                        {       $$  =  $4;  }
                     66: .DE
                     67: There is one potential difficulty with this approach;
                     68: the parser must correctly process three input tokens before it
                     69: admits that it has correctly resynchronized after the error.
                     70: If the reentered line contains an error
                     71: in the first two tokens, the parser deletes the offending tokens,
                     72: and gives no message; this is clearly unacceptable.
                     73: For this reason, there is a mechanism that
                     74: can be used to force the parser
                     75: to believe that an error has been fully recovered from.
                     76: The statement
                     77: .DS
                     78: yyerrok ;
                     79: .DE
                     80: in an action
                     81: resets the parser to its normal mode.
                     82: The last example is better written
                     83: .DS
                     84: input  :       error  \'\en\'
                     85:                        {       yyerrok;
                     86:                                printf( "Reenter last line: " );   }
                     87:                input
                     88:                        {       $$  =  $4;  }
                     89:        ;
                     90: .DE
                     91: .PP
                     92: As mentioned above, the token seen immediately
                     93: after the ``error'' symbol is the input token at which the
                     94: error was discovered.
                     95: Sometimes, this is inappropriate; for example, an
                     96: error recovery action might
                     97: take upon itself the job of finding the correct place to resume input.
                     98: In this case,
                     99: the previous lookahead token must be cleared.
                    100: The statement
                    101: .DS
                    102: yyclearin ;
                    103: .DE
                    104: in an action will have this effect.
                    105: For example, suppose the action after error
                    106: were to call some sophisticated resynchronization routine,
                    107: supplied by the user, that attempted to advance the input to the
                    108: beginning of the next valid statement.
                    109: After this routine was called, the next token returned by yylex would presumably
                    110: be the first token in a legal statement;
                    111: the old, illegal token must be discarded, and the error state reset.
                    112: This could be done by a rule like
                    113: .DS
                    114: stat   :       error 
                    115:                        {       resynch();
                    116:                                yyerrok ;
                    117:                                yyclearin ;   }
                    118:        ;
                    119: .DE
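What resynch does is left entirely to the user. As one possibility, the C sketch below assumes that every statement begins on a fresh line and that the lexical analyzer reads from an in-memory buffer; the buffer, set_input, and remaining_input are hypothetical helpers, not part of Yacc.

```c
/* Hypothetical input buffer shared with the lexical analyzer.
 * A real yylex would have its own input source. */
static const char *input_cursor = "";

/* Point the (assumed) lexer at a new piece of text. */
void set_input(const char *text)
{
    input_cursor = text;
}

/* Return the text that has not yet been consumed. */
const char *remaining_input(void)
{
    return input_cursor;
}

/* Assumption: a statement always starts at the beginning of a line,
 * so resynchronizing means discarding the rest of the current line. */
void resynch(void)
{
    while (*input_cursor != '\0' && *input_cursor != '\n')
        input_cursor++;
    if (*input_cursor == '\n')
        input_cursor++;     /* step over the newline itself */
}
```

Because a routine of this kind moves the input past the offending text on its own, the token that triggered the error no longer corresponds to anything in the remaining input, which is why the action also needs yyclearin.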
                    120: .PP
                     121: These mechanisms are admittedly crude, but they allow a simple, fairly effective recovery of the parser
                     122: from many errors;
                     123: moreover, the user can get control to carry out
                     124: the error actions required by other portions of the program.
