.\" @(#)ss7 6.1 (Berkeley) 5/8/86
.\"
.SH
7: Error Handling
.PP
Error handling is an extremely difficult area, and many of the problems are semantic ones.
When an error is found, for example, it may be necessary to reclaim parse tree storage,
delete or alter symbol table entries, and, typically, set switches to avoid generating any further output.
.PP
It is seldom acceptable to stop all processing when an error is found; it is more useful to continue
scanning the input to find further syntax errors.
This leads to the problem of getting the parser ``restarted'' after an error.
A general class of algorithms to do this involves discarding a number of tokens
from the input string, and attempting to adjust the parser so that input can continue.
.PP
To allow the user some control over this process,
Yacc provides a simple, but reasonably general, feature.
The token name ``error'' is reserved for error handling.
This name can be used in grammar rules;
in effect, it suggests places where errors are expected, and recovery might take place.
The parser pops its stack until it enters a state where the token ``error'' is legal.
It then behaves as if the token ``error'' were the current lookahead token,
and performs the action encountered.
The lookahead token is then reset to the token that caused the error.
If no special error rules have been specified, the processing halts when an error is detected.
.PP
In order to prevent a cascade of error messages, the parser, after
detecting an error, remains in error state until three tokens have been successfully
read and shifted.
If an error is detected when the parser is already in error state,
no message is given, and the input token is quietly deleted.
.PP
As an example, a rule of the form
.DS
stat    :    error
.DE
would, in effect, mean that on a syntax error the parser would attempt to skip over the statement
in which the error was seen.
More precisely, the parser will
scan ahead, looking for three tokens that might legally follow
a statement, and start processing at the first of these; if
the beginnings of statements are not sufficiently distinctive, it may make a
false start in the middle of a statement, and end up reporting a
second error where there is in fact no error.
.PP
Actions may be used with these special error rules.
These actions might attempt to reinitialize tables, reclaim symbol table space, etc.
.PP
Error rules such as the above are very general, but difficult to control.
Somewhat easier are rules such as
.DS
stat    :    error  \';\'
.DE
Here, when there is an error, the parser attempts to skip over the statement, but
will do so by skipping to the next \';\'.
All tokens after the error and before the next \';\' cannot be shifted, and are discarded.
When the \';\' is seen, this rule will be reduced, and any ``cleanup''
action associated with it performed.
.PP
Another form of error rule arises in interactive applications, where
it may be desirable to permit a line to be reentered after an error.
A possible error rule might be
.DS
input    :    error  \'\en\'  {  printf( "Reenter last line: " );  }  input
              {  $$  =  $4;  }
.DE
There is one potential difficulty with this approach;
the parser must correctly process three input tokens before it
admits that it has correctly resynchronized after the error.
If the reentered line contains an error
in the first two tokens, the parser deletes the offending tokens,
and gives no message; this is clearly unacceptable.
For this reason, there is a mechanism that
can be used to force the parser
to believe that an error has been fully recovered from.
The statement
.DS
yyerrok ;
.DE
in an action
resets the parser to its normal mode.
The last example is better written
.DS
input    :    error  \'\en\'
                  {  yyerrok;
                     printf( "Reenter last line: " );  }
              input
                  {  $$  =  $4;  }
         ;
.DE
.PP
As mentioned above, the token seen immediately
after the ``error'' symbol is the input token at which the
error was discovered.
Sometimes, this is inappropriate; for example, an
error recovery action might
take upon itself the job of finding the correct place to resume input.
In this case,
the previous lookahead token must be cleared.
The statement
.DS
yyclearin ;
.DE
in an action will have this effect.
For example, suppose the action after error
were to call some sophisticated resynchronization routine,
supplied by the user, that attempted to advance the input to the
beginning of the next valid statement.
After this routine was called, the next token returned by yylex would presumably
be the first token in a legal statement;
the old, illegal token must be discarded, and the error state reset.
This could be done by a rule like
.DS
stat    :    error
                  {  resynch();
                     yyerrok ;
                     yyclearin ;  }
         ;
.DE
.PP
These mechanisms are admittedly crude, but do allow for a simple, fairly effective recovery of the parser
from many errors;
moreover, the user can get control to deal with
the error actions required by other portions of the program.