Annotation of 43BSDReno/share/doc/ps1/15.yacc/ss0, revision 1.1

1.1     ! root        1: .\"    @(#)ss0 6.1 (Berkeley) 5/8/86
        !             2: .\"
        !             3: .SH
        !             4: 0: Introduction
        !             5: .PP
        !             6: Yacc provides a general tool for imposing structure on the input to a computer program.
        !             7: The Yacc user prepares a
        !             8: specification of the input process; this includes rules
        !             9: describing the input structure, code to be invoked when these
        !            10: rules are recognized, and a low-level routine to do the
        !            11: basic input.
        !            12: Yacc then generates a function to control the input process.
        !            13: This function, called a
        !            14: .I parser ,
        !            15: calls the user-supplied low-level input routine
        !            16: (the
        !            17: .I "lexical analyzer" )
        !            18: to pick up the basic items
        !            19: (called
        !            20: .I tokens )
        !            21: from the input stream.
        !            22: These tokens are organized according to the input structure rules,
        !            23: called
        !            24: .I "grammar rules" \|;
        !            25: when one of these rules has been recognized,
        !            26: then user code supplied for this rule, an
        !            27: .I action ,
        !            28: is invoked; actions have the ability to return values and
        !            29: make use of the values of other actions.
        !            30: .PP
        !            31: Yacc is written in a portable dialect of C
        !            32: .[
        !            33: Ritchie Kernighan Language Prentice
        !            34: .]
        !            35: and the actions, and output subroutine, are in C as well.
        !            36: Moreover, many of the syntactic conventions of Yacc follow C.
        !            37: .PP
        !            38: The heart of the input specification is a collection of grammar rules.
        !            39: Each rule describes an allowable structure and gives it a name.
        !            40: For example, one grammar rule might be
        !            41: .DS
        !            42: date  :  month\_name  day  \',\'  year   ;
        !            43: .DE
        !            44: Here,
        !            45: .I date ,
        !            46: .I month\_name ,
        !            47: .I day ,
        !            48: and
        !            49: .I year
        !            50: represent structures of interest in the input process;
        !            51: presumably,
        !            52: .I month\_name ,
        !            53: .I day ,
        !            54: and
        !            55: .I year
        !            56: are defined elsewhere.
        !            57: The comma ``,'' is enclosed in single quotes; this implies that the
        !            58: comma is to appear literally in the input.
        !            59: The colon and semicolon merely serve as punctuation in the rule, and have
        !            60: no significance in controlling the input.
        !            61: Thus, with proper definitions, the input
        !            62: .DS
        !            63: July  4, 1776
        !            64: .DE
        !            65: might be matched by the above rule.
        !            66: .PP
        !            67: An important part of the input process is carried out by the
        !            68: lexical analyzer.
        !            69: This user routine reads the input stream, recognizing the lower level structures,
        !            70: and communicates these tokens
        !            71: to the parser.
        !            72: For historical reasons, a structure recognized by the lexical analyzer is called a
        !            73: .I "terminal symbol" ,
        !            74: while the structure recognized by the parser is called a
        !            75: .I "nonterminal symbol" .
        !            76: To avoid confusion, terminal symbols will usually be referred to as
        !            77: .I tokens .
        !            78: .PP
        !            79: There is considerable leeway in deciding whether to recognize structures using the lexical
        !            80: analyzer or grammar rules.
        !            81: For example, the rules
        !            82: .DS
        !            83: month\_name  :  \'J\' \'a\' \'n\'   ;
        !            84: month\_name  :  \'F\' \'e\' \'b\'   ;
        !            85: 
        !            86:          . . .
        !            87: 
        !            88: month\_name  :  \'D\' \'e\' \'c\'   ;
        !            89: .DE
        !            90: might be used in the above example.
        !            91: The lexical analyzer would only need to recognize individual letters, and
        !            92: .I month\_name
        !            93: would be a nonterminal symbol.
        !            94: Such low-level rules tend to waste time and space, and may
        !            95: complicate the specification beyond Yacc's ability to deal with it.
        !            96: Usually, the lexical analyzer would
        !            97: recognize the month names,
        !            98: and return an indication that a
        !            99: .I month\_name
        !           100: was seen; in this case,
        !           101: .I month\_name
        !           102: would be a token.
        !           103: .PP
        !           104: Literal characters such as ``,'' must also be passed through the lexical
        !           105: analyzer, and are also considered tokens.
        !           106: .PP
        !           107: Specification files are very flexible.
        !           108: It is realively easy to add to the above example the rule
        !           109: .DS
        !           110: date  :  month \'/\' day \'/\' year   ;
        !           111: .DE
        !           112: allowing
        !           113: .DS
        !           114: 7 / 4 / 1776
        !           115: .DE
        !           116: as a synonym for
        !           117: .DS
        !           118: July 4, 1776
        !           119: .DE
        !           120: In most cases, this new rule could be ``slipped in'' to a working system with minimal effort,
        !           121: and little danger of disrupting existing input.
        !           122: .PP
        !           123: The input being read may not conform to the
        !           124: specifications.
        !           125: These input errors are detected as early as is theoretically possible with a
        !           126: left-to-right scan;
        !           127: thus, not only is the chance of reading and computing with bad
        !           128: input data substantially reduced, but the bad data can usually be quickly found.
        !           129: Error handling,
        !           130: provided as part of the input specifications,
        !           131: permits the reentry of bad data,
        !           132: or the continuation of the input process after skipping over the bad data.
        !           133: .PP
        !           134: In some cases, Yacc fails to produce a parser when given a set of
        !           135: specifications.
        !           136: For example, the specifications may be self contradictory, or they may
        !           137: require a more powerful recognition mechanism than that available to Yacc.
        !           138: The former cases represent design errors;
        !           139: the latter cases
        !           140: can often be corrected
        !           141: by making
        !           142: the lexical analyzer
        !           143: more powerful, or by rewriting some of the grammar rules.
        !           144: While Yacc cannot handle all possible specifications, its power
        !           145: compares favorably with similar systems;
        !           146: moreover, the
        !           147: constructions which are difficult for Yacc to handle are
        !           148: also frequently difficult for human beings to handle.
        !           149: Some users have reported that the discipline of formulating valid
        !           150: Yacc specifications for their input revealed errors of
        !           151: conception or design early in the program development.
        !           152: .PP
        !           153: The theory underlying Yacc has been described elsewhere.
        !           154: .[
        !           155: Aho Johnson Surveys LR Parsing
        !           156: .]
        !           157: .[
        !           158: Aho Johnson Ullman Ambiguous Grammars
        !           159: .]
        !           160: .[
        !           161: Aho Ullman Principles Compiler Design
        !           162: .]
        !           163: Yacc has been extensively used in numerous practical applications,
        !           164: including
        !           165: .I lint ,
        !           166: .[
        !           167: Johnson Lint
        !           168: .]
        !           169: the Portable C Compiler,
        !           170: .[
        !           171: Johnson Portable Compiler Theory
        !           172: .]
        !           173: and a system for typesetting mathematics.
        !           174: .[
        !           175: Kernighan Cherry typesetting system CACM
        !           176: .]
        !           177: .PP
        !           178: The next several sections describe the
        !           179: basic process of preparing a Yacc specification;
        !           180: Section 1 describes the preparation of grammar rules,
        !           181: Section 2 the preparation of the user supplied actions associated with these rules,
        !           182: and Section 3 the preparation of lexical analyzers.
        !           183: Section 4 describes the operation of the parser.
        !           184: Section 5 discusses various reasons why Yacc may be unable to produce a
        !           185: parser from a specification, and what to do about it.
        !           186: Section 6 describes a simple mechanism for
        !           187: handling operator precedences in arithmetic expressions.
        !           188: Section 7 discusses error detection and recovery.
        !           189: Section 8 discusses the operating environment and special features
        !           190: of the parsers Yacc produces.
        !           191: Section 9 gives some suggestions which should improve the
        !           192: style and efficiency of the specifications.
        !           193: Section 10 discusses some advanced topics, and Section 11 gives
        !           194: acknowledgements.
        !           195: Appendix A has a brief example, and Appendix B gives a
        !           196: summary of the Yacc input syntax.
        !           197: Appendix C gives an example using some of the more advanced
        !           198: features of Yacc, and, finally,
        !           199: Appendix D describes mechanisms and syntax
        !           200: no longer actively supported, but
        !           201: provided for historical continuity with older versions of Yacc.

unix.superglobalmegacorp.com

This archive runs on limited infrastructure. Preserving old code on modern bandwidth. Automated agents are requested to crawl responsibly.