lex                          Command                          lex


Lexical analyzer generator

lex [-t][-v][file]
cc lex.yy.c -ll

Many programs, e.g., compilers, process highly structured input
according to rules. Two of the most complicated parts of such
programs are lexical analysis and parsing (also called syntax
analysis). The COHERENT system includes two powerful tools
called lex and yacc to help you construct these parts of a
program. lex converts a set of lexical rules into a lexical
analyzer, and yacc converts a set of parsing rules into a parser.

The output of lex may be used directly, or may be used by a
parser generated by yacc.

lex reads a specification from the given file (or from the
standard input if no file is given), and generates a C function
called yylex(). lex writes the generated function in the file
lex.yy.c, or on standard output if you use the -t option. The -v
option prints some statistics about the generated tables.

The tutorial on lex that appears in this manual describes lex in
detail. In brief, the generated function yylex() matches
portions of its input to one pattern (sometimes called a regular
expression) from a set of rules, or context, and executes the
associated C commands. Unmatched portions of the input are copied
to the output stream. yylex() returns EOF when input has been
exhausted.

lex uses the following macros that you may replace with the
preprocessor directive #undef if you wish: input() (read the
standard input stream), and output(c) (write the character c to
the standard output stream). You may also replace the following
functions if you wish: main() (main function), error(...) (print
error messages; takes same arguments as printf), and yywrap()
(handle events at the end of a file). If an action is desired on
end of file, such as arranging for more input, yywrap() should
perform it, returning zero to keep going.

A full lex specification has the following format:

* Macro definitions, of the form:
        name pattern

* Start condition declarations:
        %S NAME ...

* Context declarations:
        %C NAME ...

* Code to be included in the header section:
        %{
        anything
        %}
        <tab or space> anything

* Rules section delimiter (must always be present):
        %%

* Code to appear at the start of yylex():
        <tab or space> anything

* Rules for initial context, in any of the forms:
        rule    action;
        rule    |       (means use next action)
        rule    {
        <tab or space> action;
        <tab or space> }

* For each additional context:
        %C NAME
        ...rules for this context...

* End of rules section delimiter:
        %%

* Code to be copied verbatim, such as user provided input(),
  output(), yywrap(), or other.

lex matches the longest string possible; if two rules match the
same length string, the rule specified first takes precedence.
lex puts the matched string, or token, in the char array
yytext[], and sets the variable yyleng to its length.

Actions may use the following:

ECHO            Output the token
REJECT          Perform action for lower precedence match
BEGIN NAME      Set start condition to NAME
BEGIN 0         Clear start condition
yyswitch(NAME)  Switch to context NAME, return current
yyswitch(0)     Switch to initial context
yynext()        Steal next character from input
yyback(c)       Put character c back into input
yyless(n)       Reduce token length to n, put rest back
yymore()        Append next token to this one
yylook()        Returns number of chars in input buffer

lex rules are contiguous strings of the form

        [ <NAME,...> ][ ^ ] token [ /lookahead ][ $ ]

where brackets `[]' indicate optional items.

<NAME,...>      Match only under given start conditions
^               Match the beginning of a line
$               Match the end of a line
token           Pattern that a given token is to match
/lookahead      Pattern that given trailing text is to match

Pattern elements:

a               The character a
\a              The character a, even if special
.               Any character except newline
[abx-z]         Any of a, b, or x through z
[^abx-z]        Any except a, b, or x through z
"abc"           The string abc, even if any are special
{name}          The macro definition name
(exp)           The pattern exp (grouping operator)

Optional operators on elements:

e?              Zero or one occurrence of e
e*              Zero or more consecutive e's
e+              One or more consecutive e's
e{n}            n (a decimal number) consecutive e's
e{m,n}          m through n consecutive e's

Patterns may be of the form:

e1e2            Matches the sequence e1 e2
e1|e2           Matches either e1 or e2

lex recognizes the standard C escapes: \n, \t, \r, \b, \f, and
\ooo (octal representation). The special characters

        \ ( ) < > { } % * + ? [ - ] ^ / $ . |

must be prefixed with \ or enclosed within quotation marks
(excepting " and \) to be normal.
Within classes, only the characters . ^ - \ and ] are special.

***** Files *****

/usr/lib/libl.a

***** See Also *****

commands, yacc
Introduction to lex, the Lexical Analyzer

COHERENT Lexicon
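
The matching rule stated in this entry (lex matches the longest
string possible, and if two rules match the same length string,
the rule specified first takes precedence) can be illustrated by
a small hand-written C sketch. This is not the code that lex
generates. The names longest_match(), match_if(), match_ident(),
match_number(), and the T_ constants are hypothetical and exist
only for this illustration; the three "rules" stand for the
patterns if, [a-zA-Z]+, and [0-9]+ given in that order.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical rule numbers for this sketch, in specification order. */
enum { T_NONE = -1, T_IF = 0, T_IDENT = 1, T_NUMBER = 2 };

/* Each matcher returns how many leading characters of s it matches,
 * or 0 if its pattern does not match at s. */
static size_t match_if(const char *s)
{
    return (s[0] == 'i' && s[1] == 'f') ? 2 : 0;
}

static size_t match_ident(const char *s)
{
    size_t n = 0;

    while (isalpha((unsigned char)s[n]))
        n++;
    return n;
}

static size_t match_number(const char *s)
{
    size_t n = 0;

    while (isdigit((unsigned char)s[n]))
        n++;
    return n;
}

/* Rules listed in the order they would appear in a specification. */
static size_t (*const rules[])(const char *) = {
    match_if, match_ident, match_number
};

/*
 * Return the number of the rule with the longest match at s, or
 * T_NONE if no rule matches.  A later rule replaces an earlier one
 * only when its match is strictly longer, so ties go to the rule
 * specified first.  The match length is stored in *len, the role
 * that yyleng plays for a generated analyzer.
 */
int longest_match(const char *s, size_t *len)
{
    int best = T_NONE;
    size_t best_len = 0;
    size_t i;

    assert(s != NULL && len != NULL);
    for (i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        size_t n = rules[i](s);

        if (n > best_len) {
            best_len = n;
            best = (int)i;
        }
    }
    *len = best_len;
    return best;
}
```

With these rules, the input if selects rule T_IF, because both
the first and second rules match two characters and the first
wins the tie. The input iffy selects T_IDENT, because the second
rule matches four characters, which is longer.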