


lex                          Command                          lex




Lexical analyzer generator

lex [-t][-v][file]
cc lex.yy.c -ll

Many programs,  e.g., compilers, process  highly structured input
according to  rules.  Two of  the most complicated  parts of such
programs  are lexical  analysis and  parsing (also  called syntax
analysis).   The  COHERENT  system  includes two  powerful  tools
called  lex and  yacc  to help  you  construct these  parts of  a
program.   lex converts  a set  of lexical  rules into  a lexical
analyzer, and yacc converts a set of parsing rules into a parser.

The output of lex may be  used directly, or may be used by a par-
ser generated by yacc.

lex reads a specification from  the given file (or from the stan-
dard input  if none), and generates a  C function called yylex().
lex writes  the generated  function in  the file lex.yy.c,  or on
standard output  if you use the -t option.   The -v option prints
some statistics about the generated tables.

The tutorial on lex that appears in this manual describes lex in
detail.  In brief, the generated function yylex() matches portions
of its input to one pattern (sometimes called a regular expression)
from a set of rules, or context, and executes the associated C
commands.  Unmatched portions of the input are copied to the output
stream.  yylex() returns EOF when input has been exhausted.

lex uses the following macros that you may replace with the
preprocessor directive #undef if you wish: input() (read the
standard input stream), and output(c) (write the character c to
the standard output stream).  You may also replace the following
functions if you wish: main() (main function), error(...) (print
error messages; takes same arguments as printf), and yywrap()
(handle events at the end of a file).  If an action is desired on
end of file, such as arranging for more input, yywrap() should
perform it, returning zero to keep going.
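
The end-of-file hook described above can be sketched as follows.  This
is only an illustration under stated assumptions: next_file and
set_inputs() are hypothetical names, and the nonzero return for "no
more input" follows common lex practice, since the text here only
specifies the zero case.  Because input() reads the standard input
stream, the sketch arranges for more input by reopening stdin.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical NULL-terminated list of further input files. */
static char **next_file;

/* Hypothetical helper: record the files to read after the first. */
void set_inputs(char **files)
{
    assert(files != NULL);
    next_file = files;
}

/*
 * User-supplied yywrap(), called when the current input is exhausted.
 * Returns 0 once more input has been arranged ("keep going").
 * Returns nonzero when nothing is left; that value is an assumption
 * taken from common lex practice.
 */
int yywrap(void)
{
    while (next_file != NULL && *next_file != NULL) {
        /* input() reads stdin, so point stdin at the next file. */
        if (freopen(*next_file++, "r", stdin) != NULL)
            return 0;
    }
    return 1;
}
```

A replacement main() would call set_inputs() with the remaining file
names before it calls yylex().  Files that cannot be opened are
skipped silently here; a real program would probably report them.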

A full lex specification has the following format:

*  Macro definitions, of the form:
           name    pattern

*  Start condition declarations:
           %S      NAME ...

*  Context declarations:
           %C      NAME ...

*  Code to be included in the header section:
           %{
           anything
           %}
           <tab or space> anything




*  Rules section delimiter (must always be present):
           %%

*  Code to appear at the start of yylex():
           <tab or space> anything

*  Rules for initial context, in any of the forms:
           rule            action;
           rule            | (means use next action)
           rule            {
           <tab or space>  action;
           <tab or space>  }

*  For each additional context:
           %C      NAME
           ...rules for this context...

*  End of rules section delimiter:
           %%

*  Code to be copied verbatim, such as user-provided input(),
   output(), yywrap(), or other.

lex matches  the longest string possible; if  two rules match the
same length  string, the  rule specified first  takes precedence.
lex  puts  the  matched  string,  or  token, in  the  char  array
yytext[], and sets the variable yyleng to its length.
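
The selection rule above (longest match, ties going to the rule given
first) can be modelled in a few lines of C.  This is a sketch of the
rule only, not of the tables lex generates: matcher, match_if(),
match_word() and pick_rule() are hypothetical names, and the two
matchers stand in for the patterns if and [a-z]+.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* A matcher returns how many leading characters of s it accepts;
 * 0 means no match.  Matchers stand in for compiled patterns. */
typedef size_t (*matcher)(const char *s);

/* Stands in for the rule:  if  */
size_t match_if(const char *s)
{
    return strncmp(s, "if", 2) == 0 ? 2 : 0;
}

/* Stands in for the rule:  [a-z]+  */
size_t match_word(const char *s)
{
    size_t n = 0;

    while (islower((unsigned char)s[n]))
        n++;
    return n;
}

/*
 * Return the index of the winning rule, or -1 if none matches.
 * *lenp receives the match length (what yyleng would hold).
 */
int pick_rule(const matcher rules[], int nrules,
              const char *input, size_t *lenp)
{
    int best = -1;
    size_t bestlen = 0;
    int i;

    assert(rules != NULL && input != NULL && lenp != NULL);
    for (i = 0; i < nrules; i++) {
        size_t len = rules[i](input);
        /* Strict '>' keeps the earlier rule when lengths tie. */
        if (len > bestlen) {
            best = i;
            bestlen = len;
        }
    }
    *lenp = bestlen;
    return best;
}
```

With the rules ordered as { match_if, match_word }, the input "if x"
selects rule 0 because both rules match two characters and the first
one wins, while "iffy" selects rule 1 because it matches four.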

Actions may use the following:


     ECHO           Output the token
     REJECT         Perform action for lower precedence match
     BEGIN NAME     Set start condition to NAME
     BEGIN 0        Clear start condition
     yyswitch(NAME) Switch to context NAME, return current
     yyswitch(0)    Switch to initial context
     yynext()       Steal next character from input
     yyback(c)      Put character c back into input
     yyless(n)      Reduce token length to n, put rest back
     yymore()       Append next token to this one
     yylook()       Returns number of chars in input buffer


lex rules are contiguous strings of the form


     [ <NAME,...> ][ ^ ] token [ /lookahead ][ $ ]


where brackets `[]' indicate optional items.


     <NAME,...>     Match only under given start conditions
     ^              Match the beginning of a line
     $              Match the end of a line
     token          Pattern that a given token is to match
     /lookahead     Pattern that given trailing text is to match



Pattern elements:


     a        The character a
     \a       The character a, even if special
     .        Any character except newline
     [abx-z]  Any of a, b, or x through z
     [^abx-z] Any except a, b, or x through z
     "abc"    The string abc, even if any are special
     {name}   The macro definition name
     (exp)    The pattern exp (grouping operator)


Optional operators on elements:


     e?       Zero or one occurrence of e
     e*       Zero or more consecutive e's
     e+       One or more consecutive e's
     e{n}     n (a decimal number) consecutive e's
     e{m,n}   m through n consecutive e's


Patterns may be of the form:


     e1e2     Matches the sequence e1 e2
     e1|e2    Matches either e1 or e2


lex recognizes the standard C escapes: \n, \t, \r, \b, \f, and
\ooo (octal representation).  The special characters


         \ ( ) < > { } % * + ? [ - ] ^ / $ . |


must be  prefixed with \ or enclosed  within quotation marks (ex-
cepting " and \) to  be normal.  Within classes, only the charac-
ters . ^ - \ and ] are special.
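
When a literal string is to be placed in a pattern outside a class,
the backslash rule above can be applied mechanically.  The following
is a minimal sketch: lex_quote() and lex_special are hypothetical
names, not part of lex or libl.a.  The set of characters is the one
listed above, with the double quote added as an assumption because it
delimits quoted strings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Special characters as listed above, plus '"' (an assumption). */
static const char lex_special[] = "\\()<>{}%*+?[-]^/$.|\"";

/*
 * Copy src into dst, prefixing each special character with a
 * backslash.  Returns the length written, not counting the NUL,
 * or (size_t)-1 if dst is too small.
 */
size_t lex_quote(char *dst, size_t dstsize, const char *src)
{
    size_t n = 0;

    assert(dst != NULL && src != NULL);
    for (; *src != '\0'; src++) {
        size_t need = strchr(lex_special, *src) != NULL ? 2 : 1;

        /* Leave room for these characters and the final NUL. */
        if (n + need >= dstsize)
            return (size_t)-1;
        if (need == 2)
            dst[n++] = '\\';
        dst[n++] = *src;
    }
    if (n >= dstsize)
        return (size_t)-1;
    dst[n] = '\0';
    return n;
}
```

For example, the string a.b becomes a\.b, which matches only the three
characters a, dot, b.  Blanks are not handled here; since rules are
contiguous strings, a literal containing a blank would still need to
be enclosed in quotation marks.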

***** Files *****

/usr/lib/libl.a

***** See Also *****

commands, yacc
Introduction to lex, the Lexical Analyzer






