|
|
1.1 ! root 1: This is minimal documentation for the lexer driver produced by ml-lex. ! 2: ! 3: Main data structures: ! 4: ! 5: The transition table is stored in tab. Tab is an array of records, indexed ! 6: by state number. The first field of the record, fin, is a list of final leaves ! 7: assocated with it. The second field of the record, trans, is a transition ! 8: table for the state indexed by character number. It gives the next state ! 9: for a given input character. ! 10: ! 11: The usual initial start state is state #1. State 0 is a dead state, which ! 12: has transitions only to itself. ! 13: ! 14: The field yyfin has type yyfinstate list. yyfinstate consists of the ! 15: following three constructors: ! 16: ! 17: * N of int - indicates normal end leaf. ! 18: * D of int - dummy end leaf - for indicating when an end state for ! 19: a trailing context regular expression has been reached. These are ! 20: stored and propagated backwards when action is executed. ! 21: * T of int - indicates an actual end leaf for a trailing context reg. ! 22: expression, which should be executed only if D i was encountered ! 23: after this end leaf while scanning forward. The dummy end leaf is ! 24: removed from the backward propagating list after this node is ! 25: encountered. ! 26: ! 27: ! 28: The function scan inside the function lex operates as a transition ! 29: function, scanning the input until it is no longer possible to take any ! 30: more transitions. It accumulates a list of the accepting leaf list ! 31: associated with each accepting state passed through. ! 32: ! 33: Scan operates as follows: ! 34: ! 35: Input: * s - current state ! 36: * AcceptingLeaves - list of accepting leave lists. Each state ! 37: has a list of accepting leaves associated with it. This list ! 38: may be nil if the state is not a final state. ! 39: * l - position of the next character in the buffer b to read ! 40: * i0 - starting position in the buffer. ! 41: ! 42: Output: If no match is found, it raises the exception LexError. ! 43: Otherwise, it returns a value of type lexresult. ! 44: ! 45: It operates as a transtion function: ! 46: It (1) adds the list of accepting leaves for the current state to ! 47: the list of accepting leave lists ! 48: (2) tries to make a transition on the current input character ! 49: to the next state. If it can't make a transition, it ! 50: executes the action function. ! 51: (a) - if it is past the end of the buffer, it ! 52: (1) checks if it as at end eof. If it is then: ! 53: It checks to see if it has made any ! 54: transitions since it was first called - ! 55: (l>i0 when this is true.) If it hasn't ! 56: this implies that scan was called at ! 57: the end of file. It thus executes ! 58: eof function declared by the user. ! 59: Otherwise it must execute action w/ ! 60: the current accepting state list. ! 61: (2) otherwise it reads a block of up to 1024 ! 62: characters, and appends this block to the ! 63: useful suffix of characters left in the ! 64: buffer (those character which have been ! 65: scanned in this call to lex()). The buffer ! 66: operation should be altered if one intends ! 67: to process reg. expressions whose lexemes' ! 68: length will be >> 1024. For most normal ! 69: applications, the buffer update operation ! 70: will be fine. ! 71: ! 72: This buffer update operation requires ! 73: O(n^2/1024) char. copies for lexemes > 1024 ! 74: characters in length, and O(n) char. copies ! 75: for lexemes <= 1024 characters in length. ! 76: It can be made O(n) using linked list ! 77: buffers & a Byte.array of size n (not the ! 78: ^operator!) for concatenating the buffers ! 79: to return a value for yytext when a lexeme ! 80: is longer than the typical buffer length. ! 81: ! 82: (3) If the transition is to a dead state (0 is used ! 83: for the dead state), action is executed instead.
This archive runs on limited infrastructure. Preserving old code on modern bandwidth. Automated agents are requested to crawl responsibly.