Annotation of researchv10no/cmd/sml/lib/lexgen/note, revision 1.1.1.1

1.1       root        1: From: sufrin%[email protected]
                      2: Message-Id: <[email protected]>
                      3: Date: Sat Dec 10 17:13:18 1988
                      4: To: dbm <@RELAY.CS.NET:[email protected]>
                      5: Subject: small suggestion for Andrew (re lexgen)
                      6: Reply-To: Bernard Sufrin <sufrin%[email protected]>
                      7: 
                      8: 
                      9: Dave: sorry to send stuff this way,
                     10: but I can't find Andrew's email address. Bernard
                     11: 
                     12: Lexgen is very useful. A very simple
                     13: improvement (which I've implemented here) is to
                     14: give programmers the option of avoiding
                     15: construction of a substring every time yytext
                     16: is bound. Whilst "substring" is the simplest
                     17: way to "internalise" a string, the
                     18: programmer may be able to do better by taking the
                     19: "substring specification" (!yyb,i0,i-i0) and
                     20: internalising it by some other means (which
                     21: involves comparing the substring spec with
                     22: already-internalised strings by any of the
                     23: usual methods: I usually use open-hash +
                     24: linear search).
                     25: 
                     26: 
                     27: When employing this trick in a couple of
                     28: compilers and our proof-support system I've
                     29: found the consequent reduction of garbage
                     30: collector cycles to be well worthwhile
                     31: (instead of one substring construction per
                     32: symbol of the text being analysed we get one
                     33: substring constructed per DISTINCT symbol).
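
The internalising scheme described above can be sketched roughly as follows. This is only an illustration, not the code used in the compilers mentioned: the names Intern, intern, nBuckets and built are hypothetical, the table size is arbitrary, and "open-hash + linear search" is read here as hashed buckets whose chains are searched linearly. It uses present-day Basis Library names (String.sub, String.substring, Array) rather than the 1988 environment.

```sml
(* Sketch: internalise a substring spec (buf, start, len), building a
   fresh string only when the symbol has not been seen before. *)
structure Intern =
struct
  val nBuckets = 211                 (* arbitrary table size (assumption) *)
  val table : string list array = Array.array (nBuckets, [])
  val built = ref 0                  (* counts substring constructions *)

  (* Hash the characters of the spec in place, without copying them. *)
  fun hash (buf, start, len) =
    let
      fun loop (k, h) =
        if k >= len then h
        else
          let val c = Char.ord (String.sub (buf, start + k))
          in loop (k + 1, (h * 31 + c) mod nBuckets) end
    in
      loop (0, 0)
    end

  (* Compare an already-internalised string with the spec, char by char. *)
  fun matches (s, (buf, start, len)) =
    String.size s = len andalso
    let
      fun loop k =
        k >= len orelse
        (String.sub (s, k) = String.sub (buf, start + k) andalso loop (k + 1))
    in
      loop 0
    end

  (* Linear search of the bucket; construct the substring only on a miss. *)
  fun intern (spec as (buf, start, len)) =
    let
      val h = hash spec
      val bucket = Array.sub (table, h)
    in
      case List.find (fn s => matches (s, spec)) bucket of
        SOME s => s
      | NONE =>
          let
            val s = String.substring (buf, start, len)
          in
            built := !built + 1;
            Array.update (table, h, s :: bucket);
            s
          end
    end
end
```

With %nosubstring in effect, a rule action would receive yytext as such a triple and could pass it straight to something like Intern.intern yytext; the built counter then grows once per distinct symbol rather than once per token.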
                     34: 
                     35: I've added an option in the lexgen
                     36: definitions section: %nosubstring turns the binding
                     37: of yytext from
                     38: 
                     39:        val yytext = substring(!yyb,i0,i-i0)
                     40: into
                     41:        val yytext = (!yyb,i0,i-i0:int)
                     42:        
                     43: 
                     44: Here's the diff: 
                     45: 
                     46: 92c94
                     47: <           COUNT | REJECT | FULLCHARSET | STRUCT
                     48: ---
                     49: >           COUNT | REJECT | FULLCHARSET | STRUCT | NOSUBSTRING
                     50: 118c120
                     51: <    val StrName = ref "Mlex"
                     52: ---
                     53: >    val StrName = ref "Mlex";
                     54: 120,121c122
                     55: <    val ResetFlags = fn () => (CountNewLines := false; HaveReject := false;
                     56: <                              CharSetSize := 128; StrName := "Mlex")
                     57: ---
                     58: >    (* Can obliterate substring operator *)
                     59: 122a124,129
                     60: >    val SubStrName = ref "substring";
                     61: > 
                     62: >    val ResetFlags = fn () =>
                     63: >    (CountNewLines := false; HaveReject := false;
                     64: >     CharSetSize := 128; StrName := "Mlex"; SubStrName := "substring" )
                     65: > 
                     66: 325a333,334
                     67: >                                  else if command = "nosubstring" then
                     68: >                                         NOSUBSTRING
                     69: 606a616,617
                     70: >               | NOSUBSTRING =>
                     71: >                          (SubStrName := ""; ParseDefs())
                     72: 958c969
                     73: <        sayln "\t\t\t(let val yytext = substring(!yyb,i0,i-i0)";
                     74: ---
                     75: >        say "\t\t\t(let val yytext = "; say(!SubStrName); sayln "(!yyb,i0,i-i0:int)";
                     76: 
                     77: 
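
The last hunk of the diff works because an empty function name turns the application into a plain tuple. A small sketch of that string-building step, using a hypothetical helper yytextBinding that stands in for the say/sayln sequence (the leading tabs are omitted):

```sml
(* Hypothetical helper mirroring the modified say sequence in the diff:
   the text emitted for the yytext binding, given the current SubStrName. *)
fun yytextBinding subStrName =
  "(let val yytext = " ^ subStrName ^ "(!yyb,i0,i-i0:int)"

(* Default: SubStrName is "substring", so a function call is emitted. *)
val withSubstring = yytextBinding "substring"

(* After %nosubstring: SubStrName is "", so only the triple remains. *)
val withoutSubstring = yytextBinding ""
```

Note that with this change the default case also emits the :int constraint, since both forms share the same tail. The constraint is presumably there so that the type of i-i0 is still determined once substring no longer fixes it, though the note does not say so.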
