.\" Copyright (c) 1980 Regents of the University of California.
.\" All rights reserved. The Berkeley software License Agreement
.\" specifies the terms and conditions for redistribution.
.\"
.\" @(#)ch7.n 6.2 (Berkeley) 5/14/86
.\"
." $Header: ch7.n,v 1.3 83/07/01 11:22:58 layer Exp $
.Lc The\ Lisp\ Reader 7
.sh 2 Introduction \n(ch 1
.pp
The
.i read
function is responsible for converting
a stream of
characters into a Lisp expression.
.i Read
is table driven and the table it uses is called a
.i readtable .
The
.i print
function does the
inverse of
.i read ;
it converts a Lisp expression into a stream of
characters.
Typically the conversion is done in such
a way that if that stream of characters were read by
.i read ,
the
result would be an expression equal to the one
.i print
was given.
.i Print
must also refer to the readtable in order to determine
how to format its output.
The
.i explode
function, which returns a list of characters rather than
printing them, must also refer to the readtable.
.pp
A readtable is created
with the
.i makereadtable
function, modified with the
.i setsyntax
function and interrogated with the
.i getsyntax
function.
The structure of a readtable is hidden from the user - a
readtable should
only be manipulated with the three functions mentioned above.
.pp
There is one distinguished readtable called the
.i current
.i readtable
whose value determines what
.i read ,
.i print
and
.i explode
do.
The current readtable is the value of the symbol
.i readtable .
Thus it is possible to rapidly change
the current syntax by lambda binding
a different readtable to the symbol
.i readtable .
When the binding is undone, the syntax reverts to its old form.
.sh +0 Syntax\ Classes
.pp
The readtable describes how each of the 128 ASCII characters should
be treated by the reader and printer.
Each character belongs to a
.i syntax
.i class
which has three properties:
.ip character\ class\ -
Tells what the reader should do when it sees this character.
There are a large number of character classes.
They are described below.
.ip separator\ -
Most types of tokens the reader constructs are one character
long.
Four token types have an arbitrary length: number (1234),
symbol print name (franz),
escaped symbol print name (|franz|), and string ("franz").
The reader can easily determine when it has
come to the
end of one of the last two types: it just looks for the
matching delimiter (| or ").
When the reader is reading a number or symbol print name, it
stops reading when it comes to a character with the
.i separator
property.
The separator character is pushed back into the input stream and will
be the first character read when the reader is called again.
.ip escape\ -
Tells the printer when to put escapes in front of, or around, a symbol
whose print name contains this character.
There are three possibilities: always escape a symbol with this character
in it, only escape a symbol if this is the only character in the symbol,
and only escape a symbol if this is the first character in the symbol.
[note: The printer will always escape a symbol which, if printed out, would
look like a valid number.]
.pp
When the Lisp system is built, Lisp code is added to a C-coded kernel
and the result becomes the standard Lisp system.
The readtable present in the C-coded kernel, called the
.i raw
.i readtable ,
contains the bare necessities for reading in Lisp code.
During the
construction of the complete Lisp system,
a copy is made of the raw readtable and
then the copy is modified by adding macro characters.
The result is what is called the
.i standard
.i readtable .
When a new readtable is created with
.i makereadtable ,
a copy is made of either the
raw readtable
or the current readtable (which is likely to be the standard readtable).
.sh +0 Reader\ Operations
.pp
The reader has a very simple algorithm.
It is either
.i scanning
for a token,
.i collecting
a token,
or
.i processing
a token.
Scanning involves reading characters and throwing
away those which don't start tokens (such as blanks and tabs).
Collecting means gathering the characters which make up a
token into a buffer.
Processing may involve creating symbols, strings, lists,
fixnums, bignums or flonums or calling a user written function called
a character macro.
.pp
The components of the syntax class determine when the reader
switches between the scanning, collecting and processing states.
The reader will continue scanning as long as the character class
of the characters it reads is
.i cseparator .
When it reads a character whose character class is not
.i cseparator
it stores that character in its buffer and begins the collecting phase.
.pp
If the character class of that first character is
.i ccharacter ,
.i cnumber ,
.i cperiod ,
or
.i csign ,
then it will continue collecting until it runs into a character whose
syntax class has the
.i separator
property.
(That last character will be pushed back into the input buffer and will
be the first character read next time.)
Now the reader goes into the processing phase, checking to see if the
token it read is a number or symbol.
It is important to note that after
the first character is collected the component of the syntax class which
tells the reader to stop
collecting is the
.i separator
property, not the character class.
.pp
If the character class of the character which stopped the scanning is not
.i ccharacter ,
.i cnumber ,
.i cperiod ,
or
.i csign ,
then the reader processes that character immediately.
The character classes
.i csingle-macro ,
.i csingle-splicing-macro ,
and
.i csingle-infix-macro
will act like
.i ccharacter
if the following character is not a
.i separator .
The processing which is done for a given character class
is described in detail in the next section.
.sh +0 Character\ Classes
.de Cc
.sp 2v
.tl '\fI\\$1\fP''raw readtable:\\$2'
.tl '''standard readtable:\\$3'
..
.pc
.Cc ccharacter A-Z\ a-z\ ^H\ !#$%&*,/:;<=>?@^_`{}~ A-Z\ a-z\ ^H\ !$%&*/:;<=>?@^_{}~
.pc %
A normal character.
.Cc cnumber 0-9 0-9
This type is a digit.
The syntax for an integer (fixnum or bignum) is a string of
.i cnumber
characters optionally followed by a
.i cperiod .
If the digits are not followed by a
.i cperiod ,
then they are interpreted in base
.i ibase
which must be eight or ten.
The syntax for a floating point number is
either zero or more
.i cnumber 's
followed by a
.i cperiod
and then followed by one or more
.i cnumber 's.
A floating point number
may also be an integer or floating point number followed
by 'e' or 'd', an optional '+' or '\-'
and then zero or more
.i cnumber 's.
.Cc csign +\- +\-
A leading sign for a number.
No other characters should be given this class.
.Cc cleft-paren ( (
A left parenthesis.
Tells the reader to begin forming a list.
.Cc cright-paren ) )
A right parenthesis.
Tells the reader that it has reached the end of a list.
.Cc cleft-bracket [ [
A left bracket.
Tells the reader that it should begin forming a list.
See the description of
.i cright-bracket
for the difference between cleft-bracket and cleft-paren.
.Cc cright-bracket ] ]
A right bracket.
A
.i cright-bracket
finishes the formation of the current
list and all enclosing lists until it finds one which
begins with a
.i cleft-bracket
or until it reaches the
top level list.
.Cc cperiod . .
The period is used to separate elements of a cons cell
[e.g. (a\ .\ (b\ .\ nil)) is the same as (a\ b)].
.i cperiod
is also used in numbers as described above.
.Cc cseparator ^I-^M\ esc\ space ^I-^M\ esc\ space
Separates tokens. When the reader is scanning, these characters
are passed over.
Note: there is a difference between the
.i cseparator
character class and the
.i separator
property of a syntax class.
.Cc csingle-quote \\' \\'
This causes
.i read
to be called recursively and the list
(quote <value read>) to be returned.
.Cc csymbol-delimiter | |
This causes the reader to begin collecting characters and to stop only
when another identical
.i csymbol-delimiter
is seen.
The only way to escape a
.i csymbol-delimiter
within a symbol name is with a
.i cescape
character.
The collected characters are converted into a string which becomes
the print name of a symbol.
If a symbol with an identical print name already exists, then the
allocation is not done; rather, the existing symbol is used.
.Cc cescape \e \e
This causes the next character read in to be treated as a
.b vcharacter .
A character whose syntax class is
.b vcharacter
has a character class
.i ccharacter
and does not have
the
.i separator
property so it will not separate symbols.
.Cc cstring-delimiter """" """"
This is the same as
.i csymbol-delimiter
except the result is returned as a string instead of a symbol.
.Cc csingle-character-symbol none none
This returns a symbol whose print name is the single character
which has been collected.
.Cc cmacro none `,
The reader calls the macro function associated with this character and
the current readtable, passing it no arguments.
The result of the macro is added to the structure the reader is building,
just as if that form were directly read by the reader.
More details on macros are provided below.
.Cc csplicing-macro none #;
A
.i csplicing-macro
differs from a
.i cmacro
in the way the result is incorporated in the structure the reader is
building.
A
.i csplicing-macro
must return a list of forms (possibly empty).
The reader acts as
if it read each element of
the list itself without
the surrounding parentheses.
.Cc csingle-macro none none
This causes the reader to check the next character.
If it is a
.i cseparator
then this acts like a
.i cmacro .
Otherwise, it acts like a
.i ccharacter .
.Cc csingle-splicing-macro none none
This is triggered like a
.i csingle-macro ;
however, the result is spliced in like a
.i csplicing-macro .
.Cc cinfix-macro none none
This differs from a
.i cmacro
in that the macro function is passed a form representing what the reader
has read so far.
The result of the macro replaces what the reader had read so far.
.Cc csingle-infix-macro none none
This differs from the
.i cinfix-macro
in that the macro will only be triggered if the character following the
.i csingle-infix-macro
character is a
.i cseparator .
.Cc cillegal ^@-^G^N-^Z^\e-^_rubout ^@-^G^N-^Z^\e-^_rubout
These characters cause the reader to signal an error if read.
.sh +0 Syntax\ Classes
.pp
The readtable maps each character into a syntax class.
The syntax class contains three pieces of information:
the character class, whether this is a separator, and the escape
properties.
The first two properties are used by the reader, the last by
the printer (and
.i explode ).
The initial Lisp system has the following syntax classes defined.
The user may add syntax classes with
.i add-syntax-class .
For each syntax class, we list the properties of the class and
which characters have this syntax class by default.
More information about each syntax class can be found under the
description of the syntax class's character class.
.de Sy
.sp 1v
.(b
.tl '\fB\\$1\fP''raw readtable:\\$2'
.tl '\fI\\$4\fP''standard readtable:\\$3'
.tl '\fI\\$5\fP'''
.if \n(.$>5 .tl '\fI\\$6\fP'''
.)b
..
.pc
.Sy vcharacter A-Z\ a-z\ ^H\ !#$%&*,/:;<=>?@^_`{}~ A-Z\ a-z\ ^H\ !$%&*/:;<=>?@^_{}~ ccharacter
.pc %
.Sy vnumber 0-9 0-9 cnumber
.Sy vsign +- +- csign
.Sy vleft-paren ( ( cleft-paren escape-always separator
.Sy vright-paren ) ) cright-paren escape-always separator
.Sy vleft-bracket [ [ cleft-bracket escape-always separator
.Sy vright-bracket ] ] cright-bracket escape-always separator
.Sy vperiod . . cperiod escape-when-unique
.Sy vseparator ^I-^M\ esc\ space ^I-^M\ esc\ space cseparator escape-always separator
.Sy vsingle-quote \\' \\' csingle-quote escape-always separator
.Sy vsymbol-delimiter | | csymbol-delimiter escape-always
.Sy vescape \e \e cescape escape-always
.Sy vstring-delimiter """" """" cstring-delimiter escape-always
.Sy vsingle-character-symbol none none csingle-character-symbol separator
.Sy vmacro none `, cmacro escape-always separator
.Sy vsplicing-macro none #; csplicing-macro escape-always separator
.Sy vsingle-macro none none csingle-macro escape-when-unique
.Sy vsingle-splicing-macro none none csingle-splicing-macro escape-when-unique
.Sy vinfix-macro none none cinfix-macro escape-always separator
.Sy vsingle-infix-macro none none csingle-infix-macro escape-when-unique
.Sy villegal ^@-^G^N-^Z^\e-^_rubout ^@-^G^N-^Z^\e-^_rubout cillegal escape-always separator
.sh +0 Character\ Macros
.pp
Character macros are
user written functions which are executed during the reading process.
The value returned by a character macro may or may not be used by
the reader, depending on the type of macro and the value returned.
Character macros are always attached to a single character with
the
.i setsyntax
function.
.sh +1 Types
There are three types of character macros: normal, splicing and infix.
These types differ in the arguments they are given or in what is done
with the result they return.
.sh +1 Normal
.pp
A normal macro
is passed no arguments.
The value returned by a normal macro is simply used by
the reader as if it had read the value itself.
Here is an example of a macro which returns the abbreviation
for a given state.
.Eb
\-> \fI(de\kAfun stateabbrev nil
\h'|\nAu'(cdr (assq (read) '((california . ca) (pennsylvania . pa)))))\fP
stateabbrev
\-> \fI(setsyntax '\e! 'vmacro 'stateabbrev)\fP
t
\-> \fI'( ! california ! wyoming ! pennsylvania)\fP
(ca nil pa)
.Ee
Notice what happened to
\fI ! wyoming\fP.
Since it wasn't in the table, the associated function
returned nil.
The creator of the macro may have wanted to leave the
list alone in such a case, but couldn't with this
type of reader macro.
The splicing macro, described next, allows a character macro function
to return a value that is ignored.
.sh +0 Splicing
.pp
The value returned from a splicing macro must be a list or nil.
If the value is nil, then the value is ignored; otherwise the reader
acts as if it read each object in the list.
Usually the list only contains one element.
If the reader is reading at the top level (i.e. not collecting elements
of a list),
then it is illegal for a splicing macro to return more than one
element in the list.
The major advantage of a splicing macro over a normal macro is the
ability of the splicing macro to return nothing.
The comment character (usually ;) is a splicing macro bound to a
function which reads to the end of the line and always returns nil.
Here is the previous example written as a splicing macro.
.Eb
\-> \fI(de\kAfun stateabbrev nil
\h'|\nAu'(\kC(lam\kBbda (value)
\h'|\nBu'(cond \kA(value (list value))
\h'|\nAu'(t nil)))
\h'|\nCu'(cdr (assq (read) '((california . ca) (pennsylvania . pa))))))\fP
\-> \fI(setsyntax '! 'vsplicing-macro 'stateabbrev)\fP
\-> \fI'(!pennsylvania ! foo !california)\fP
(pa ca)
\-> \fI'!foo !bar !pennsylvania\fP
pa
\->
.Ee
.sh +0 Infix
.pp
Infix macros are passed a
.i tconc
structure representing what has been read so far.
Briefly, a
tconc
structure is a single list cell whose car points to
a list and whose cdr points to the last list cell in that list.
The interpretation by the reader of the value
returned by an infix macro depends on
whether the macro is called while the reader is constructing a
list or whether it is called at the top level of the reader.
If the macro is called while a list is
being constructed, then the value returned should be a tconc
structure.
The car of that structure replaces the list of elements that the
reader has been collecting.
If the macro is called at top level, then it will be passed the
value nil, and the value it returns should either be nil
or a tconc structure.
If the macro returns nil, then the value is ignored and the reader
continues to read.
If the macro returns a tconc structure of one element (i.e. whose car
is a list of one element), then that single element is returned
as the value of
.i read .
If the macro returns a tconc structure of more than one element,
then that list of elements is returned as the value of read.
.Eb
\-> \fI(de\kAfun plusop (x)
\h'|\nAu'(cond \kB((null x) (tconc nil '\e+))
\h'|\nBu'(t (lconc nil (list 'plus (caar x) (read))))))\fP

plusop
\-> \fI(setsyntax '\e+ 'vinfix-macro 'plusop)\fP
t
\-> \fI'(a + b)\fP
(plus a b)
\-> \fI'+\fP
|+|
\->
.Ee
.sh -1 Invocations
.pp
There are three different circumstances in which you would like
a macro function to be triggered.
.ip \fIAlways\ -\fP
Whenever the macro character is seen, the macro should be invoked.
This is accomplished by using the character classes
.i cmacro ,
.i csplicing-macro ,
or
.i cinfix-macro ,
and by using the
.i separator
property.
The syntax classes
.b vmacro ,
.b vsplicing-macro ,
and
.b vinfix-macro
are defined this way.
.ip \fIWhen\ first\ -\fP
The macro should only be triggered when the macro character is the first
character found after the scanning process.
A syntax class for a
.i when
.i first
macro would
be defined
using
.i cmacro ,
.i csplicing-macro ,
or
.i cinfix-macro
and not including the
.i separator
property.
.ip \fIWhen\ unique\ -\fP
The macro should only be triggered when the macro character is the only
character collected in the token collection
phase of the reader,
i.e. the macro character is preceded by zero or more
.i cseparator s
and followed by a
.i separator .
A syntax class for a
.i when
.i unique
macro would
be defined using
.i csingle-macro ,
.i csingle-splicing-macro ,
or
.i csingle-infix-macro
and not including the
.i separator
property.
The syntax classes so defined are
.b vsingle-macro ,
.b vsingle-splicing-macro ,
and
.b vsingle-infix-macro .
.sh -1 Functions
.Lf setsyntax 's_symbol\ 's_synclass\ ['ls_func]
.Wh
ls_func is the name of a function or a lambda body.
.Re
t
.Se
S_symbol should be a symbol whose print name is only one character.
The syntax class for
that character is
set to s_synclass in the current readtable.
If s_synclass is a class that requires a character macro, then
ls_func must be supplied.
.No
The symbolic syntax codes are new to Opus 38.
For compatibility, s_synclass can be one of the fixnum syntax codes
which appeared in older versions of the
.Fr
Manual.
This compatibility is only temporary: existing code which uses the
fixnum syntax codes should be converted.
.Lf getsyntax 's_symbol
.Re
the syntax class of the first character
of s_symbol's print name.
s_symbol's print name must be exactly one character long.
.No
This function is new to Opus 38.
It supersedes \fI(status\ syntax)\fP which no longer exists.
.Lf add-syntax-class 's_synclass\ 'l_properties
.Re
s_synclass
.Se
Defines the syntax class s_synclass to have properties l_properties.
The list l_properties should contain one of the character classes mentioned
above.
l_properties may contain one of the escape properties:
.i escape-always ,
.i escape-when-unique ,
or
.i escape-when-first .
l_properties may contain the
.i separator
property.
After a syntax class has been defined with
.i add-syntax-class ,
the
.i setsyntax
function can be used to give characters that syntax class.
.Eb
; Define a non-separating macro character.
; This type of macro character is used in UCI-Lisp, and
; it corresponds to a FIRST MACRO in Interlisp
\-> \fI(add-syntax-class 'vuci-macro '(cmacro escape-when-first))\fP
vuci-macro
\->
.Ee
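.pp
The interplay between the character class and the
separator property, described under Reader Operations and Invocations,
can be modelled outside Lisp.
What follows is a minimal sketch in Python, not the Franz Lisp implementation:
the names SyntaxClass, toy_readtable and tokens are hypothetical,
the table covers only a handful of characters,
and processing is reduced to returning the token text.

```python
from dataclasses import dataclass

# Character classes whose first character starts a multi-character token.
COLLECTING = {"ccharacter", "cnumber", "cperiod", "csign"}


@dataclass(frozen=True)
class SyntaxClass:
    """Hypothetical model of a syntax class (escape properties omitted)."""
    char_class: str          # e.g. "ccharacter", "cleft-paren"
    separator: bool = False  # the separator property


def toy_readtable():
    """Build a much-reduced table; an assumption, not the standard readtable."""
    table = {}
    for c in "abcdefghijklmnopqrstuvwxyz!":
        table[c] = SyntaxClass("ccharacter")
    for c in "0123456789":
        table[c] = SyntaxClass("cnumber")
    for c in "+-":
        table[c] = SyntaxClass("csign")
    table["."] = SyntaxClass("cperiod")
    for c in " \t\n":
        table[c] = SyntaxClass("cseparator", separator=True)
    table["("] = SyntaxClass("cleft-paren", separator=True)
    table[")"] = SyntaxClass("cright-paren", separator=True)
    return table


def tokens(text, table):
    """Split text into token strings using the scan/collect/process phases."""
    out = []
    i = 0
    while i < len(text):
        # Scanning: skip characters whose character class is cseparator.
        if table[text[i]].char_class == "cseparator":
            i += 1
            continue
        first = text[i]
        i += 1
        if table[first].char_class in COLLECTING:
            # Collecting: only the separator property stops the token.
            buf = [first]
            while i < len(text) and not table[text[i]].separator:
                buf.append(text[i])
                i += 1
            out.append("".join(buf))  # the stopping character is left unread
        else:
            # Any other class is processed immediately as a one-character token.
            out.append(first)
    return out
```

In this model, giving the character ! the class cmacro without the separator
property behaves like a "when first" macro: tokens("a!b !c", table) yields
a!b, ! and c, because the ! inside a!b does not stop collection.
With the separator property added it behaves like an "always" macro and the
same input yields a, !, b, ! and c.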