\chapter{Lexical analysis}
\section{Reserved words}
The following are reserved words. They may not be used as
identifiers. In this document the alphabetic reserved words are always
shown in boldface.
\begin{quote}
\raggedright
\tt
abstraction abstype and andalso
as case datatype else end exception do fn fun functor handle if in
infix infixr let local nonfix of op open overload
raise rec sharing sig signature
struct structure then type val while with withtype orelse

\verb"{ } [ ] , ; ( ) -> * | : ... = => # _"
\end{quote}
\section{Special constants}
An integer constant is any non-empty sequence of digits, possibly preceded
by a negation symbol (\verb|~|).

A real constant is an integer constant, possibly followed by a point (.)
and one or more digits, possibly followed by an exponent symbol (E) and
an integer constant; at least one of the optional parts must occur,
hence no integer constant is a real constant. Examples: \verb|0.7|,
\verb|~3.32E5|, \verb|3E~7|. Non-examples: \verb|23|, \verb|.3|,
\verb|4.E5|, \verb|1E2.0|.

A string constant is a sequence, between quotes (\verb|"|), of zero or more
printable characters, spaces, or escape sequences. Each escape sequence
is introduced by the escape character \verb|\|, and stands for a character
sequence. The allowed escape sequences are as follows (all other
uses of \verb|\| being incorrect):
\begin{tabular}{l p{3.9in}}
\verb|\n| & A single character interpreted by the system as end-of-line.\\
\verb|\t| & Tab. \\
\verb|\^c| & The control character c, for any appropriate c.\\
\verb|\ddd| & The single character with ASCII code ddd (3 decimal digits).\\
\verb|\"| & The double-quote character (\verb'"'). \\
\verb|\\| & The backslash character (\verb"\").\\
\verb|\f___f\| & This sequence is ignored, where f\_\_\_f stands for a
sequence of one or more formatting characters (a subset of the
non-printable characters including at least space, tab, newline,
formfeed). This allows one to write long strings on more than one
line, by writing \verb"\" at the end of one line and at the start of the
next.
\end{tabular}

\section{Identifiers}

An identifier is either {\em alphanumeric}: any sequence of letters,
digits, primes (\verb"'"), and underbars (\verb"_") starting with a letter or a
prime, or {\em symbolic}: any sequence of the following symbols
\begin{quote}
\verb"! % & $ + - / : < = > ? @ \ ~ \^ | # * `"
\end{quote}
In either case, however, reserved words are excluded. This means
that for example \verb"_" and \verb"|" are not identifiers, but
\verb"also_ran" and \verb"|=|" are identifiers.

Identifiers are used to stand for 9 different classes of objects,
which occupy 6 different name spaces, as follows:
\begin{enumerate}
\item value variables ({\it var}), value constructors ({\it con}), \\
exception constructors ({\it exncon})
\item type variables ({\it tyvar})
\item type constructors ({\it tycon})
\item record labels ({\it lab})
\item structures ({\it str}), functors ({\it fct})
\item signatures ({\it sgn})
\end{enumerate}
Thus, an identifier could not in the same scope stand for both a
value variable and a constructor, but an identifier can
be bound simultaneously to a type constructor and a signature.

To remove some ambiguity, it is recommended that constructors start
with an uppercase letter, and variables start with a lowercase
letter; but this is a convention, not an enforced rule (it is
confounded, for example, by symbolic identifiers).

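
As a rough illustration of the identifier rules stated earlier in this section, the following Python sketch classifies a single token as reserved, alphanumeric, symbolic, or not an identifier. It is not part of the language definition. The names \verb"classify", \verb"RESERVED_ALPHA", \verb"RESERVED_SYMBOLIC" and \verb"SYMBOLIC_CHARS" are hypothetical, ``letters'' are assumed to be ASCII letters, and the caret entry in the symbol list is assumed to mean the plain character \verb"^".

```python
import re

# Alphabetic reserved words, copied from the "Reserved words" section.
RESERVED_ALPHA = set(
    """abstraction abstype and andalso as case datatype else end exception
    do fn fun functor handle if in infix infixr let local nonfix of op open
    overload raise rec sharing sig signature struct structure then type val
    while with withtype orelse""".split()
)

# Reserved symbols from the same section (brackets and separators omitted,
# since they can never form part of an identifier anyway).
RESERVED_SYMBOLIC = {"->", "*", "|", ":", "...", "=", "=>", "#", "_"}

# Assumption: the caret entry in the source's symbol list is the plain "^".
SYMBOLIC_CHARS = set("!%&$+-/:<=>?@\\~^|#*`")

# Assumption: "letters" means ASCII letters only.
ALPHANUMERIC = re.compile(r"[A-Za-z'][A-Za-z0-9'_]*")


def classify(token):
    """Classify one token according to the identifier rules (a sketch)."""
    if token in RESERVED_ALPHA or token in RESERVED_SYMBOLIC:
        return "reserved"
    if ALPHANUMERIC.fullmatch(token):
        return "alphanumeric"
    if token and all(ch in SYMBOLIC_CHARS for ch in token):
        return "symbolic"
    return "not an identifier"
```

For instance, \verb"classify" reports \verb"also_ran" as alphanumeric and \verb"|=|" as symbolic, while \verb"_" and \verb"|" are rejected as reserved, in line with the examples in the text. The sketch inspects one token at a time and does not attempt to split an input stream into tokens.
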

A type variable ({\it tyvar}) may be any alphanumeric identifier starting
with a prime. The other eight classes ({\it var, con, tycon, ...})
are represented by identifiers not starting with a prime. The class
lab is also extended to include the numeric labels 1, 2, 3, ... .

Type variables are therefore disjoint from the other classes.
Otherwise, the class of an occurrence of an identifier is determined
from context.

Spaces or parentheses are sometimes needed
to separate symbolic identifiers and reserved words. Two examples are

\begin{tabular}{c c c c c}
\verb"a:= !b" &or& \verb"a:=(!b)" &but not& \verb"a:=!b"\\
\verb"~ :int->int" &or& \verb"(~):int->int" &but not& \verb"~:int->int"
\end{tabular}

These punctuation characters cannot be constituents of identifiers
and therefore never need spaces around them:
\begin{quotation}
\verb| " ( ) , . ; [ ] { } |
\end{quotation}

\section{Comments}
A comment is a character sequence (outside of a string)
within comment brackets (* *) in which comment brackets are properly
nested.

\section{The bare syntax}
The Standard ML bare language is obtained by stripping the full
language of any {\em derived} forms (those that may be defined in
terms of other constructs in the language), and of any constructs
related to the module system. The bare language will be explained
in Chapters \ref{eval} and \ref{types},
and successive chapters describe augmentations
of it that yield the full language.

Figure~\ref{bare} shows the syntax of the bare language. The notation
\begin{quotation}
phrase x \rep{k} x phrase
\end{quotation}
indicates the repetition of the {\em phrase} at least $k$ times,
separated by the punctuation character $x$.
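
The repetition notation just described can be illustrated with a small Python sketch, which is not part of the definition. The function \verb"is_repetition" and the example predicate \verb"is_word" are hypothetical names. The sketch assumes that the separator never occurs inside a phrase, so that a plain split is enough; a real parser for the bare syntax could not make that assumption.

```python
def is_repetition(text, sep, k, is_phrase):
    """Check that text consists of at least k phrases separated by sep.

    Hypothetical helper: splitting on sep is only valid under the
    assumption that sep cannot occur inside a phrase.
    """
    stripped = text.strip()
    # Empty input stands for zero repetitions of the phrase.
    parts = [p.strip() for p in stripped.split(sep)] if stripped else []
    return len(parts) >= k and all(is_phrase(p) for p in parts)


def is_word(s):
    """Example phrase class (an assumption): a non-empty run of letters."""
    return s.isalpha()
```

With $k = 2$ and a comma as the separator, the input \verb"a, b, c" is accepted because it contains three phrases, whereas the single phrase \verb"a" is rejected.
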