Annotation of researchv10no/cmd/sml/doc/refman/lex.tex, revision 1.1

1.1     ! root        1: \chapter{Lexical analysis}
        !             2: \section{Reserved words}
        !             3: The following are reserved words.   They may not be used as
        !             4: identifiers.  In this document the alphabetic reserved words are always
        !             5: shown in boldface.
        !             6: \begin{quote}
        !             7: \raggedright
        !             8: \tt
        !             9: abstraction abstype and andalso
        !            10: as case datatype else end exception do fn fun functor handle if in 
        !            11: infix infixr let local nonfix of op open overload 
        !            12: raise rec sharing sig signature
        !            13: struct structure then type val while with withtype orelse
        !            14: 
        !            15: \verb"{  }  [  ]  ,  ;  (  )  ->  *  |  :  ...  =  =>  #  _"
        !            16: \end{quote}
        !            17: \section{Special constants}
        !            18: An integer constant is any non-empty sequence of digits, possibly preceded
        !            19: by a negation symbol (\verb|~|).
        !            20: 
        !            21: A real constant is an integer constant, possibly followed by a point (.)
        !            22: and one or more digits, possibly followed by an exponent symbol (E) and
        !            23: an integer constant; at least one of the optional parts must occur,
        !            24: hence no integer constant is a real constant.  Examples: \verb|0.7| ,
        !            25: \verb|~3.32E5| , \verb|3E~7| .  Non-examples: \verb|23| , \verb|.3| ,
        !            26: \verb|4.E5| , \verb|1E2.0| .
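
The two definitions above can be sketched as regular expressions. This is a minimal Python illustration rather than part of the reference manual; the names \verb|INT_RE|, \verb|REAL_RE|, \verb|is_int| and \verb|is_real| are hypothetical.

```python
import re

# Integer constant: optional negation symbol (~), then one or more digits.
INT_RE = re.compile(r"~?[0-9]+")

# Real constant: an integer constant followed by a fraction part, an
# exponent part, or both. At least one optional part must occur, so a
# bare integer constant never matches.
REAL_RE = re.compile(r"~?[0-9]+(?:\.[0-9]+(?:E~?[0-9]+)?|E~?[0-9]+)")


def is_int(text: str) -> bool:
    """Return True if the whole text is an integer constant."""
    return INT_RE.fullmatch(text) is not None


def is_real(text: str) -> bool:
    """Return True if the whole text is a real constant."""
    return REAL_RE.fullmatch(text) is not None
```

Under this sketch the examples in the text are accepted by \verb|is_real| and the non-examples are rejected.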
        !            27: 
        !            28: A string constant is a sequence, between quotes (\verb|"|), of zero or more
        !            29: printable characters, spaces, or escape sequences.  Each escape sequence
        !            30: is introduced by the escape character \verb|\|, and stands for a character
        !            31: sequence.  The allowed escape sequences are as follows (all other
        !            32: uses of \verb|\| being incorrect):
        !            33: \begin{tabular}{l p{3.9in}}
        !            34: \verb|\n| & A single character interpreted by the system as end-of-line.\\
        !            35: \verb|\t| & Tab. \\
        !            36: \verb|\^c| & The control character c, for any appropriate c.\\
        !            37: \verb|\ddd| &  The single character with ASCII code ddd (3 decimal digits).\\
        !            38: \verb|\"| & The double-quote character (\verb'"'). \\
        !            39: \verb|\\| &  The backslash character (\verb"\").\\
        !            40: \verb|\f___f\| & This sequence is ignored, where f\_\_\_f stands for a
        !            41: sequence of one or more formatting characters (a subset of the
        !            42: non-printable characters including at least space, tab, newline,
        !            43: formfeed).  This allows one to write long strings on more than one
        !            44: line, by writing \verb"\" at the end of one line and at the start of the
        !            45: next.
        !            46: \end{tabular}
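
The escape table above can be sketched as a small decoder. This is an illustrative Python sketch under stated assumptions, not the manual's own algorithm: the name \verb|decode| is hypothetical, end-of-line is taken to be the newline character, the "appropriate" control characters are assumed to be those written with \verb|@| through \verb|_| (ASCII 64 to 95), and the formatting characters are assumed to be exactly space, tab, newline and formfeed.

```python
FORMATTING = " \t\n\f"  # assumed set of formatting characters
DIGITS = "0123456789"


def decode(body: str) -> str:
    """Decode the text between the quotes of a string constant."""
    out = []
    i = 0
    while i < len(body):
        if body[i] != "\\":
            out.append(body[i])
            i += 1
            continue
        nxt = body[i + 1:i + 2]  # character after the escape character
        if nxt == "n":
            out.append("\n")  # assumption: end-of-line is newline
            i += 2
        elif nxt == "t":
            out.append("\t")
            i += 2
        elif nxt == '"' or nxt == "\\":
            out.append(nxt)
            i += 2
        elif nxt == "^":
            c = body[i + 2:i + 3]
            if not ("@" <= c <= "_"):
                raise ValueError("bad control character")
            out.append(chr(ord(c) - 64))
            i += 3
        elif nxt != "" and nxt in DIGITS:
            digits = body[i + 1:i + 4]
            if len(digits) != 3 or any(d not in DIGITS for d in digits):
                raise ValueError("expected three decimal digits")
            out.append(chr(int(digits)))
            i += 4
        elif nxt != "" and nxt in FORMATTING:
            # Skip the run of formatting characters up to the closing backslash.
            j = i + 1
            while j < len(body) and body[j] in FORMATTING:
                j += 1
            if body[j:j + 1] != "\\":
                raise ValueError("unterminated formatting sequence")
            i = j + 1
        else:
            raise ValueError("incorrect use of the escape character")
    return "".join(out)
```

Any use of the escape character not listed in the table raises \verb|ValueError| in this sketch, mirroring the statement that all other uses are incorrect.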
        !            47: 
        !            48: \section{Identifiers}
        !            49: 
        !            50: An identifier is either {\em alphanumeric}: any sequence of letters,
        !            51: digits, primes (\verb"'"), and underbars (\verb"_") starting with a letter or a
        !            52: prime, or {\em symbolic}: any sequence of the following symbols
        !            53: \begin{quote}
        !            54: \verb"! % & $ + - / : < = > ? @ \ ~ ^ | # * `"
        !            55: \end{quote}
        !            56: In either case, however, reserved words are excluded.  This means
        !            57: that for example \verb"_" and \verb"|" are not identifiers, but
        !            58: \verb"also_ran" and \verb"|=|" are identifiers.
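
The identifier rule can be sketched as a small classifier. This Python sketch is illustrative only: the names \verb|RESERVED|, \verb|ALNUM_RE|, \verb|SYMBOLIC_RE| and \verb|classify| are hypothetical, the reserved-word set is copied from the list in this chapter, and the caret is assumed to be one of the symbolic characters.

```python
import re

# Reserved words as listed in this chapter, alphabetic and symbolic.
RESERVED = frozenset(
    "abstraction abstype and andalso as case datatype else end exception do "
    "fn fun functor handle if in infix infixr let local nonfix of op open "
    "overload raise rec sharing sig signature struct structure then type val "
    "while with withtype orelse "
    "{ } [ ] , ; ( ) -> * | : ... = => # _".split()
)

# Alphanumeric: starts with a letter or a prime, then letters, digits,
# primes and underbars.
ALNUM_RE = re.compile(r"[A-Za-z'][A-Za-z0-9'_]*")

# Symbolic: any non-empty sequence of the symbolic characters.
SYMBOLIC_RE = re.compile(r"[!%&$+\-/:<=>?@\\~^|#*`]+")


def classify(text):
    """Return 'alphanumeric', 'symbolic', or None if text is not an identifier."""
    if text in RESERVED:
        return None
    if ALNUM_RE.fullmatch(text):
        return "alphanumeric"
    if SYMBOLIC_RE.fullmatch(text):
        return "symbolic"
    return None
```

With these definitions \verb|classify| rejects the underbar and the vertical bar on their own, since both are reserved, while accepting the two identifiers given as examples in the text.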
        !            59: 
        !            60: Identifiers are used to stand for 9 different classes of objects,
        !            61: which occupy 6 different name spaces, as follows:
        !            62: \begin{enumerate}
        !            63: \item value variables ({\it var}), value constructors ({\it con}), \\
        !            64: exception constructors ({\it exncon})
        !            65: \item type variables ({\it tyvar})
        !            66: \item type constructors ({\it tycon})
        !            67: \item record labels ({\it lab})
        !            68: \item structures ({\it str}), functors ({\it fct})
        !            69: \item signatures ({\it sgn})
        !            70: \end{enumerate}
        !            71: Thus, within one scope an identifier cannot stand for both a
        !            72: value variable and a constructor, but an identifier can
        !            73: be bound simultaneously to a type constructor and a signature.
        !            74: 
        !            75: To remove some ambiguity, it is recommended that constructors start
        !            76: with an uppercase letter, and variables start with a lowercase
        !            77: letter; but this is a convention, not an enforced rule  (it is
        !            78: confounded, for example, by symbolic identifiers).
        !            79: 
        !            80: A type variable ({\it tyvar}) may be any alphanumeric identifier starting
        !            81: with a prime.  The other eight classes ({\it var, con, tycon, ...})
        !            82: are represented by identifiers not starting with a prime.  The class
        !            83: {\it lab} is also extended to include the numeric labels 1, 2, 3, ... .
        !            84: 
        !            85: Type variables are therefore disjoint from the other classes.
        !            86: Otherwise, the class of an occurrence of an identifier is determined
        !            87: from context.
        !            88: 
        !            89: Spaces or parentheses are sometimes needed 
        !            90: to separate symbolic identifiers and reserved words.  Two examples are
        !            91: 
        !            92: \begin{tabular}{c c c c c}
        !            93: \verb"a:= !b" &or& \verb"a:=(!b)" &but not& \verb"a:=!b"\\
        !            94: \verb"~ :int->int" &or& \verb"(~):int->int" &but not& \verb"~:int->int"
        !            95: \end{tabular}
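
The need for separators follows if the lexer is assumed to take the longest possible run of symbolic characters as one token. The Python sketch below illustrates that assumption; \verb|TOKEN_RE| and \verb|tokens| are hypothetical names, and the token classes are simplified (no strings, comments or real constants).

```python
import re

TOKEN_RE = re.compile(
    r"[A-Za-z'][A-Za-z0-9'_]*"      # alphanumeric identifier or reserved word
    r"|[!%&$+\-/:<=>?@\\~^|#*`]+"   # longest run of symbolic characters
    r"|[0-9]+"                      # digit sequence
    r"|[(),.;\[\]{}]"               # punctuation, always a token by itself
)


def tokens(text):
    """Split text into tokens, skipping anything that is not a token."""
    return TOKEN_RE.findall(text)


# The symbolic run ":=!" is read as a single token, so the assignment
# and the dereference are not recognised separately.
print(tokens("a:=!b"))    # ['a', ':=!', 'b']
print(tokens("a:= !b"))   # ['a', ':=', '!', 'b']
print(tokens("a:=(!b)"))  # ['a', ':=', '(', '!', 'b', ')']
```

The same effect explains the second row of the table: without a space or parentheses, the negation symbol and the colon form one symbolic token.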
        !            96: 
        !            97: These punctuation characters cannot be constituents of identifiers
        !            98: and therefore never need spaces around them:
        !            99: \begin{quotation}
        !           100: \verb| " ( ) , . ; [ ] { } |
        !           101: \end{quotation}
        !           102: 
        !           103: \section{Comments}
        !           104: A comment is a character sequence (outside of a string)
        !           105: within comment brackets \verb|(*| and \verb|*)|, in which comment
        !           106: brackets are properly nested.
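
Proper nesting can be checked with a depth counter. The following Python sketch removes comments from a source text; \verb|strip_comments| is a hypothetical name, and the sketch makes simplifying assumptions: inside a string it treats a backslash and the following character as one unit, so the formatting-character escape is not handled specially, and a quote inside a comment does not start a string.

```python
def strip_comments(source: str) -> str:
    """Return source with all (properly nested) comments removed."""
    out = []
    depth = 0          # current comment nesting depth
    in_string = False  # True while inside a string constant
    i = 0
    while i < len(source):
        two = source[i:i + 2]
        if in_string:
            if source[i] == "\\":
                # Copy the escape character with the next character,
                # so that an escaped quote does not end the string.
                out.append(two)
                i += 2
                continue
            if source[i] == '"':
                in_string = False
            out.append(source[i])
            i += 1
        elif two == "(*":
            depth += 1
            i += 2
        elif two == "*)" and depth > 0:
            depth -= 1
            i += 2
        elif depth > 0:
            i += 1  # discard comment text
        else:
            if source[i] == '"':
                in_string = True
            out.append(source[i])
            i += 1
    if depth > 0:
        raise ValueError("unterminated comment")
    return "".join(out)
```

A comment bracket that appears inside a string constant is left untouched, in line with the restriction that comments occur outside of strings.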
        !           107: 
        !           108: \section{The bare syntax}
        !           109: The Standard ML bare language is obtained by stripping the full
        !           110: language of any {\em derived} forms (those that may be defined in
        !           111: terms of other constructs in the language), and of any constructs
        !           112: related to the module system.  The bare language will be explained
        !           113: in Chapters \ref{eval} and \ref{types},
        !           114: and successive chapters describe augmentations
        !           115: of it that yield the full language.
        !           116: 
        !           117: Figure~\ref{bare} shows the syntax of the bare language.  The notation
        !           118: \begin{quotation}
        !           119: phrase x \rep{k} x phrase
        !           120: \end{quotation}
        !           121: indicates the repetition of the {\em phrase} at least $k$  times,
        !           122: separated by the punctuation character $x$.
