The following note by Andrew Appel points out a number of serious problems
with the standard environment as defined in "Definition of Standard ML,
Version 2", and proposes changes that would correct these problems. These
problems with the standard environment have come to our attention through
using and implementing the language. Unfortunately, the environment
described in the Definition was (in our opinion, prematurely) frozen before
the proposal was tested against "reality", but we hope that there is still
enough flexibility within the ML community to allow us to make corrections to
deal with these problems. We hope that this note will bring these issues to
the attention of the community. [DBM]


PROPOSED CHANGES

Andrew Appel proposes the following changes to the initial static environment
in the "official" Definition of Standard ML. These have not all been
implemented, but we would like to implement them if no one objects.

1. input

input(f,n) returns a string of length k <= n

Current semantics: if k<n then end-of-stream has been reached.
New semantics:
    If no characters are available, input blocks (as it does now).
    If j>0 is the number of characters available, then k = min(j,n).
    If (and only if) end of stream is reached, the empty string is returned.

I suggest this change for three reasons:

    A. this primitive is much closer to the one provided by the operating system
    B. the old primitive can be defined in terms of this one, but not the reverse
    C. users often need this primitive and not the old one

The old definition of input can be implemented using the new definition:

    fun old_input(f,n) =
        case input(f,n)
          of "" => ""
           | s => if size(s) < n then s ^ old_input(f,n-size(s)) else s
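
As a rough illustration of the proposed semantics, the sketch below models a
stream as a mutable list of chunks, where each chunk stands for the data that
happens to be available at one read. The stream type and the name new_input
are assumptions made only for this example (they are not part of any
environment), n is assumed to be positive, and chunks are assumed non-empty.

```sml
(* Hypothetical model: each list element is the data "available" at one read. *)
type stream = string list ref

(* Proposed semantics: return at most n of the available characters,
   and "" only when the stream is exhausted. Assumes n > 0. *)
fun new_input (f : stream, n : int) : string =
  case !f of
      [] => ""                                  (* end of stream *)
    | s :: rest =>
        if size s <= n
        then (f := rest; s)                     (* hand over the whole chunk *)
        else (f := String.extract (s, n, NONE) :: rest;
              String.substring (s, 0, n))       (* split the chunk at n *)

(* The old semantics, built on top of the new primitive as in the note. *)
fun old_input (f, n) =
  case new_input (f, n) of
      "" => ""
    | s => if size s < n then s ^ old_input (f, n - size s) else s
```

With a stream holding the chunks "ab", "cde" and "f", a call to new_input
asking for 4 characters returns only "ab", whereas old_input keeps reading
until it has 4 characters or the stream ends.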


2. Inequality operators for strings

The operators < <= > >= should be supported for strings. Even though
they could be defined by users using the existing string primitives, it
would be impossible for ordinary users to get them overloaded properly
with the integer and real comparison operators. There are many functions
not in the standard that I would like in my environment, but I can simply
define them; I am unable to define new overloadings, however.
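
A user-level definition might look like the following sketch. It is written
against the later SML Basis Library (where String.explode yields a char list),
which is an assumption of this example, and the name string_less is
hypothetical. The point it illustrates is that the function needs its own
name: a user cannot add it to the overloading of < for int and real.

```sml
(* Lexicographic "less than" on strings, compared by character code.
   string_less is a hypothetical name; it is NOT an overloading of <. *)
fun string_less (s : string, t : string) : bool =
  let
    fun less ([], []) = false            (* equal strings *)
      | less ([], _ :: _) = true         (* proper prefix sorts first *)
      | less (_ :: _, []) = false
      | less (c :: cs, d :: ds) =
          Char.ord c < Char.ord d
          orelse (c = d andalso less (cs, ds))
  in
    less (String.explode s, String.explode t)
  end
```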


3. Arithmetic exceptions

The current set of arithmetic exceptions is unrealistic because it does
not correspond with what is available in the hardware. To correctly
implement the current standard would add a very substantial overhead
to each execution of an arithmetic operator. Therefore, I propose:

    exception Div       for integer div and mod with a divisor of 0
    exception Overflow  for all integer operators with an out-of-range result

    exception Real of string  for all real operators with an out-of-range or
                              other error result

For floating-point division by 0, I propose the use of the Div exception,
although a separate exception could be defined for this if anyone cares.

The exceptions Floor and Sqrt and Exp and Ln can be left as is.
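
Under this proposal a program has only two integer exceptions to handle. The
sketch below assumes an environment in which Div and Overflow are available
at top level with the meanings proposed above; safe_div is a hypothetical
helper used only for illustration.

```sml
(* Returns NONE instead of raising when integer division cannot produce
   a result. Assumes Div and Overflow as proposed in this section. *)
fun safe_div (a : int, b : int) : int option =
  SOME (a div b)
  handle Div => NONE           (* divisor was 0 *)
       | Overflow => NONE      (* result out of range *)
```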


4. div and mod

The current language definition has perfectly reasonable definitions of
div and mod. The problem is that no machine supports these definitions;
all that is supported is an integer divide (which I'll call div' )
that always rounds towards zero. From this it is easy to synthesize modulus:

    fun a mod' b = a - b * (a div' b)

These div and mod operators are not the ones in the language definition.
However, it is possible to implement div and mod as defined in the standard:

    fun a div b = if a<0 then ... else if b<0 then ... else a div' b
    fun a mod b = a - b * (a div b)

This is, of course, very slow. However, any compiler that implements
these directly will have to generate exactly the same (slow) code!
The poor user who just wants to do div and mod on positive integers will pay
a penalty. And consider the hard-luck case who actually needs rounding
towards zero; he would have to implement div'' and mod'' as:

    fun a div'' b = if a<0 then ... else if b<0 then ... else a div b

That is, he'll have to insert tests that undo the test that the
standard-library functions are doing; no wonder he'll think that functional
languages are slow.

Therefore, I propose:

    div always rounds towards zero
    mod is just  fun a mod b = a - b * (a div b)

If a user wants the current versions of div and mod, he can implement them;
and his implementation will be no worse in performance
than what happens now in the standard library functions!

And remember, I don't make this proposal because it's more elegant than
the current definition, just because it's what the machines actually do,
and users can easily synthesize the functions they really want from
the primitives.
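
One way a user might synthesize the Definition-style operators from
round-towards-zero primitives is sketched below. It assumes that the
Definition's div rounds towards negative infinity, and it uses Int.quot and
Int.rem from the later SML Basis Library to stand in for div' and mod'; the
names floor_div and floor_mod are hypothetical.

```sml
(* Int.quot and Int.rem round towards zero, playing the role of div' and mod'. *)
fun floor_div (a : int, b : int) : int =
  let
    val q = Int.quot (a, b)
    val r = Int.rem (a, b)
  in
    (* A nonzero remainder whose sign differs from the divisor's means the
       truncated quotient is one larger than the floor of the true quotient. *)
    if r <> 0 andalso ((r < 0) <> (b < 0)) then q - 1 else q
  end

(* Same identity as in the note: mod is derived from the matching div. *)
fun floor_mod (a : int, b : int) : int = a - b * floor_div (a, b)
```

Only one extra comparison is needed when the operands have the same sign, but
the test is paid on every call, which is the cost this section objects to.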


5. Interrupt

Consider the following:

    fun f() = (process(input(std_in, 10)); f()) handle Interrupt => f()

This is intended to be an Interrupt-proof loop (just like the "toplevel"
of an interactive ML system). However, if two interrupts arrive in very
close succession, then the second will arrive in the exception handler
and will cause execution to terminate. The only safe way to handle this
is to disable the interrupt button as soon as an Interrupt arrives,
with an explicit re-enabling of interrupts at the discretion of the program.
This requires the function:

    enable_interrupt : unit -> unit

with (just for symmetry) a corresponding

    disable_interrupt : unit -> unit

Then the function f() above can be written as

    fun f() = (enable_interrupt();
               process(input(std_in, 10)); f()) handle Interrupt => f()


6. Arrays

I get tired of people telling me "ML doesn't have arrays, so I can't do X".
Then I have to explain that every ML compiler has arrays, even though
the language definition doesn't. Perhaps it would be simpler to put them
in the language definition.


7. Io exception

The current Io exception carries a string approximately of the form
"Cannot open s" where s is a filename. This is objectionable for two reasons.

First, it's not possible to pattern-match on substrings; if the strings
are to be standardized, a datatype should be used.

Second, there's no indication of why a file cannot be opened (or written,
read, etc.). Most operating systems are perfectly happy to provide a string
explaining what failed, e.g. "No such file or directory" or
"Interrupted system call". Therefore, I propose something like the following:

    exception Io of {operation : string, filename : string, reason : string}

where operation is one of "open_in", "input", etc., filename is the
name of the stream (given to open_in or open_out) and reason is an
operating-system dependent explanation of what happened.

Now it's much easier to pattern-match on Io failures, and the reasons for
failures are explained.
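
A handler written against the proposed exception might look like the
following sketch. The exception is declared locally so that the example is
self-contained; the describe function and its message wording are
hypothetical.

```sml
(* The record-carrying exception as proposed above, declared here only so
   that the example stands alone. *)
exception Io of {operation : string, filename : string, reason : string}

(* Match on the operation field directly instead of parsing a message. *)
fun describe (Io {operation = "open_in", filename, reason}) =
      "could not open " ^ filename ^ " for reading: " ^ reason
  | describe (Io {operation, filename, reason}) =
      operation ^ " failed on " ^ filename ^ ": " ^ reason
  | describe _ = "some other exception"
```

A call site would then read along the lines of
(work ()) handle e => print (describe e), where work is whatever I/O the
program performs.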


8. Curried input and output

All of the first 7 proposals are in some sense fundamental; there's no way
that a user can get the right effect by defining functions in his own
environment. Proposal #8 is just cosmetic: we propose that the
input and output functions be curried. We circulated this proposal about
a year ago and got no response. Is anyone listening out there?