How to make Standard ML of New Jersey run faster


1. Each compilation unit is compiled separately. No optimization
takes place across compilation-unit boundaries.
Example:

    fun f(x) = (x,x);
    fun g 0 = nil | g i = f i :: g(i-1);

This is two compilation units, whether typed at top level or loaded
from a file, because the function f is compiled at the first semicolon
and g is compiled at the next. The function g will run
significantly faster if any of the following is used instead:

    fun f(x) = (x,x)
    fun g 0 = nil | g i = f i :: g(i-1);

    local fun f(x) = (x,x);
     in fun g 0 = nil | g i = f i :: g(i-1)
    end;

    structure S = struct
        fun f(x) = (x,x);
        fun g 0 = nil | g i = f i :: g(i-1);
    end;

In either of the last two forms, of course, the semicolons are optional.

Moral of the story: use small compilation units while typing at
the interactive system to see how things work. Use larger
compilation units when compiling large programs. I recommend the
use of the module system, or of "let" and "local" declarations,
to bind things together in a well-structured way.
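
The examples in item 1 show "local" and a structure but not "let".
As a small sketch of the "let" form, the helper f can be bound inside
the expression that defines g, so that both fall in one compilation
unit. The layout is only illustrative.

```sml
(* f is bound inside the let, so it is compiled together with g. *)
val g =
  let fun f x = (x, x)
      fun g 0 = nil
        | g i = f i :: g (i - 1)
  in g
  end;
```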

Using signature constraints to minimize the number of things
exported from structures reduces memory usage and is simply clean style.
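
A minimal sketch of such a signature constraint, reusing f and g from
item 1. The names PAIRS and Pairs are hypothetical, chosen only for
illustration.

```sml
(* Only g is exported; the helper f stays private to the structure. *)
signature PAIRS =
sig
  val g : int -> (int * int) list
end

structure Pairs : PAIRS =
struct
  (* helper: visible to g in the same compilation unit, not exported *)
  fun f x = (x, x)
  fun g 0 = nil
    | g i = f i :: g (i - 1)
end
```

With this constraint, Pairs.g 3 evaluates to [(3,3),(2,2),(1,1)], while
a reference to Pairs.f from outside the structure is rejected by the
compiler because f is not in the signature.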

2. For the fanatic (these techniques are not guaranteed to work forever):

The initial environment (i.e. the List, Array, Ref, etc. structures)
is normally in a separate module from the user program. If you
would like a copy of these definitions in your program so that calls
to the pervasive functions have less overhead, textually insert
src/boot/fastlib.sml near the beginning of your own structure.
This only helps, of course, if fastlib.sml is put into the same
compilation unit as the functions calling it, using the module
system as described above.

You can nest structures. To get better performance, after you
have developed your program, nest the whole thing in one huge
structure, e.g.

    structure Whole : sig end = struct

        (* your program *)
    end

You can even put signatures and functors at top level inside such a
structure, although this is not "Standard" ML.


3. You can increase the level of optimization if you are willing to
wait a bit longer for compiles. To make things compile more slowly
but run faster, execute this before compiling your program:

    System.Control.CG.reducemore := 0;
    System.Control.CG.rounds := 10;
    System.Control.CG.bodysize := 20;

To make things compile faster but run slower, try this:

    System.Control.CG.reducemore := 10000;
    System.Control.CG.rounds := 0;
    System.Control.CG.bodysize := ~100;
    System.Control.CG.reduce := false;

4. You can measure the execution time of your programs using the
functions in System.Timer.

    (* in the initial environment,
       signature TIMER =
         sig
           datatype time = TIME of {sec : int, usec : int}
           type timer
           val start_timer : unit -> timer
           val check_timer : timer -> time
           val check_timer_gc: timer -> time
           val makestring : time -> string
           val add_time : time * time -> time
         end
       structure System.Timer : TIMER
    *)

    let val t = System.Timer.start_timer()
        val _ = run_my_program()
        val non_gc_time = System.Timer.check_timer t
        val gc_time = System.Timer.check_timer_gc t
        val total_time = System.Timer.add_time(non_gc_time,gc_time)
     in print(System.Timer.makestring total_time)
    end
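
The timing pattern above can be packaged for reuse. The following is
only a sketch: the functor name Timing and the function timeIt are
hypothetical and not part of SML/NJ, and the signature is copied from
the comment above so that the block stands alone. It is written as a
functor so that it depends only on the TIMER signature.

```sml
(* Copied from the TIMER signature quoted in item 4. *)
signature TIMER =
sig
  datatype time = TIME of {sec : int, usec : int}
  type timer
  val start_timer : unit -> timer
  val check_timer : timer -> time
  val check_timer_gc : timer -> time
  val makestring : time -> string
  val add_time : time * time -> time
end

(* Hypothetical helper, not part of SML/NJ. *)
functor Timing (T : TIMER) =
struct
  (* Apply f to x; return the result together with the total
     (non-GC plus GC) time reported by the timer. *)
  fun timeIt f x =
    let val t = T.start_timer ()
        val result = f x
        val non_gc_time = T.check_timer t
        val gc_time = T.check_timer_gc t
    in (result, T.add_time (non_gc_time, gc_time))
    end
end
```

Assuming System.Timer matches TIMER as the comment in item 4 states,
it might be used as:

    structure T = Timing(System.Timer)
    val (result, total_time) = T.timeIt run_my_program ()
    val _ = print(System.Timer.makestring total_time)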

5. You can also use the execution profiler, described in doc/profiling.