profiling
Matthew Fluet
Matthew Fluet <fluet@CS.Cornell.EDU>
Mon, 14 Jan 2002 18:12:07 -0500 (EST)
I checked in a small extension to the MLton structure, which adds a
structure Profile with the signature:
signature MLTON_PROFILE =
sig
(* a compile-time constant, like MLton.debug *)
val profile: bool
(* reset all the bin counters to 0 *)
val reset: unit -> unit
(* write out all of the bins to a mlmon.out format file,
* with the given file name
*)
val write: string -> unit
end
Now, I can write the following program:
val rec fib =
fn 0 => 0
| 1 => 1
| n => fib (n - 1) + fib (n - 2)
val rec f =
fn 0 => ()
| n => (fib 38; f (n-1))
fun tak (x,y,z) =
if not (y < x)
then z
else tak (tak (x - 1, y, z),
tak (y - 1, z, x),
tak (z - 1, x, y))
val rec g =
fn 0 => ()
| n => (tak (18,12,6); g (n-1))
val _ = MLton.Profile.write "mlmon.init.out"
val _ = MLton.Profile.reset ()
val _ = f 2
val _ = MLton.Profile.write "mlmon.fib.out"
val _ = MLton.Profile.reset ()
val _ = g 500
val _ = MLton.Profile.write "mlmon.tak.out"
val _ = MLton.Profile.reset ()
Compile it with -p (the implementations of MLton.Profile.{reset,write} are
predicated by MLton.Profile.profile, so when compiling without -p,
MLton.Profile.{reset,write} are nops), and run it. Then I can do the
following:
[fluet@lennon w]$ mlprof -x -d 1 w mlmon.init.out
0.0 seconds of CPU time
[fluet@lennon w]$ mlprof -x -d 1 -t 5 w mlmon.fib.out
14.80 seconds of CPU time
fib_0 100.00% (14.80s)
L_83 24.93% (3.69s)
L_85 20.00% (2.96s)
L_78 15.00% (2.22s)
fib_0 9.86% (1.46s)
L_80 7.09% (1.05s)
L_87 6.35% (0.94s)
L_86 5.54% (0.82s)
L_79 5.27% (0.78s)
[fluet@lennon w]$ mlprof -x -d 1 -t 5 w mlmon.tak.out
1.92 seconds of CPU time
tak_0 100.00% (1.92s)
L_75 12.50% (0.24s)
L_73 12.50% (0.24s)
L_71 11.98% (0.23s)
L_69 11.46% (0.22s)
loop_6 10.94% (0.21s)
L_68 10.42% (0.20s)
tak_0 9.38% (0.18s)
L_74 6.77% (0.13s)
L_76 5.73% (0.11s)
L_77 5.21% (0.10s)
[fluet@lennon w]$ mlprof -x -d 1 -t 5 w mlmon.fib.out mlmon.tak.out
16.72 seconds of CPU time
fib_0 88.52% (14.80s)
L_83 24.93% (3.69s)
L_85 20.00% (2.96s)
L_78 15.00% (2.22s)
fib_0 9.86% (1.46s)
L_80 7.09% (1.05s)
L_87 6.35% (0.94s)
L_86 5.54% (0.82s)
L_79 5.27% (0.78s)
tak_0 11.48% (1.92s)
L_75 12.50% (0.24s)
L_73 12.50% (0.24s)
L_71 11.98% (0.23s)
L_69 11.46% (0.22s)
loop_6 10.94% (0.21s)
L_68 10.42% (0.20s)
tak_0 9.38% (0.18s)
L_74 6.77% (0.13s)
L_76 5.73% (0.11s)
L_77 5.21% (0.10s)
Even though in this example the two profiling data are disjoint, there is
no problem combining mlmon files that include counts at the same address.
This is a prequel to getting specific profiling information on a self
compile to figure out why knownCase is > 6s on a self-compile.