[MLton-devel] nucleic benchmark
Matthew Fluet
fluet@CS.Cornell.EDU
Thu, 7 Nov 2002 11:52:12 -0500 (EST)
> MLton0 -- mlton.cvs.HEAD -native true
> MLton1 -- mlton.cvs.HEAD -native false
> SML/NJ -- SML/NJ
> run time ratio
> benchmark  MLton1  SML/NJ
> nucleic      1.23    0.61
I took a very brief look at a time profile for nucleic. As expected,
the lion's share of the time is spent in floating-point intensive blocks
(>20 f.p. primitive ops). I kept the assembly for both the native codegen
(.S files) and the C codegen (.s file). Interestingly, while gcc emits
exactly the same number of arithmetic instructions for the "real" work,
it is significantly better at managing the f.p. register stack and
reducing memory traffic.
[fluet@lennon temp]$ grep "\(fmul\)\|\(fadd\)\|\(fsub\)\|\(fdiv\)" *.S | wc -l
207
[fluet@lennon temp]$ grep "\(fmul\)\|\(fadd\)\|\(fsub\)\|\(fdiv\)" *.s | wc -l
207
[fluet@lennon temp]$ grep "\(fld\)\|\(fst\)" *.S | wc -l
12825
[fluet@lennon temp]$ grep "\(fld\)\|\(fst\)" *.s | wc -l
1007
[fluet@lennon temp]$ grep "\(fxch\)" *.S | wc -l
6612
[fluet@lennon temp]$ grep "\(fxch\)" *.s | wc -l
290
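
The counts above can be collected per codegen with a small helper. This is
only a sketch, assuming a POSIX shell and a grep that supports -E; the
function names count_insns and summarize are made up for illustration and
are not part of MLton. The -E patterns are meant to match the same lines as
the grouped \(...\)\|\(...\) patterns used above.

```shell
# count_insns PATTERN FILE...
# Print the number of lines, summed over all FILEs, that match the
# extended regular expression PATTERN (like `grep PATTERN FILE... | wc -l`).
count_insns() {
    pattern=$1
    shift
    # grep -c exits non-zero when the count is 0; the count is still printed.
    cat "$@" | grep -c -E "$pattern" || :
}

# summarize LABEL FILE...
# Print one line with the three instruction-class counts for FILEs:
# arithmetic ops, loads/stores, and register-stack exchanges.
summarize() {
    label=$1
    shift
    printf '%s arith=%s ldst=%s fxch=%s\n' "$label" \
        "$(count_insns 'fmul|fadd|fsub|fdiv' "$@")" \
        "$(count_insns 'fld|fst' "$@")" \
        "$(count_insns 'fxch' "$@")"
}
```

With the assembly files in the current directory, `summarize native *.S`
and `summarize C *.s` would each print one line per codegen. Like the
original commands, this counts matching lines rather than executed
instructions, so it says nothing about how hot each instruction is.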
One thing to note is that any move of a floating-point value uses the
floating-point registers, so just copying tuple elements from one tuple to
another requires bouncing through a float reg. Back in Jan. 2002, I
looked into changing some of those mem-mem moves to use integer registers
instead; the results went both ways -- nucleic sped up by 8%, mandelbrot
slowed down by 13%, and everything else was pretty minor, so the change
never stayed in the code.
_______________________________________________
MLton-devel mailing list
MLton-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mlton-devel