Some results
Matthew Fluet
fluet@research.nj.nec.com
Wed, 26 Jul 2000 18:26:49 -0400 (EDT)
> > Last time I started a self-compile from SML/NJ it was up over
> > half an hour before I killed it off.
>
> In what pass was it? It definitely shouldn't take this long to
> generate C. You should make sure the flags are set as in the script
> below or as in the Makefile. For a simple test, from within the src/
> directory, you should be able to do "make nj-mlton && make".
I think it's just that this machine is a 200MHz Pentium Pro. It
took about 45 minutes for nj-mlton to create the mlton.c file.
I went ahead and "inlined" the Thread_switchTo macro into assembly and
eliminated the GC_switchToThread function and Thread_switchTo1 macro.
Trying out the thread-switch.sml program, the inlined x86-codegen
version is much faster than before (that was a given), but it's still
not quite as fast as the inlined version for the c-codegen, either with
or without global-pseudo-regs.
It's a little surprising to me. Looking at the assembly for
Thread_switchTo in both the x86-codegen and the c-codegen, the x86 uses
fewer instructions and less memory traffic, as far as I can tell. The
difference might be that gcc hoists some of the loads and speeds up the
pipeline.
Also, looking at an integer-only self-compile:
For the original mlton.c,
  grep "\(RD\)\|\(SD\)\|\(Real\)" mlton.c | wc -l ==> 326
After eliminating a call to Time.toString,
  grep "\(RD\)\|\(SD\)\|\(Real\)" mlton.c | wc -l ==> 86
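For anyone who wants to repeat the count, here is a minimal sketch of the same search written with an extended regex and grep's -c flag. The sample file is a made-up stand-in for mlton.c (the real file is generated by the compiler), and its contents, including the Real_fromInt name, are purely illustrative.

```shell
#!/bin/sh
# Hypothetical stand-in for mlton.c; these lines are invented for illustration.
sample=$(mktemp)
cat > "$sample" <<'EOF'
RD(0) = Real_Math_ln(RD(1));
SI(4) = RI(2) + 1;
SD(8) = RD(0);
int x = 0;
double r = Real_fromInt(x);
EOF

# Same three alternatives as the pattern above, in extended-regex form.
# -c prints the number of matching lines, so no pipe to wc is needed.
count=$(grep -c -E 'RD|SD|Real' "$sample")
echo "$count"
rm -f "$sample"
```

Note that this counts matching lines, not individual operations, so a line with two real-valued operands is still counted once.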
(And, as a bonus, since Time.toMilliseconds doesn't require floating
point, I've set it up so a verbose trace still prints out the time. The
downside is that this requires a call to IntInf.toString, but I can handle
that.)
The remaining operations on reals seem to be creeping in from the
compiler's basis library files. For example, there is one call to
Real_Math_ln in the mlton.c file. As best I can make out, it
originates in library/basic/real.sml, at the line
  val ln2 = ln two
Now, for some reason, this isn't being uselessed away, even though the
only use of ln2 is in the same file, in
  fun log2 x = ln x / ln2
and I can't find any call to log2 anywhere in mlton.sml.
Since eliminating the Real.fmt call that was hiding behind
Time.toString got rid of about 3/4 of the floating point operations, I'm
guessing that there isn't any single function that is responsible for the
remaining fp ops.
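As a quick check on the "about 3/4" figure, here is the arithmetic on the two grep counts reported earlier (326 and 86). This is only a sketch: the counts are matching lines in mlton.c, which I'm treating as a rough proxy for floating point operations.

```shell
#!/bin/sh
before=326   # matching lines in the original mlton.c
after=86     # matching lines after eliminating the Time.toString call

removed=$((before - after))
# Shell arithmetic is integer-only, so the percentage is truncated.
percent=$((removed * 100 / before))

echo "removed $removed of $before matching lines (about ${percent}%)"
```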