[MLton] ref flattening

Stephen Weeks MLton@mlton.org
Wed, 7 Jul 2004 14:53:13 -0700

> Back a long time ago when I was hacking on the code for the programming 
> language shootout. I turned a very ref heavy version of the statistical 
> moments benchmark into a purely functional version. I was hoping to improve 
> the performance by letting the compiler place record unboxed into the 
> machine registers and avoid the overhead of refs. I remember getting a small 
> enough improvement to submit the change.

One reason you may have only seen a small improvement is that as it
stands right now, moments is mostly a test of Real.fromString.  If you
profile it, you will see that well over half the time is spent there.
Real.fromString is difficult to implement quickly in SML because of
the requirements for getting good-to-the last bit conversions and
getting all the corner cases right.  In MLton, the Real.fromString is
implemented using IEEEReal.scan, which adds another layer of overhead
because of per-digit allocation of data structures (lists of digits
and the like).

I have sped up the implementation of Real.fromString in our CVS, but
there's still a lot of room for improvement.  I'm not willing to go to
great lengths, though, and in particular am not willing to duplicate
the logic of IEEEReal.scan just to save some allocation, until we get
some better motivation.  With the speedups I have put in, the moments
benchmark as compiled by the latest MLton runs in about 2/3 the time
as compiled by the latest release.  But Real.fromString is still at
about 1/2 the total time.  Hopefully, the benchmark will be changed at
some point to focus on real computations and not Real.fromString.

> Now with the ref flattening code. I'm wondering if it's better to go back to 
> the old imperative looking version of statical moments that looks more like 
> the ocaml code.

I don't think ref flattening would affect the imperative versionm, but
if you have the old SML code I'd be willing to try it.  Ref flattening
only applies to refs in data structures.  If refs aren't in data
structures, which is the case for the refs in the OCaml version, then
we have other optimizations (globalization and local ref) that have
been around for a while that will handle those.

> Anyway, might be an interesting micro-benchmark to add to the current suite.
> BTW perhaps we can convince the new benchmark maintainers to include an md5 
> benchmark. :)

Brent listens to this list.  If you want to speak more loudly, send to