MLton's closure conversion

Tue, 10 Oct 2000 15:57:01 -0700 (PDT)

> Interesting.  Note, one thing we can do is to delay any selection (or some of
> the selections) until you get to a split in the control flow such  that  some
> of  the  environment  is  dead  in one of the branches.  Since everything was
> flattened out already in the F2 example (non-curried case) this wouldn't help
> there,  I  could  imagine big improvements otherwise.  I.e., in one branch of
> that all the environment is dead, and in the other all of it is  alive:  thus
> you  would  just  delay  all selects (eliminating them from the fast case and
> delaying further in the slow case).

I sort of see what you're saying.  Now that I think about it, it would be nice
if we could keep the closure converter as is, and express what we want as a CPS
to CPS transformation -- something that delays selects as long as possible.

> The general problem of currying also is interesting.  We tend not to use  it,
> so  it  doesn't  show  up  very much in our code.  We could transform curried
> functions into curried calls to uncurried ones.  I.e., replace
> 
>     fun f a b c =
>            if a > 0
>               then f (b + c) x (f y z w)
>            else 17
> 
> to
>     fun f' (a, b, c) =
>            if a > 0
>               then f' (b + c, x, f' (y, z, w))
>               else 17
> 
>     fun f a b c = f' (a, b, c)
> 
> and then use the inliner to convert the f calls to f' in the case that it  is
> completely called.

Yeah, I guess we really should put an uncurrier in.  After all, it can't hurt,
can it?  We know it doesn't help much on the standard benchmarks since they're
not very curried.  But it certainly would have helped on that ported OCAML
code.  Here's some old mail from Suresh about an uncurrier he wrote.  Suresh,
could you send me the latest version of that uncurrier, and I'll look into
cleaning it up and putting it in.

> From: Suresh Jagannathan <suresh@research.nj.nec.com>
> To: MLton@research.nj.nec.com
> Subject: preliminary times w/uncurrier 
> Date: Mon, 22 Nov 1999 20:37:17 -0500
> 
> 
> Here are times for representative small benchmarks gathered running on my local
> machine.  Basically, (except for life) uncurrying doesn't buy a whole lot.  The
> implementation I'm using is what Steven originally suggested: a bottom-up walk
> of the program replacing fully applied curried calls with their uncurried
> equivalent.  Quickly glancing at the programs (in their uncurried and curried
> forms) reveals a number of places where curried functions are indeed replaced,
> but I suspect that by and large the benefit of replacing a function call or two
> is partially masked by the need to cons a new tuple.  Of course, I haven't been
> able to get the kit to run yet.  That's my next step, but unless something very
> unusual is taking place there, these numbers imply we're unlikely to see any
> significant improvement.
> 
> 
> Benchmark      Time (uncurried)  Time (curried)
> ===============================================
> barnes-hut         17.34              17.72
> count-graphs       19.37              20.33
> knuth-bendix       22.18              22.88
> lexgen             47.60              47.02
> life               73.02              79.17
> logic              79.17              78.12
> mandelbrot         19.20              19.12
> matrix-multiply    18.21              18.21
> nucleic            33.72              33.51
> simple             27.34              28.47
> tsp                38.46              38.41
> zern               67.08              67.47