[MLton] Papers for improved closure representation?

Stephen Weeks MLton@mlton.org
Sun, 19 Oct 2003 11:39:25 -0700


> Do you have any papers in mind for the optimization suggestion named
> on the projects page?

No.  It was actually based an idea I had a long time ago.  I sent an
email to the MLton list back in October 1999 describing it.  I've
appended the thread.

> I know the usual way of representing a closure as a heap-object
> containing escaping variables and a function pointer. I also know it
> might be better to pass the heap-object and function pointer in 2
> registers instead of passing a single pointer directly to the object
> (but the details when considering a low-register CPU gets murky for
> me). But from there I know nothing.

The idea is at a much higher level and has to do with how the free
variables are represented, especially when they are bound to other
known lambdas.  The idea is suitable for implementation in the current
closure conversion pass that maps a monomorphic higher-order IL
(called SXML in MLton) into a monomorphic first-order IL (called SSA
in MLton).

Here's the old mail thread.  At the time, the monomorphic first-order
IL was based on CPS, not SSA.  As you can see, the idea never made it
to the front of the queue.  But it still seems reasonable and not too
hard to implement.  Although 1 day is no doubt optimistic, especially
for someone not as familiar with the compiler as me.

--------------------------------------------------------------------------------

From: "Stephen Weeks" <sweeks@intertrust.com>
To: MLton@research.nj.nec.com
Date: Fri, 15 Oct 1999 16:46:58 -0700 (PDT)
Subject: flatter closure conversion

I have a new (I think) closure conversion algorithm that I would like
to propose.  It doesn't have anything to do with the control-flow
aspects of closure conversion -- it only has to do with the
environment record representation.  I claim that it produces no larger
closure representations than the usual flat closure conversion, and
often smaller representations.  The idea is in closures that are
closed over other closures, to flatten out entirely the inner closure
within the outer one (and transitively any closures it is closed over, 
etc.).

The effect can be achieved by a slight modification of the usual free
variable function that treats lambdas bound in lexically enclosing
scopes "transparently".

Consider the following language:

e ::= x | fn x => e | e e | let x = e in e end

In the following, assume that all bound variables in a program are
distinct.

Here is the usual free variable function.

FV: Exp -> P(Var)
FV(x) = {x}
FV(fn x => e) = FV(e) - {x}
FV(e1 e2) = FV(e1) U FV(e2)
FV(let x = e1 in e2 end) = FV(e1) U (FV(e) - {x})

Here is my new "closure variables" function.  The intuition is that
the closure variables of lambda is the set of variables that will be
stored in the closure record for the lambda.

CV: Exp x (Var -> P(Var)) -> P(Var)
CV(x, m) = m(x) if x in dom(m)
           {x}  otherwise
CV(fn x => e, m) = CV(e, m) - {x}
CV(e1 e2, m) = CV(e1, m) U CV(e2, m)
CV(let x = e1 in e2, m) = V U (CV(e2, m') - {x})
                          where V = CV(e1, m)
                                m' = if e1 is fn y => e1'
                                     then m[x -> V]
                                     else m

Let CV(fn x => e) denote the set of closure variables for the lambda
as computed above.  Also let T(fn x => e) denote (... yi, ...), where
{... yi, ...} is CV(fn x => e). 

Now, the closure conversion of fn x => e is just T(fn x => e)
(possibly with a code pointer, but I'm ignoring those).  The code
built for fn x => e is

fun f(r, x) =
	let ...
		yi = #i r
	    ...
            ...
	    fi = T(fn xi => ei)
	    ...
	in <closure conversion of e>
	end

where the fi's are the lambdas that are in an enclosing scope of e and 
fi occurs in e.  I.e. the input program looks like
	let fi = fn xi => ei
	in ... (fn x => e)
	end

Here is an example.

val f = fn _ => ... a ... b ...
val g = fn _ => ... b ...
val h = fn _ => ... f ... a ... g ... c ...
val i = fn _ => ... h ... d ...

The closure conversion is

val f = (a, b)
val g = (b)
val h = (a, b, c)
val i = (a, b, c, d)

fun F((a, b), _) = ... a ... b ...
fun G((b), _) = ... b ...
fun H((a, b, c), _) =	let val f = (a, b)
			    val g = (b)
			in ... f ... a ... g ... c ...
			end
fun I((a, b, c, d), _) = let val h = (a, b, c)
			 in ... h ... d ...
			 end

I remind you that traditional "flat" closure conversion would do
something like the following.

val f = (a, b)
val g = (b)
val h = (a, c, f, g)
val i = (h, d)

If you look at box and pointer diagrams of these "flat" closures, it
hopefully becomes clear why I call my version flatter.

In fact one can imagine a whole family of strategies ranging from
traditional flat closure conversion to the flatter variant I mention
here, depending on the level of transparency of lambdas.

I don't think this has any problems with space safety (in my usual
weakened sense where I get to pick the constant blowup per program).
Unlike the infamous Shao closure conversion paper, I'm not doing any
sharing of closures, so I don't have to do any lifetime analysis.

While it may appear that this approach will do significantly more
allocation than flat closure conversion, I believe that subsequent Cps
optimization (esp flattening) will eliminate most of the allocation.

Comments are appreciated.  If it sounds sensible, I may move it near
the front of my MLton queue, because I think it sounds pretty easy to
implement (1 day).

--------------------------------------------------------------------------------

From: Henry Cejtin <henry@clairv.com>
To: MLton@research.nj.nec.com, sweeks@intertrust.com
Date: Fri, 15 Oct 1999 19:08:52 -0500
Subject: Re: flatter closure conversion

I  assume  that in the `usual free variable function' you meant the last rule
to be
        FV(let x = e1 in e2 end) = FV(e1) U (FV(e2) - {x})

Basically you are delaying the construction of closures.   This  might  be  a
win,  but  it  might be a loss.  If you can avoid building the closure at all
(lets say because of some inlining) then you win.  On  the  other  hand,  you
might  end  up  building  the  closure  many  times,  in which case you lose.
Perhaps it is still safe-for-space (in our slightly weakened sense  that  the
blow up is a constant dependent on the program), but it is still a loss.

To be specific, consider the following example:

        let val f = fn _ => a + b
            val g = fn _ => f
        in ...
        end

In  the  straight  `flat' closures, each call to g will allocate nothing.  In
your version, each call to g will allocate  the  tuple.   (I'm  ignoring  the
potential  optimization of noticing that the environment for f is the SAME as
that for g, so you can skip the construction.  The example can be  goosed  to
make that not happened.)

Are  you convinced that it really is going to be a win over all?  I don't see
why.  (Mind you, I'm sure that there are programs where it would be.)

--------------------------------------------------------------------------------

From: Henry Cejtin <henry@clairv.com>
To: MLton@research.nj.nec.com, sweeks@intertrust.com
Date: Fri, 15 Oct 1999 19:08:52 -0500
Subject: Re: flatter closure conversion

I  assume  that in the `usual free variable function' you meant the last rule
to be
        FV(let x = e1 in e2 end) = FV(e1) U (FV(e2) - {x})

Basically you are delaying the construction of closures.   This  might  be  a
win,  but  it  might be a loss.  If you can avoid building the closure at all
(lets say because of some inlining) then you win.  On  the  other  hand,  you
might  end  up  building  the  closure  many  times,  in which case you lose.
Perhaps it is still safe-for-space (in our slightly weakened sense  that  the
blow up is a constant dependent on the program), but it is still a loss.

To be specific, consider the following example:

        let val f = fn _ => a + b
            val g = fn _ => f
        in ...
        end

In  the  straight  `flat' closures, each call to g will allocate nothing.  In
your version, each call to g will allocate  the  tuple.   (I'm  ignoring  the
potential  optimization of noticing that the environment for f is the SAME as
that for g, so you can skip the construction.  The example can be  goosed  to
make that not happened.)

Are  you convinced that it really is going to be a win over all?  I don't see
why.  (Mind you, I'm sure that there are programs where it would be.)

--------------------------------------------------------------------------------

From: "Stephen Weeks" <sweeks@intertrust.com>
To: MLton@research.nj.nec.com
Date: Sat, 16 Oct 1999 22:56:34 -0700 (PDT)
Subject: Re: flatter closure conversion

> Basically you are delaying the construction of closures.   This  might  be  a
> win,  but  it  might be a loss.  If you can avoid building the closure at all
> (lets say because of some inlining) then you win.  On  the  other  hand,  you
> might  end  up  building  the  closure  many  times,  in which case you lose.

If this happens, I agree. 

> Are  you convinced that it really is going to be a win over all?  I don't see
> why.  (Mind you, I'm sure that there are programs where it would be.)

What I believe to be the mitigating factor in this situation is the
flattener, and the fact that even when the inner closure appears to be
constructed in the output of the closure converter, after subsequent
optimization, the closure variables will just be passed along.  This
is unfortunately not happening right now because the current flattener
only goes two levels deep.  I also think there will be a space *win*
in practice because of avoiding redundancy in closures.  I also think
there will be a win in compile time, because the output of the closure
converter will have much smaller types, and hence subsequent cps
optimization (esp constant propagation, useless, and flattening) will
be better.

If you look at the cps output for, say, the self compile, you will see 
lots of deeply nested types for closure records, many of which appear
to contain redundant components.

--------------------------------------------------------------------------------

From: Henry Cejtin <henry@clairv.com>
To: sweeks@intertrust.com
Date: Sun, 17 Oct 1999 21:01:45 -0500
Subject: Re: flatter closure conversion

Isn't this a sign that the closure converter would be better off by actually
looking at the flow graph to decide when to construct the tuple?  Mind you,
it would make things more complicated.

--------------------------------------------------------------------------------

From: "Stephen Weeks" <sweeks@wasabi.epr.com>
To: Henry Cejtin <henry@clairv.com>
Date: Mon, 18 Oct 1999 10:13:30 -0700 (PDT)
Subject: Re: flatter closure conversion

> Isn't this a sign that the closure converter would be better off by actually
> looking at the flow graph to decide when to construct the tuple?  Mind you,
> it would make things more complicated.

In principle, yes.  But given the subsequent flatenning, it's pretty
hard to figure out much at compile time.  After I implement the naive
version (delay everything), maybe I'll look into trickier versions.

--------------------------------------------------------------------------------

_______________________________________________
MLton mailing list
MLton@mlton.org
http://www.mlton.org/mailman/listinfo/mlton