[MLton] an analysis to flatten refs into tuples

Sat, 15 May 2004 14:48:48 -0700

I am planning on implementing an SSA pass to flatten ref cells into
tuples.  The idea is to replace, where possible, a type like

	(int ref * real)

with a type like

	(int[m] * real)

where the "[m]" indicates a mutable field of a tuple. 

The savings is obvious, I hope.  We avoid an extra heap-allocated
object for the ref, which in the above case saves two words.  We also
save the time and code for the extra indirection at each get and set.
There are lots of useful data structures (singly-linked and
doubly-linked lists, union-find, fibonacci heaps, ...) that I believe
we are paying through the nose right now because of the absence of ref
flattening.

Obviously, implementing this requires changing the SSA type system and
IL to add the new mutable field in tuple types and new expressions to
get and set mutable fields.  I think that's all pretty
straightforward.  More challenging is the analysis to determine when
such a transformation is safe.  For example, it is unsafe if the ref
is shared with another data structure or needs to be used in a
first-class way.

Here's a simple analysis that I think is sound and gets lots of cases.
The idea is to combine a local analysis with a simple global dataflow
analysis.  The local analysis checks when a ref creation corresponds
to a tuple creation (i.e., we see val x = ref y; val z = (x, w)) and
when a ref that is extracted is only ever assigned to (i.e. we see val
r = #2 z; val () = r := x).  The global analysis uses abstract values
similar to those we use for whole-program constant propagation or
useless analysis.  Abstract values mirror the type structure, and
every abstract tuple field that contains a ref has an annotation that
says whether the ref can be flattened or not, with the default being
to flatten.  We then do a union-find-based flow analysis over the
whole program to make sure that the annotations are consistent (we
could use <= constraints, but I don't want to think about inserting
the corresponding coercions, which may not even be possible).
Finally, according to the results of the local analysis, we mark
certain fields as not flattenable (and the union-find guarantees we
are consistent).

The transformation according to the annotations is then
straightforward.  We change the types, introducing mutable fields
where the refs are flattened, and we replace gets and sets of
flattened refs with gets and sets of the appropriate tuple field.

Thoughts?

The only problem I can see is one of space safety.  By keeping the
containing tuple alive just so we can access the field, we are
conceivably holding on to data that would otherwise be dead.  In
practice I don't know if this would be bad.  If it is, perhaps the
local analysis can be made strict enough to avoid such cases.