[MLton] ref flattening and backpatching
Matthew Fluet
fluet@cs.cornell.edu
Fri, 17 Sep 2004 15:46:00 -0400 (EDT)
One not so uncommon use of ref's in SML is to backpatch circular
data-structures. My understanding of the ref flattening analysis is that
it will not capture this case, because the constructed ref is both put
into a tuple and assigned to directly. Consider the following program:
structure CList =
struct
datatype 'a clist' = Cons of 'a * 'a clist ref
withtype 'a clist = 'a clist' option
fun cnil () = NONE
fun ccons (h, t) = SOME (Cons (h, ref t))
fun match cl nilCase consCase =
case cl of
NONE => nilCase ()
| SOME (Cons (h, t)) => consCase (h, !t)
fun fromList l =
case l of
[] => cnil ()
| h::t => ccons (h, fromList t)
fun repeat x =
let
val r = ref NONE
val cl = SOME (Cons (x, r))
val () = r := cl
in
cl
end
local
val max = 1000
fun length' (cl, n) =
if n >= max
then NONE
else match cl
(fn () => SOME n)
(fn (_,t) => length' (t, n + 1))
in
fun length cl = length' (cl, 0)
end
end
val cl = CList.repeat #"x"
val n = CList.length cl
val () =
case n of
NONE => print "NONE\n"
| SOME n => print (concat ["SOME ", Int.toString n, "\n"])
val cl = CList.fromList [1,2,3,4,5,6,7,8,9]
val n = CList.length cl
val () =
case n of
NONE => print "NONE\n"
| SOME n => print (concat ["SOME ", Int.toString n, "\n"])
Looking at refFlatten.pre and refFlatten.post, I see the following:
refFlatten.pre:
option_0 = NONE_2 of (unit) | SOME_0 of ((option_0 ref))
option_1 = NONE_0 of (unit) | SOME_1 of ((option_1 ref))
refFlatten.post:
option_0 = NONE_2 of (unit) | SOME_0 of (option_0 ref)
option_1 = NONE_0 of (unit) | SOME_1 of ((option_1 ref))
This makes sense -- monomorphisation duplicated the code for the
char CList.clist and the int CList.clist, but since the list elements are
unused, they are both simplified. (Interestingly, while one might
consider it advantageous to unify isomorphic types, by keeping them
separate, the refFlatten analysis that works per datatype is more
accurate.) The int CList.clist doesn't do anything interesting with the
ref component, so no surprise that it is flattened.
It should also be the case that the int CList.clist never mutates the ref
component. I suspect it's fairly trivial to extend the refFlatten
analysis to also determine when the ref can be flattened as an immutable
field (and, less interesting, the converse, when the ref is only written
and never read, in which case the field is essentially useless).
Although, I'm not sure whether this buys much -- as far as I know, the
only additional cost we pay for a flattened ref is the card mark when it
is mutated, which would be zero in the case I'm describing.
Unfortunately, the char CList.clist is not flattened. As I understand it,
this arises because of the repeat function, which creates the ref, puts
it in the datatype, and also directly updates the ref. The ssa2 for this
sequence looks like:
r_0: (option_1 ref) = (global_24)
x_671: SOME_1 = SOME_1 (r_0)
cl_0: option_1 = x_671: option_1
r_0 := cl_0
length'_1 (global_4, cl_0) NonTail {cont = L_317, handler = Dead}
Note that cl_0 (hence, x_671) is live after the update, so we'd be well
within space safety to update the (flattened) ref component of x_671.
I suspect that it would also be possible to accomodate this in the
refFlatten analysis. A simple option would be to be a little more liberal
with a ref in the block in which it is created -- i.e., as in the case
above, if the ref were flattened, we could still "get at it" in the
data-structure.