local refs
Matthew Fluet
fluet@CS.Cornell.EDU
Thu, 29 Nov 2001 21:44:47 -0500 (EST)
I was getting tired of looking at the code for printing
"toplevel handler not installed" in every program when the toplevel
handler was clearly installed. So, I started in on a local ref
elimination pass, which will also turn accumulation refs into function
local variables.
It's a little over half done, I'd say. The structure is as follows:
for each global
record if it is a Ref_ref and what functions it is use in
for each function
determine localizable refs: variables are of type ref and are only
ever created, assigned to, or derefed (ie., that don't escape into any
datatype, tuple, non-ref related prim argument, or control transfer
argument); this also includes any globals that are only used in this
function
do a dfs walk of the function, recording for each block the set of
localizable refs in scope
(Here, we need to do an analysis to determine which blocks really need
to pass the refs around as arguments; it should be a minimal useless
type analysis, I just haven't implemented it yet. There's no big harm
in always passing the refs into and out of each block, it just means
more work for the shrinker and removeUnused. Also, since there is no
useless pass after localRef, nothing eliminates block arguments that
are the same at all call sites.)
each localizable ref maintains a stack of Var.t's, the top of which is
variable holding the current value of the ref
do a dfs walk of the function,
at each block, add arguments for localizable refs that need to be
passed at this block; push these arguments onto the localizable ref
stack
for each statement, rewrite Ref_ref to a simple copy, rewrite assigns
to a simple copy into a new variable (and pop the top of the stack,
and push the new variable), rewrite derefs intoa simple copy from the
variable at the tope of the stack
for each transfer to a block which needs localizable ref args, create
a "route" block that passes the tops of the stacks as additional args
in a goto to the destination block; (we need a "route" block that has
the same argument type as the original block, because cases, prim,
conts, etc. can't change the argument type; a lot of these will get
shrunk away)
after processing all the blocks' children, pop the top of the stack
for each localizable ref that is passed at this block
That's it. It works as I expect on small examples, but there's a problem
with handlers.
Consider:
val pi' = _ffi "printf": string * int -> int;
val pi = fn i => pi'("%i\n\000", i)
val ps' = _ffi "printf": string * string -> int;
val ps = fn s => ps'("%i\n\000", s)
fun fib 0 = 1
| fib 1 = 1
| fib n = (fib (n + 1)) + (fib (n + 1))
fun tak (x,y,z)
= if not (y < x)
then z
else tak (tak (x - 1, y, z),
tak (y - 1, z, x),
tak (z - 1, x, y))
val w = MLton.Random.rand ()
val func = ref ""
val n = (if Word.andb(w,0wx1) = 0wx0
then (func := "fib";
fib 20)
else (func := "tak";
tak (1,2,3)))
handle _ => (ps (!func); 0)
val _ = pi n
I'd like to localize the func ref; and according to my definition above,
it is localizable, because it is only used in the main function
and is only created, assigned to, and derefed (ie., doesn't escape
anywhere else). But, the SSA code looks like
L_11 ()
x_13: lambdas_0 = Env_1 ()
Ref_assign(lambdas_0) (x_0, x_13)
x_12: word = Ref_deref(word) (global_13)
x_11: word = Word32_mul (global_14, x_12)
x_10: word = Word32_add (x_11, global_15)
Ref_assign(word) (global_13, x_10)
HandlerPush L_5
x_9: word = Word32_andb (x_10, global_18)
x_8: bool = MLton_eq(word) (x_9, global_19)
case x_8 of
false => L_8 | true => L_10
L_10 ()
Ref_assign(string) (global_17, global_20)
L_9 (fib_0 (global_21)) handle L_5
L_9 (x_7: int)
L_6 (x_7)
L_8 ()
Ref_assign(string) (global_17, global_22)
L_7 (tak_0 (global_23)) handle L_5
L_7 (x_6: int)
L_6 (x_6)
L_6 (x_5: int)
HandlerPop L_5
L_4 (x_5)
L_5 ()
HandlerPop L_5
x_4: string = Ref_deref(string) (global_17)
printf (global_24, x_4)
L_4 (global_0)
Just what you would expect. I need to "route" the new values of func to
the handler at L_5, so I get something like this:
L_46 ()
x_13: lambdas_0 = Env_1 ()
x_11: word = Word32_mul (global_14, global_31)
x_10: word = Word32_add (x_11, global_15)
x_9: word = Word32_andb (x_10, global_18)
x_8: bool = MLton_eq(word) (x_9, global_19)
case x_8 of
false => L_42 | true => L_45
L_45 ()
L_43 (fib_0 (global_21)) handle L_44
L_44 ()
L_5 (x_13, global_20, x_10)
L_43 (x_41: int)
L_6 (x_41, x_13, global_20, x_10)
L_42 ()
L_40 (tak_0 (global_23)) handle L_41
L_41 ()
L_5 (x_13, global_22, x_10)
L_5 (x_40: lambdas_0, global_38: string, global_37: word)
printf (global_24, global_38)
L_4 (global_0, x_40, global_38, global_37)
L_40 (x_39: int)
L_6 (x_39, x_13, global_22, x_10)
L_6 (x_5: int, x_38: lambdas_0, global_36: string, global_35: word)
L_4 (x_5, x_38, global_36, global_35)
Actually, you see here that we "route" all the localizable refs to the
handler (and everywhere else); (the lambdas_0 is the toplevel handler ref,
and the word is the MLton.Random seed ref.)
Now, you see the problem. I need to change the handlers at the non-tail
calls, because each call needs to pass a different value for func.
The shrinker has taken the liberty of removing the HandlerPush L_5,
because L_5 isn't used in a handle slot.
I think the useless analysis should figure out that we don't need to route
the lambdas_0 and word localizable refs through the handlers (the value
of each of these refs is the same at the point the handler is installed
and at the point of handler invocation); But, I still need to do something
about the func ref. In this case, there are two choices: 1. make func
non-localizable, 2. rewrite the handlers.
Until the exceptions paper, I guess we need to live with 1. I suspect
that I can fudge the useless analysis to mark func as non-localizable.
After the exceptions paper, we could go either way. The code I have right
now would automatically do 2, although there are arguments for sticking
with 1 (with 2, we need to install 2 handlers (one down each branch)
instead of 1 handler (above the branch)).
Comments? Suggestions for the useless analysis? Suggestions for the
kludge?