val _ = () and exception optimization in MLton
Stephen Weeks
sweeks@intertrust.com
Fri, 11 Aug 2000 13:46:32 -0700 (PDT)
Henry writes (regarding the new exception optimization):
> I certainly think that any optimization which isn't super expensive is good,
> but I wouldn't have thought that this would buy much of anything. It will
> only help in a procedure which has a handle expression and such that within
> the scope of the handler we make non-tail calls but only to procedures which
> can't raise an exception. Even in these cases (which I would think would be
> quite rare), it only saves the cost of basically one cons (onto the list
> are the current handlers), and the space being allocated for the `pair' is
> on the stack.
> I'm curious what the case was that inspired the optimization.
It helps in more cases than you mention.
It was originally inspired so that val _ = () would produce a program with a
single declaration, MLton_halt(0). Obviously very important :-)
More seriously, it means that you only pay for handlers when you make nontail
calls. Handlers are guaranteed to be as fast as jumps for local exits. That is
nice.
More specifically, they help the raise-to-jump optimization become zero cost.
Consider the following SML program.
fun foo n: int =
if n = 0
then raise Fail "yes"
else foo(n - 1)
val _ = if 14 = (foo 13 handle _ => 14)
then ()
else raise Fail "bug"
With the old MLton, this would generate CPS code that did the following.
1. push that handler
2. do the foo loop
3. jump to the handler
4. pop the handler
5. exit.
That is, even though the handler was known, we still had to push it and pop it.
Here is a snippet the CPS code for the old MLton.
fun L_19(x_1008) =
let
val _ = HandlerPop
val x_1071 = MLton_halt(global_4)
in
L_1(x_833)
end
val _ = HandlerPush L_19
fun foo_3(x_1010) =
let
val x_1011 = MLton_eq(x_1010, global_4)
fun L_21() =
let
val x_1012 = Int_sub(x_1010, global_3)
in
foo_3(x_1012)
end
fun L_22() =
L_19(x_950)
in
case x_1011 of
false => L_21 | true => L_22
end
in
foo_3(global_2)
end
I'll let the CPS code for the new MLton speak for itself.
fun main_0() =
let
fun L_40() =
let
val x_809 = MLton_halt(global_4)
in
Bug
end
fun foo_4(x_872) =
let
val x_873 = MLton_eq(x_872, global_4)
fun L_28() =
let
val x_874 = Int_sub(x_872, global_3)
in
foo_4(x_874)
end
in
case x_873 of
false => L_28 | true => L_40
end
in
foo_4(global_2)
end
Note the complete disappearance of handlers. So, the raise-to-jump optimization
does exactly what we want, and (as I suspect is the common case) can remove the
handler entirely.
The optimization will also let us completely remove handlers that are unused,
since the handler pushes and pops will be eliminated entirely.
All in all, it is a cheap optimization (a simple fixed point and two passes over
the program) that almost can't hurt. The way it's done right now, it wraps a
push and pop of the desired handler around every nontail call. It would be
better to do a little local control flow analysis within the function to avoid
pops followed by pushes of the same handler, but I'll wait and see if it looks
like a problem.