val _ = () and exception optimization in MLton

Fri, 11 Aug 2000 13:46:32 -0700 (PDT)

Henry writes (regarding the new exception optimization):

> I certainly think that any optimization which isn't super expensive is good,
> but I wouldn't have thought that this would buy much of anything.  It will
> only help in a procedure which has a handle expression and such that within
> the scope of the handler we make non-tail calls but only to procedures which
> can't raise an exception.  Even in these cases (which I would think would be
> quite rare), it only saves the cost of basically one cons (onto the list
> are the current handlers), and the space being allocated for the `pair' is
> on the stack.
> I'm curious what the case was that inspired the optimization.

It helps in more cases than you mention.

It was originally inspired so that val _ = () would produce a program with a
single declaration, MLton_halt(0).  Obviously very important :-)

More seriously, it means that you only pay for handlers when you make nontail
calls.  Handlers are guaranteed to be as fast as jumps for local exits.  That is
nice.

More specifically, they help the raise-to-jump optimization become zero cost.
Consider the following SML program.

fun foo n: int =
   if n = 0
      then raise Fail "yes"
   else foo(n - 1)

val _ = if 14 = (foo 13 handle _ => 14)
	   then ()
	else raise Fail "bug"

With the old MLton, this would generate CPS code that did the following.
  1. push that handler
  2. do the foo loop
  3. jump to the handler
  4. pop the handler
  5. exit.
That is, even though the handler was known, we still had to push it and pop it.
Here is a snippet the CPS code for the old MLton.

      fun L_19(x_1008) = 
	  let
	     val _ = HandlerPop
	     val x_1071 = MLton_halt(global_4)
	  in
	     L_1(x_833)
	  end
      val _ = HandlerPush L_19
      fun foo_3(x_1010) = 
	  let
	     val x_1011 = MLton_eq(x_1010, global_4)
	     fun L_21() = 
		 let
		    val x_1012 = Int_sub(x_1010, global_3)
		 in
		    foo_3(x_1012)
		 end
	     fun L_22() = 
		 L_19(x_950)
	  in
	     case x_1011 of
	       false => L_21 | true => L_22
	  end
   in
      foo_3(global_2)
   end

I'll let the CPS code for the new MLton speak for itself.

fun main_0() = 
   let
      fun L_40() = 
	 let
	    val x_809 = MLton_halt(global_4)
	 in
	    Bug
	 end
      fun foo_4(x_872) = 
	 let
	    val x_873 = MLton_eq(x_872, global_4)
	    fun L_28() = 
	       let
		  val x_874 = Int_sub(x_872, global_3)
	       in
		  foo_4(x_874)
	       end
	 in
	    case x_873 of
	      false => L_28 | true => L_40
	 end
   in
      foo_4(global_2)
   end

Note the complete disappearance of handlers.  So, the raise-to-jump optimization 
does exactly what we want, and (as I suspect is the common case) can remove the
handler entirely.

The optimization will also let us completely remove handlers that are unused,
since the handler pushes and pops will be eliminated entirely.

All in all, it is a cheap optimization (a simple fixed point and two passes over
the program) that almost can't hurt.  The way it's done right now, it wraps a
push and pop of the desired handler around every nontail call.  It would be
better to do a little local control flow analysis within the function to avoid
pops followed by pushes of the same handler, but I'll wait and see if it looks
like a problem.