[MLton] cvs commit: decreasing liveness information in large SSA functions

Stephen Weeks sweeks@mlton.org
Thu, 1 Jul 2004 07:27:45 -0700


sweeks      04/07/01 07:27:43

  Modified:    mlton/control control.sig control.sml
               mlton/main main.fun
               mlton/ssa zone.fun
  Log:
  MAIL decreasing liveness information in large SSA functions
  
  Implemented a pass ("zone") that runs at the end of the SSA simplifier
  pipeline.  This pass breaks breaks large SSA functions into zones,
  which are connected subgraphs of the dominator tree.  For each zone,
  at the node that dominates the zone (the "zone root"), it places a
  tuple collecting all of the live variables at that node.  It replaces
  any variables used in that zone with offsets from the tuple.
  
  The zones are chosen by walking down the dominator tree and
  recursively "cutting" (choosing a zone root) at every n'th node.  At
  some point, I may switch to an algorithm like the one I described
  early that tries to make zones of equal size.
  
  There are two flags that govern the use of this pass
  
  	-max-function-size <n>
  	-zone-cut-depth <n>
  
  Zone splitting only works when the number of basic blocks in a
  function is > n.  The n used to cut the dominator tree is set by
  -zone-cut-depth.
  
  For now, there is no attempt made to keep the zone roots outside of
  loops.  For main, we rely on -inline-into-main false to ensure there
  are no loops in main.  It should be easy enough to use loop forests to
  fix this.  I've run loopForestSteensgard on Joe's main function
  without any time problems.
  
  There is currently no attempt to be safe-for-space.  That is, the
  tuples are not restricted to containing only "small" values.
  
  This hack significantly improves the liveness problems we've been
  seeing in HOL.  For HOL, the RSSA program size is about 200M.  A
  couple of days ago, we were at about 920M for the machine program
  size, most of which is liveness info.  Switching to -inline-into-main
  false cut that down to 750M.  With zone splitting, we're down to 250M,
  which is acceptable, since there's always some blowup in going from
  RSSA to machine due to things other than liveness.
  
  The big remaining problem is compile time.  It takes 8.6 hours to
  compile HOL.  Here's the breakdown.
  
        pre codegen finished in 456.25 + 260.84 (36% GC)
        x86 code gen finished in 24579.91 + 5485.26 (18% GC)
        Compile C and Assemble finished in 52.59 + 0.00 (0% GC)
        Link finished in 204.34 + 0.00 (0% GC)
  
  The pre codegen only has one performance bug left for this program --
  localRef is taking ~165 seconds (it should take ~5).  Hopefully this
  is as easy as the quadratic problem with commonArg.  With that fix, we
  should be at 10min for the pre codegen, which is acceptable.  I had
  been worried that there would be a problem with the large C file for
  main, which is about 8M.  There are also 74 assembly files, two of
  which are quite large (30M and 50M), with most of the rest being
  closer to the usual 0.5M.  The C compiler and assembler handle these
  well.
  
  Matthew, if you can look into why the x86-codegen is taking so long
  that would be great.  Let me know if you need any other data.  Here
  are the flags I compiled with.
  
  	-basis 1997     \
  	-keep g \
  	-inline-into-main false \
  	-verbose 3      \
  
  One other piece of data.  It's not a property list problem.  Here are
  the stats before and after the codegen.
  
  before:
     numPeeks = 742821096
     maxLength = 12
     average position in property list = 0.081
  
  after:
     numPeeks = 772932930
     maxLength = 12
     average position in property list = 0.115

Revision  Changes    Path
1.102     +2 -0      mlton/mlton/control/control.sig

Index: control.sig
===================================================================
RCS file: /cvsroot/mlton/mlton/mlton/control/control.sig,v
retrieving revision 1.101
retrieving revision 1.102
diff -u -r1.101 -r1.102
--- control.sig	29 Jun 2004 23:22:10 -0000	1.101
+++ control.sig	1 Jul 2004 14:27:43 -0000	1.102
@@ -301,6 +301,8 @@
       val xmlPassesSet: (string -> string list Result.t) ref
       val xmlPasses: string list ref
 
+      val zoneCutDepth: int ref
+
       (*------------------------------------*)
       (*             End Flags              *)
       (*------------------------------------*)



1.127     +7 -2      mlton/mlton/control/control.sml

Index: control.sml
===================================================================
RCS file: /cvsroot/mlton/mlton/mlton/control/control.sml,v
retrieving revision 1.126
retrieving revision 1.127
diff -u -r1.126 -r1.127
--- control.sml	29 Jun 2004 23:22:10 -0000	1.126
+++ control.sml	1 Jul 2004 14:27:43 -0000	1.127
@@ -583,15 +583,20 @@
 			  default = false,
 			  toString = Bool.toString}
 
-val xmlPassesSet : (string -> string list Result.t) ref = 
+val xmlPassesSet: (string -> string list Result.t) ref = 
    control {name = "xmlPassesSet",
 	    default = fn _ => Error.bug ("xmlPassesSet not installed"),
 	    toString = fn _ => "<xmlPassesSet>"}
-val xmlPasses : string list ref = 
+val xmlPasses: string list ref = 
    control {name = "xmlPasses",
 	    default = ["default"],
 	    toString = List.toString String.toString}
 
+val zoneCutDepth: int ref =
+   control {name = "zone cut depth",
+	    default = 100,
+	    toString = Int.toString}
+   
 datatype style = No | Assembly | C | Dot | ML
 
 fun preSuf style =



1.44      +3 -1      mlton/mlton/main/main.fun

Index: main.fun
===================================================================
RCS file: /cvsroot/mlton/mlton/mlton/main/main.fun,v
retrieving revision 1.43
retrieving revision 1.44
diff -u -r1.43 -r1.44
--- main.fun	29 Jun 2004 23:22:11 -0000	1.43
+++ main.fun	1 Jul 2004 14:27:43 -0000	1.44
@@ -438,7 +438,9 @@
 	(fn s =>
 	 case !Control.xmlPassesSet s of
 	    Result.Yes ss => Control.xmlPasses := ss
-	  | Result.No s' => usage (concat ["invalid -xml-pass arg: ", s'])))
+	  | Result.No s' => usage (concat ["invalid -xml-pass arg: ", s']))),
+       (Expert, "zone-cut-depth", " <n>", "zone cut depth",
+	intRef zoneCutDepth)
        ],
        fn (style, name, arg, desc, opt) =>
        {arg = arg, desc = desc, name = name, opt = opt, style = style})



1.4       +193 -41   mlton/mlton/ssa/zone.fun

Index: zone.fun
===================================================================
RCS file: /cvsroot/mlton/mlton/mlton/ssa/zone.fun,v
retrieving revision 1.3
retrieving revision 1.4
diff -u -r1.3 -r1.4
--- zone.fun	30 Jun 2004 21:34:41 -0000	1.3
+++ zone.fun	1 Jul 2004 14:27:43 -0000	1.4
@@ -13,6 +13,7 @@
 type int = Int.t
 
 structure Set = DisjointSet
+
 structure Graph = DirectedGraph
 local
    open Graph
@@ -20,60 +21,211 @@
    structure LoopForest = LoopForest
    structure Node = Node
 end
-   
-fun zoneFunction (f, ac) =
+
+structure Scope = UniqueId ()
+
+fun zoneFunction f =
    let
-      val {args, blocks, mayInline, name, raises, returns, start} =
-	 Function.dest f
-   in
-      if Vector.length blocks <= !Control.maxFunctionSize
-	 then f :: ac
-      else
+      val {args, mayInline, name, raises, returns, start, ...} = Function.dest f
+      datatype z = datatype Exp.t
+      val {get = labelInfo: Label.t -> {isCut: bool ref}, ...} =
+	 Property.get (Label.plist,
+		       Property.initFun (fn _ => {isCut = ref false}))
+      val dominatorTree = Function.dominatorTree f
+      (* Decide which labels to cut at. *)
+      val cutDepth = !Control.zoneCutDepth
+      fun addCuts (Tree.T (b, ts), depth: int) =
+	 let
+	    val depth =
+	       if depth = 0
+		  then
+		     let
+			val Block.T {label, ...} = b
+			val {isCut, ...} = labelInfo label
+			val () = isCut := true
+		     in
+			cutDepth
+		     end
+	       else depth - 1
+	 in
+	    Vector.foreach (ts, fn t => addCuts (t, depth))
+	 end
+      val () = addCuts (dominatorTree, cutDepth)
+      (* Build a tuple of lives at each cut node. *)
+      type info = {componentsRev: Var.t list ref,
+		   numComponents: int ref,
+		   scope: Scope.t,
+		   tuple: Var.t}
+      fun newInfo () =
+	 {componentsRev = ref [],
+	  numComponents = ref 0,
+	  scope = Scope.new (),
+	  tuple = Var.newNoname ()}
+      datatype varInfo =
+	 Global
+	| Local of {blockCache: Var.t option ref,
+		    defScope: Scope.t,
+		    ty: Type.t,
+		    uses: {exp: Exp.t,
+			   scope: Scope.t} list ref}
+      val {get = varInfo: Var.t -> varInfo,
+	   set = setVarInfo, ...} =
+	 Property.getSetOnce (Var.plist,
+			      Property.initFun (fn _ => Global))
+      val blockSelects: {blockCache: Var.t option ref,
+			 statement: Statement.t} list ref = ref []
+      fun addBlockSelects (ss: Statement.t vector): Statement.t vector =
 	 let
-	    val G = Graph.new ()
-	    type node = unit Node.t
-	    val {get = labelInfo: Label.t -> {node: node}, ...} =
-	       Property.get (Label.plist,
-			     Property.initFun (fn _ => {node = Graph.newNode G}))
-	    val labelNode = #node o labelInfo
-	    (* Build control-flow graph. *)
-	    val () =
-	       Vector.foreach
-	       (blocks, fn Block.T {label, transfer, ...} =>
+	    val blockSelectsV = Vector.fromList (!blockSelects)
+	    val () = Vector.foreach (blockSelectsV, fn {blockCache, ...} =>
+				     blockCache := NONE)
+	    val () = blockSelects := []
+	 in
+	    Vector.concat [Vector.map (blockSelectsV, #statement), ss]
+	 end
+      fun define (x: Var.t, ty: Type.t, info: info): unit =
+	 setVarInfo (x, Local {blockCache = ref NONE,
+			       defScope = #scope info,
+			       ty = ty,
+			       uses = ref []})
+      fun replaceVar (x: Var.t,
+		      {componentsRev, numComponents, scope, tuple}: info)
+	 : Var.t =
+	 case varInfo x of
+	    Global => x
+	  | Local {blockCache, defScope, ty, uses, ...} =>
+	       case !blockCache of
+		  SOME y => y
+		| _ => 
+		     if Scope.equals (defScope, scope)
+			then x
+		     else
+			let
+			   fun new () =
+			      let
+				 val offset = !numComponents
+				 val () = List.push (componentsRev, x)
+				 val () = numComponents := 1 + offset
+				 val exp = Select {object = tuple,
+						   offset = offset}
+				 val () = List.push (uses, {exp = exp,
+							    scope = scope})
+			      in
+				 exp
+			      end
+			   val exp =
+			      case !uses of
+				 [] => new ()
+			       | {exp, scope = scope'} :: _ =>
+				    if Scope.equals (scope, scope')
+				       then exp
+				    else new ()
+			   val y = Var.new x
+			   val () = blockCache := SOME y
+			   val () =
+			      List.push
+			      (blockSelects,
+			       {blockCache = blockCache,
+				statement = Statement.T {exp = exp,
+							 ty = ty,
+							 var = SOME y}})
+			in
+			   y
+			end
+      val blocks = ref []
+      fun loop (Tree.T (b, ts), info: info) =
+	 let
+	    val Block.T {args, label, statements, transfer} = b
+	    val {isCut = ref isCut, ...} = labelInfo label
+	    val info' = 
+	       if isCut
+		  then newInfo ()
+	       else info
+	    val define = fn (x, t) => define (x, t, info')
+	    val () = Vector.foreach (args, define)
+	    val statements =
+	       Vector.map
+	       (statements, fn Statement.T {exp, ty, var} =>
 		let
-		   val {node = from, ...} = labelInfo label
+		   val exp = Exp.replaceVar (exp, fn x => replaceVar (x, info'))
+		   val () = Option.app (var, fn x => define (x, ty))
 		in
-		   Transfer.foreachLabel
-		   (transfer, fn l =>
-		    ignore (Graph.addEdge (G, {from = from, to = labelNode l})))
+		   Statement.T {exp = exp, ty = ty, var = var}
 		end)
-	    fun layout f =
-	       let
-		  val {loops, notInLoop} = LoopForest.dest f
-	       in
-		  Layout.record
-		  [("notInLoop", Int.layout (Vector.length notInLoop)),
-		   ("loops", Vector.layout (layout o #child) loops)]
-	       end
-	    datatype count = Many | None | One of Label.t
-	    (* Display classes. *)
-	    val () =
-	       Control.diagnostics
-	       (fn display =>
-		(display (Func.layout name)
-		 ; display (layout
-			    (Graph.loopForestSteensgaard
-			     (G, {root = labelNode start})))))
+	    val transfer =
+		Transfer.replaceVar (transfer, fn x => replaceVar (x, info'))
+	    val statements = addBlockSelects statements
+	    val () = Vector.foreach (ts, fn t => loop (t, info'))
+	    val statements =
+	       if not isCut
+		  then statements
+	       else
+		  let
+		     val {componentsRev, tuple, ...} = info'
+		     val components = Vector.fromListRev (!componentsRev)
+		  in
+		     if 0 = Vector.length components
+			then statements
+		     else
+			let
+			   val componentTys =
+			      Vector.map
+			      (components, fn x =>
+			       case varInfo x of
+				  Global => Error.bug "global component"
+				| Local {ty, uses, ...} =>
+				     (ignore (List.pop uses)
+				      ; {elt = ty,
+					 isMutable = false}))
+			   val components =
+			      Vector.map (components, fn x =>
+					  replaceVar (x, info))
+			   val s =
+			      Statement.T
+			      {exp = Object {args = components, con = NONE},
+			       ty = Type.tuple componentTys,
+			       var = SOME tuple}
+			in
+			   addBlockSelects (Vector.concat [Vector.new1 s,
+							   statements])
+			end
+		  end
+	    val () = List.push (blocks,
+				Block.T {args = args,
+					 label = label,
+					 statements = statements,
+					 transfer = transfer})
 	 in
-	    f :: ac
+	    ()
 	 end
+      val () = loop (dominatorTree, newInfo ())
+      val blocks = Vector.fromList (!blocks)
+   in
+      Function.new {args = args,
+		    blocks = blocks,
+		    mayInline = mayInline,
+		    name = name,
+		    raises = raises,
+		    returns = returns,
+		    start = start}
+   end
+
+fun maybeZoneFunction (f, ac) =
+   let
+      val {args, blocks, mayInline, name, raises, returns, start} =
+	 Function.dest f
+   in
+      if Vector.length blocks <= !Control.maxFunctionSize
+	 then f :: ac
+      else zoneFunction f :: ac
    end
    
 fun zone (Program.T {datatypes, globals, functions, main}) =
    Program.T {datatypes = datatypes,
 	      globals = globals,
-	      functions = List.fold (functions, [], zoneFunction),
+	      functions = List.fold (functions, [], maybeZoneFunction),
 	      main = main}
 
 end
+