[MLton-commit] r7212
Wesley Terpstra
wesley at mlton.org
Fri Jul 10 09:01:10 PDT 2009
Added a new SSA pass, combineConversions, that eliminates or combines
chained type conversions. It can eliminate costly conversions through
LargeWord, but not through IntInf: IntInf conversions tag variables before
conversion, which blocks the analysis.
Once conversions are combined, it becomes possible to identify useless
Overflow tests. For example, consider this small program:
val a = ... something that can't be optimized away
val b = (Int16.fromInt o Int8.toInt) a
val () = print (Int16.toString b ^ "\n")
Without the pass, the SSA looks like this:
L_942 (a_38: word8)
  x_31588: word32 = WordS8_extdToWord32 (a_38)
  x_33834: word16 = WordS32_extdToWord16 (x_31588)
  x_33833: word32 = WordS16_extdToWord32 (x_33834)
  x_34117: bool = Word32_equal (x_31588, x_33833)
  case x_34117 of
    true => L_3098 | false => L_3097
L_3097 ()
  L_5 (x_31586, global_21)
L_3098 ()
  Thread_atomicBegin ()
  x_33836: word32 = Thread_atomicState ()
  x_34116: bool = Word32_equal (x_33836, global_17)
  case x_34116 of
    true => L_3101 | false => L_3100
Just after combineConversions it compiles to:
L_942 (a_38: word8)
  x_31586: word32 = WordS8_extdToWord32 (a_38)
  x_33832: word16 = WordS8_extdToWord16 (a_38)
  x_33831: word32 = WordS8_extdToWord32 (a_38)
  x_34115: bool = Word32_equal (x_31586, x_33831)
  case x_34115 of
    true => L_3098 | false => L_3097
L_3097 ()
  L_5 (x_31584, global_21)
L_3098 ()
  Thread_atomicBegin ()
  x_33834: word32 = Thread_atomicState ()
  x_34114: bool = Word32_equal (x_33834, global_17)
  case x_34114 of
    true => L_3101 | false => L_3100
The Overflow test can now be eliminated: x_33831 and x_31586 are the same
expression, so the comparison x_34115 is always true:
L_942 (a_38: word8)
  x_33832: word16 = WordS8_extdToWord16 (a_38)
  Thread_atomicBegin ()
  x_33834: word32 = Thread_atomicState ()
  x_34114: bool = Word32_equal (global_17, x_33834)
  case x_34114 of
    true => L_3101 | false => L_3100
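To see why the test is redundant, the three conversions from the first dump
can be modelled outside the compiler. The following is a minimal sketch, not
MLton code: pow2, extd, and overflowTestPasses are hypothetical helpers, and
word values are modelled as non-negative LargeInt.int numbers rather than real
machine words. It checks every possible 8-bit input.

```sml
(* Model only: a w-bit word is a LargeInt.int in the range [0, 2^w). *)
type big = LargeInt.int

(* 2^n for a small non-negative n. *)
fun pow2 (n: int): big = if n = 0 then 1 else 2 * pow2 (n - 1)

(* Assumed model of Word_extdToWord (src, dst, {signed}). *)
fun extd (src: int, dst: int, signed: bool) (x: big): big =
  if dst <= src
  then x mod pow2 dst                    (* truncate: keep the low dst bits *)
  else if signed andalso x >= pow2 (src - 1)
  then x + pow2 dst - pow2 src           (* sign-extend: fill high bits with 1s *)
  else x                                 (* zero-extend, or non-negative value *)

(* The Overflow test from the first SSA dump, for one 8-bit input. *)
fun overflowTestPasses (a: big): bool =
  let
    val wide = extd (8, 32, true) a         (* x_31588 *)
    val narrow = extd (32, 16, true) wide   (* x_33834 *)
    val back = extd (16, 32, true) narrow   (* x_33833 *)
  in
    wide = back                             (* x_34117 *)
  end

(* Try all 256 inputs. *)
val alwaysPasses =
  List.all overflowTestPasses (List.tabulate (256, LargeInt.fromInt))
```

In this model alwaysPasses is true: narrowing a sign-extended 8-bit value to
16 bits never loses information, so the false branch is dead.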
The algorithm implemented works as follows:
 * It processes each block in DFS order (to visit definitions before uses):
 *   If the statement is not a PrimApp Word_extdToWord, skip it.
 *   After processing a conversion, it tags the Var for subsequent use.
 *   When inspecting a conversion, check whether the Var operated on is also
 *   the result of a conversion. If it is, try to combine the two operations.
 *   Repeatedly simplify until hitting either a non-conversion Var or a
 *   case where the conversions cause an effect.
 *
 * The optimization rules are very simple:
 *   x1 = ...
 *   x2 = Word_extdToWord (W1, W2, {signed=s1}) x1
 *   x3 = Word_extdToWord (W2, W3, {signed=s2}) x2
 *
 *   W1 = width(x1), W2 = width(x2), W3 = width(x3)
 *
 *   If W1=W2, then there are no conversions before x1.
 *   This is guaranteed because W2=W3 will always trigger optimization.
 *
 *   Case W1 <= W3 <= W2:
 *     x3 = Word_extdToWord (W1, W3, {signed=s1}) x1
 *   Case W1 <  W2 <  W3 AND (NOT s1 OR s2):
 *     x3 = Word_extdToWord (W1, W3, {signed=s1}) x1
 *   Case W1 =  W2 <  W3:
 *     do nothing; there are no conversions past W1 and x2 = x1.
 *
 *   Case W3 <= W2 <= W1:                             ]
 *     x3 = Word_extdToWord (W1, W3, {whatever}) x1   ] W3 <= W1 && W3 <= W2
 *   Case W3 <= W1 <= W2:                             ] just clip x1
 *     x3 = Word_extdToWord (W1, W3, {whatever}) x1   ]
 *
 *   Case W2 <  W1 <= W3: unoptimized                 ] W2 < W1 && W2 < W3
 *   Case W2 <  W3 <= W1: unoptimized                 ] has side-effect: truncation
 *
 *   Case W1 <  W2 <  W3 AND s1 AND (NOT s2): unoptimized
 *     ... each conversion affects the result separately
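
The rule table above can be exercised with a small model. This is a sketch
under stated assumptions, not the code of the pass: extd models
Word_extdToWord on plain ints, combine restates the three combining cases for
a single step (the pass itself repeats the step along a chain of
conversions), and all names are hypothetical. The check is exhaustive only
for widths 1 to 5.

```sml
(* 2^n for a small non-negative n. *)
fun pow2 0 = 1
  | pow2 n = 2 * pow2 (n - 1)

(* Assumed model of Word_extdToWord (src, dst, {signed}) on ints in [0, 2^src). *)
fun extd (src, dst, signed) x =
  if dst <= src
  then x mod pow2 dst                    (* truncate *)
  else if signed andalso x >= pow2 (src - 1)
  then x + pow2 dst - pow2 src           (* sign-extend *)
  else x                                 (* zero-extend *)

(* One combining step: SOME merged conversion, or NONE for "unoptimized". *)
fun combine ((w1, w2, s1), (_, w3, s2)) =
  if (w1 <= w3 andalso w3 <= w2)
     orelse (w1 < w2 andalso w2 < w3 andalso (not s1 orelse s2))
     orelse (w3 <= w1 andalso w3 <= w2)
  then SOME (w1, w3, s1)
  else NONE

(* Whenever the rules combine, the merged conversion must agree with the
   two-step conversion on every w1-bit input. *)
fun agrees (w1, w2, w3, s1, s2) =
  case combine ((w1, w2, s1), (w2, w3, s2)) of
    NONE => true
  | SOME c =>
      List.all
        (fn x => extd c x = extd (w2, w3, s2) (extd (w1, w2, s1) x))
        (List.tabulate (pow2 w1, fn x => x))

val widths = [1, 2, 3, 4, 5]
val signs = [false, true]

(* Every combination of three widths and two signedness flags. *)
val rulesAreSound =
  List.all (fn w1 =>
  List.all (fn w2 =>
  List.all (fn w3 =>
  List.all (fn s1 =>
  List.all (fn s2 => agrees (w1, w2, w3, s1, s2))
    signs) signs) widths) widths) widths
```

For instance, combine ((8, 32, true), (32, 16, true)) gives
SOME (8, 16, true), which corresponds to the WordS8_extdToWord16 statement in
the second dump, while combine ((32, 16, true), (16, 32, true)) gives NONE
because truncating and then extending can change the value.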
I ran the benchmark suite three times and only 'checksum' has a significant
and reproducible change:
MLton0 -- mlton -drop-pass combineConversions
MLton1 -- mlton

run time ratio
benchmark  MLton1
checksum   0.45

size
benchmark  MLton0   MLton1
checksum   187,726  186,254

compile time
benchmark  MLton0  MLton1
checksum   4.52    4.61

run time
benchmark  MLton0  MLton1
checksum   36.84   16.43
... which is not terribly surprising, since checksum and md5sum are the only
tests that make use of type conversions, and md5sum is dominated by md5.
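
As a quick sanity check, the ratio column can be recomputed from the run time
and size rows. The value names below are only for illustration.

```sml
(* Figures copied from the checksum rows above. *)
val runTimeMLton0 = 36.84
val runTimeMLton1 = 16.43

(* Run time with the pass divided by run time without it. *)
val ratio = runTimeMLton1 / runTimeMLton0

(* Difference between the two size figures. *)
val sizeSaved = 187726 - 186254
```

Here ratio rounds to 0.45, matching the table, and sizeSaved is 1472.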
----------------------------------------------------------------------
A mlton/trunk/mlton/ssa/combine-conversions.fun
A mlton/trunk/mlton/ssa/combine-conversions.sig
U mlton/trunk/mlton/ssa/simplify.fun
U mlton/trunk/mlton/ssa/sources.cm
U mlton/trunk/mlton/ssa/sources.mlb
----------------------------------------------------------------------
Added: mlton/trunk/mlton/ssa/combine-conversions.fun
===================================================================
--- mlton/trunk/mlton/ssa/combine-conversions.fun 2009-07-07 21:46:07 UTC (rev 7211)
+++ mlton/trunk/mlton/ssa/combine-conversions.fun 2009-07-10 16:01:09 UTC (rev 7212)
@@ -0,0 +1,150 @@
+(* Copyright (C) 2009 Wesley W. Tersptra.
+ *
+ * MLton is released under a BSD-style license.
+ * See the file MLton-LICENSE for details.
+ *)
+
+functor CombineConversions (S: COMBINE_CONVERSIONS_STRUCTS): COMBINE_CONVERSIONS =
+struct
+
+open S
+
+(*
+ * This pass looks for nested calls to (signed) extension/truncation.
+ *
+ * It processes each block in dfs order: (to visit definition before uses)
+ * If the statement is not a PrimApp Word_extdToWords, skip it.
+ * After processing a conversion, it tags the Var for subsequent use.
+ * When inspecting a conversion, check if the Var operated on is also the
+ * result of a conversion. If it is, try to combine the two operations.
+ * Repeatedly simplify until hitting either a non-conversion Var or a
+ * case where the conversions cause an effect.
+ *
+ * The optimization rules are very simple:
+ * x1 = ...
+ * x2 = Word_extdToWord (W1, W2, {signed=s1}) x1
+ * x3 = Word_extdToWord (W2, W3, {signed=s2}) x2
+ *
+ * W1 = width(x1), W2 = width(x2), W3 = width(x3)
+ *
+ * If W1=W2, then there is no conversions before x_1.
+ * This is guaranteed because W2=W3 will always trigger optimization.
+ *
+ * Case W1 <= W3 <= W2:
+ * x3 = Word_extdToWord (W1, W3, {signed=s1}) x1
+ * Case W1 < W2 < W3 AND (NOT signed1 OR signed2):
+ * x3 = Word_extdToWord (W1, W3, {signed=s1}) x1
+ * Case W1 = W2 < W3
+ * do nothing; there are no conversions past W1 and x2 = x1.
+ *
+ * Case W3 <= W2 <= W1: ]
+ * x_3 = Word_extdToWord (W1, W3, {whatever}) x1 ] W3 <= W1 && W3 <= W2
+ * Case W3 <= W1 <= W2: ] just clip x1
+ * x_3 = Word_extdToWord (W1, W3, {whatever}) x1 ]
+ *
+ * Case W2 < W1 <= W3: unoptimized ] W2 < W1 && W2 < W3
+ * Case W2 < W3 <= W1: unoptimized ] has side-effect: truncation
+ *
+ * Case W1 < W2 < W3 AND signed1 AND (NOT signed2): unoptimized
+ * ... each conversion affects the result separately
+ *)
+
+val { get, set, ... } =
+ Property.getSetOnce (Var.plist, Property.initConst NONE)
+
+fun rules x3 (conversion as ((W2, W3, {signed=s2}), x2)) =
+ let
+ val { <, <=, ... } = Relation.compare WordSize.compare
+
+ fun stop () = set (x3, SOME conversion)
+ fun loop ((W1, _, {signed=s1}), x1) =
+ rules x3 ((W1, W3, {signed=s1}), x1)
+ in
+ case get x2 of
+ NONE => stop ()
+ | SOME (prev as ((W1, _, {signed=s1}), _)) =>
+ if W1 <= W3 andalso W3 <= W2 then loop prev else
+ if W1 < W2 andalso W2 < W3 andalso (not s1 orelse s2)
+ then loop prev else
+ if W3 <= W1 andalso W3 <= W2 then loop prev else
+ (* If W2=W3, we never reach here *)
+ stop ()
+ end
+
+fun markStatement stmt =
+ case stmt of
+ Statement.T { exp = Exp.PrimApp { args, prim, targs=_ },
+ ty = _,
+ var = SOME v } =>
+ (case Prim.name prim of
+ Prim.Name.Word_extdToWord a => rules v (a, Vector.sub (args, 0))
+ | _ => ())
+ | _ => ()
+
+fun mapStatement stmt =
+ let
+ val Statement.T { exp, ty, var } = stmt
+ val exp =
+ case Option.map (var, get) of
+ SOME (SOME (prim as (W2, W3, _), x2)) =>
+ if WordSize.equals (W2, W3)
+ then Exp.Var x2
+ else Exp.PrimApp { args = Vector.new1 x2,
+ prim = Prim.wordExtdToWord prim,
+ targs = Vector.new0 () }
+ | _ => exp
+ in
+ Statement.T { exp = exp, ty = ty, var = var }
+ end
+
+fun combine program =
+ let
+ val Program.T { datatypes, functions, globals, main } = program
+ val shrink = shrinkFunction {globals = globals}
+
+ val functions =
+ List.revMap
+ (functions, fn f =>
+ let
+ (* Traverse blocks in dfs order, marking their statements *)
+ fun markBlock (Block.T {statements, ... }) =
+ (Vector.foreach (statements, markStatement); fn () => ())
+ val () = Function.dfs (f, markBlock)
+
+ (* Map the statements using the marks *)
+ val {args, blocks, mayInline, name, raises, returns, start} =
+ Function.dest f
+
+ fun mapBlock block =
+ let
+ val Block.T {args, label, statements, transfer} = block
+ in
+ Block.T {args = args,
+ label = label,
+ statements = Vector.map (statements, mapStatement),
+ transfer = transfer}
+ end
+
+ val f =
+ Function.new {args = args,
+ blocks = Vector.map (blocks, mapBlock),
+ mayInline = mayInline,
+ name = name,
+ raises = raises,
+ returns = returns,
+ start = start}
+
+ val f = shrink f
+ in
+ f
+ end)
+
+ val () = Vector.foreach (globals, Statement.clear)
+ in
+ Program.T { datatypes = datatypes,
+ functions = functions,
+ globals = globals,
+ main = main }
+ end
+
+end
Added: mlton/trunk/mlton/ssa/combine-conversions.sig
===================================================================
--- mlton/trunk/mlton/ssa/combine-conversions.sig 2009-07-07 21:46:07 UTC (rev 7211)
+++ mlton/trunk/mlton/ssa/combine-conversions.sig 2009-07-10 16:01:09 UTC (rev 7212)
@@ -0,0 +1,21 @@
+(* Copyright (C) 2009 Wesley W. Tersptra.
+ * Copyright (C) 1999-2008 Henry Cejtin, Matthew Fluet, Suresh
+ * Jagannathan, and Stephen Weeks.
+ * Copyright (C) 1997-2000 NEC Research Institute.
+ *
+ * MLton is released under a BSD-style license.
+ * See the file MLton-LICENSE for details.
+ *)
+
+
+signature COMBINE_CONVERSIONS_STRUCTS =
+ sig
+ include SHRINK
+ end
+
+signature COMBINE_CONVERSIONS =
+ sig
+ include COMBINE_CONVERSIONS_STRUCTS
+
+ val combine: Program.t -> Program.t
+ end
Modified: mlton/trunk/mlton/ssa/simplify.fun
===================================================================
--- mlton/trunk/mlton/ssa/simplify.fun 2009-07-07 21:46:07 UTC (rev 7211)
+++ mlton/trunk/mlton/ssa/simplify.fun 2009-07-10 16:01:09 UTC (rev 7212)
@@ -15,6 +15,7 @@
structure CommonArg = CommonArg (S)
structure CommonBlock = CommonBlock (S)
structure CommonSubexp = CommonSubexp (S)
+structure CombineConversions = CombineConversions (S)
structure ConstantPropagation = ConstantPropagation (S)
structure Contify = Contify (S)
structure Flatten = Flatten (S)
@@ -77,6 +78,7 @@
{name = "localRef", doit = LocalRef.eliminate} ::
{name = "flatten", doit = Flatten.flatten} ::
{name = "localFlatten3", doit = LocalFlatten.flatten} ::
+ {name = "combineConversions", doit = CombineConversions.combine} ::
{name = "commonArg", doit = CommonArg.eliminate} ::
{name = "commonSubexp", doit = CommonSubexp.eliminate} ::
{name = "commonBlock", doit = CommonBlock.eliminate} ::
@@ -183,6 +185,7 @@
val passGens =
inlinePassGen ::
(List.map([("addProfile", Profile.addProfile),
+ ("combineConversions", CombineConversions.combine),
("commonArg", CommonArg.eliminate),
("commonBlock", CommonBlock.eliminate),
("commonSubexp", CommonSubexp.eliminate),
Modified: mlton/trunk/mlton/ssa/sources.cm
===================================================================
--- mlton/trunk/mlton/ssa/sources.cm 2009-07-07 21:46:07 UTC (rev 7211)
+++ mlton/trunk/mlton/ssa/sources.cm 2009-07-10 16:01:09 UTC (rev 7212)
@@ -67,6 +67,8 @@
global.fun
multi.sig
multi.fun
+combine-conversions.sig
+combine-conversions.fun
constant-propagation.sig
constant-propagation.fun
contify.sig
Modified: mlton/trunk/mlton/ssa/sources.mlb
===================================================================
--- mlton/trunk/mlton/ssa/sources.mlb 2009-07-07 21:46:07 UTC (rev 7211)
+++ mlton/trunk/mlton/ssa/sources.mlb 2009-07-10 16:01:09 UTC (rev 7212)
@@ -54,6 +54,8 @@
global.fun
multi.sig
multi.fun
+ combine-conversions.sig
+ combine-conversions.fun
constant-propagation.sig
constant-propagation.fun
contify.sig