new SSA IL
Stephen Weeks
MLton@sourcelight.com
Tue, 7 Aug 2001 19:21:24 -0700
Matthew and I have been kicking around the idea of replacing the CPS IL with a
new IL that is closer to SSA. Attached is my first attempt at a signature for
such an IL. The main difference between CPS and SSA is that local functions are
not nested and there is no variable scoping. Instead, all local functions are
mutually recursive, and the type system requires that a variable definition
dominate all of its uses.
There are several reasons for the change.
* Many optimizations use control-flow-graph dominators and variable liveness
information. Our current IL has lexical scoping, which provides an
approximation to both of these. That is, a variable binding certainly
dominates all of the expressions in the lexical scope of the binding, but may
dominate more. Also, a variable is certainly only live in its lexical scope,
but it may be live less. It would be better (i.e. enable more optimizations)
if we used the exact notions of dominance and liveness instead of the
approximation provided by lexical scoping. It would also enable
transformations to worry only about satisfying dominance condition instead of
scoping constraints.
* We would like to express more optimizations on handlers. HandlerPush and
HandlerPop are too restrictive. This is orthogonal to the SSA change, but
makes sense to do at the same time.
* It should make syntax trees smaller.
* It should make optimization passes faster, since they won't have to recur down
syntax trees -- instead they can loop over blocks.
The expected migration path is to write a CPS -> SSA translator, rework the
backend to use SSA instead of CPS, and then to incrementally rewrite each CPS
simplification pass in the optimization pipeline from back to front until
finally reaching the closure converter. In the meantime, we'd have code for two
ILs around, and duplicated code for any optimization pass that lived in both
worlds.
After you take a look at the signature below, here are some design questions and
the way I am leaning.
* Should the common Var.t vector be lifted out of Exp.t and Transfer.t?
It would save syntax tree space, but you lose a little bit of compile-time
checking (like Select only taking one arg). I lean very slightly towards
not lifting.
* What should be done about case branches and value carrying constructors?
Should we stick with the current scheme, in which all selects are done
implicitly by the case, or go back to the old scheme, in which the case just
narrows the value and there is a separate ConSelect expression? I lean
towards leaving as is, and revisiting the issue after the conversion is
complete.
* Should we create separate classes of labels, Handler.t and Cont.t?
If we did this, SetHandler would now take a Handler and nontail Call would
take a Cont. This would give the necessary hooks to the backend for
generating the wrapper blocks, but I'm not sure if it helps anywhere else. I
lean against doing this.
* Should raise only take one arg?
I don't think we have ever used the fact that raise can take multiple
arguments. It would shrink the syntax trees slightly and simplify some passes
to remove this generality. Again, I lean towards leaving as is and revisiting
once the conversion is complete.
Please send any thoughts you have or suggestions for the IL.
--------------------------------------------------------------------------------
type int = Int.t
type word = Word.t
signature SSA_TREE_STRUCTS =
sig
include ATOMS
end
signature SSA_TREE =
sig
include SSA_TREE_STRUCTS
structure Type:
sig
include HASH_TYPE
datatype dest =
Char
| Int
| IntInf
| Pointer
| Word
| Word8
| Real
| String
| Thread
| Array of t
| Ref of t
| Datatype of Tycon.t
| Tuple of t vector
| Vector of t
end
sharing Atoms = Type.Atoms
structure Func: HASH_ID
structure Label: HASH_ID
structure PrimInfo:
sig
datatype t =
None
| Overflow of Label.t (* Must be nullary. *)
end
structure Exp:
sig
datatype t =
ConApp of {con: Con.t,
args: Var.t vector}
| Const of Const.t
| PrimApp of {prim: Prim.t,
info: PrimInfo.t,
targs: Type.t vector,
args: Var.t vector}
| RestoreExnStack
| SaveExnStack
| Select of {tuple: Var.t,
offset: int}
| SetHandler of Label.t
| Tuple of Var.t vector
| Var of Var.t
end
structure Statement:
sig
datatype t = T of {var: Var.t option, (* NONE iff ty = unit. *)
ty: Type.t,
exp: Exp.t}
end
structure Cases: CASES sharing type Cases.con = Con.t
structure Transfer:
sig
datatype t =
Bug (* MLton thought control couldn't reach here. *)
| Call of {
func: Func.t,
args: Var.t vector,
cont: Label.t option (* NONE means tail-call *)
}
| Case of {
test: Var.t,
cases: Label.t Cases.t,
default: Label.t option (* Must be nullary. *)
}
| Goto of {
dst: Label.t,
args: Var.t vector
}
| Raise of Var.t vector
| Return of Var.t vector
end
structure Block:
sig
datatype t =
T of {
label: Label.t,
args: (Var.t * Type.t) vector,
statements: Statement.t vector,
transfer: Transfer.t
}
end
structure Function:
sig
datatype t =
T of {
name: Func.t,
args: (Var.t * Type.t) vector,
start: Label.t, (* Must be nullary. *)
blocks: Block.t vector,
returns: Type.t vector
}
end
structure Program:
sig
datatype t =
T of {
datatypes: {
tycon: Tycon.t,
cons: {
con: Con.t,
args: Type.t vector
} vector
} vector,
globals: Statement.t vector,
functions: Function.t vector,
main: Func.t (* Must be nullary. *)
}
end
end