[MLton-devel] Re: CPS vs SSA
Lal George
lg@network-speed.com
Tue, 14 Jan 2003 11:15:35 -0500
I mostly agree with the comparison made by Stephen; however, despite
everything he said, I am going to structure the new Nova compiler
using CPS once again! (Even though the final IL turns out to be
first-order.)
At one extreme, suppose we had a language where programs were
call/cc-centric. Then clearly CPS would be a good choice: call/cc
would be compiled to primitive function calls that would be easily
optimized together with other constructs.
It turns out that compiled Nova programs look like spaghetti code, and
exceptions are a nice way of capturing much of the control flow in a
high-level language. We find that programs tend to be
'exception-centric'. The domain (network processors) places a premium
on execution speed, and so the exception handling must be compiled
away as efficiently as possible.
CPS handles this situation cleanly --- the exception handling is
expressed as simple function calls, and the handlers are nicely
bundled up in continuations. After a massive inlining phase, we get
the first-order program we were after. At this point one could
imagine converting to a CFG-based SSA form, but there would be little
point in doing so that late in the game.
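To sketch the idea in plain SML (a toy example, not Nova's IL): the
handler becomes an ordinary continuation argument, and raising becomes
a call to that continuation.

    exception NotFound

    (* Direct style: control flow via an exception. *)
    fun find (x, ys) =
      case List.find (fn y => y = x) ys of
         SOME y => y
       | NONE => raise NotFound

    fun findOrZero (x, ys) = find (x, ys) handle NotFound => 0

    (* CPS style: return is the normal continuation and handler is the
       exception continuation.  Both are ordinary functions, so a
       massive inlining pass can compile the exception away. *)
    fun find' (x, ys, return, handler) =
      case List.find (fn y => y = x) ys of
         SOME y => return y
       | NONE => handler ()

    fun findOrZero' (x, ys) = find' (x, ys, fn y => y, fn () => 0)
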
Does this make sense? Is there a better way to take an absyn to a
first-order IL where exception handling is not implemented using
complex prim-ops?
Lal
------------------------------------------------------------------------
Stephen Weeks wrote:
> Lal George is currently developing Nova, a compiler for the Intel IXP.
> Nova is written in SML and uses MLRISC. I mentioned to Lal that MLton
> had switched from using a CPS IL to using an SSA IL. He had the
> following response.
>
>
>>Re: SSA -- very interesting!
>>Matthias and I had a lengthy discussion on
>>this, and came to the well-known conclusion that
>>both forms (CPS and SSA) were equally convenient
>>from an optimization point of view ... but the
>>compactness of CPS was just so much easier to process
>>and manipulate symbolically. It seemed much easier
>>to splice trees and move variables around in a tree-based
>>CPS form than it was in the traditional CFG-based SSA form.
>>Since you've gone both ways, what do you think?
>
>
> In short, I see no advantages to using CPS instead of SSA, and I see
> several disadvantages. Note, this is in the context of a first-order
> IL, which is the case in MLton, and I assume in Nova. It's also in
> the context of MLton's SSA IL, which uses a more functional notation
> instead of traditional phi-functions.
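>
> (To illustrate the notational difference: where textbook SSA writes a
> phi-function at a join point,
>
>     L3: x3 <- phi (x1, x2)
>
> MLton's IL gives the join block a parameter, block L3 (x3), and each
> predecessor passes its value at the jump: L3 (x1) from one side, L3
> (x2) from the other.  The two notations are equivalent; the grammar
> at the end of this message uses the parameter style.)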
>
> What is the difference between CPS and SSA? In both, a function
> consists of a collection of basic blocks (called continuations in
> CPS). The difference is that in CPS, there is a syntax tree in which
> the blocks are nested, while in SSA, the blocks are in a flat vector
> (and describe a control-flow graph, or CFG). So, what does the syntax
> tree in CPS do?
>
> 1. All of the blocks appear exactly once in the tree. So a recursive
> walk over the tree visits all blocks and variables once.
>
> 2. It satisfies the lexical scoping condition for variables: a
> preorder walk over the tree is guaranteed to visit variable
> definitions before uses.
>
> 3. It satisfies the lexical scoping condition for blocks: a preorder
> walk over the tree visits block definitions before uses.
>
> 4. It provides an approximation to the dominator relation. If block A
> is nested within block B, then B dominates A. It is only an
> approximation, however. It may be that B dominates A but A is not
> nested within B.
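>
> For example, in the block notation of the grammar at the end of this
> message (a made-up fragment, not compiler output), B dominates A ---
> the only path to A goes through B --- yet A is a sibling of B, not
> nested within it:
>
>     fun f () =
>       let
>         block A () = let val z = 0 in z end
>         block B () = let in A () end
>       in
>         B ()
>       end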
>
> So, transformations on CPS can use recursive tree walks to accomplish
> their tasks. Let's see how we accomplish the same things in SSA.
>
> 1. All of the blocks appear exactly once in the vector. So, a loop
> over the vector visits all blocks and variables once.
>
> 2. The variables satisfy the dominator condition: a definition of a
> variable dominates all of its uses. So, a preorder DFS over the CFG
> visits variable definitions before uses.
>
> 3. A postorder DFS over the CFG visits block definitions before uses.
>
> 4. One can compute the dominator tree on the CFG in almost linear
> time.
>
> So, transformations on SSA use either a loop over all the blocks, a
> DFS of the CFG, or a tree walk of the dominator tree, depending on
> what's appropriate.
>
> Since DFS and dominator tree computation are pretty easy to write, and
> can be in libraries, I consider CPS and SSA of equivalent complexity
> from the perspective of an input to some transformation. The same
> information is easily accessible in either.
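>
> As a sketch of how little code this is, here is a preorder DFS in SML
> over a CFG stored as a vector of blocks (made-up types, not MLton's):
>
>     type label = int
>
>     (* Each block records the labels of its successors. *)
>     datatype block = Block of {succs: label list}
>
>     (* Preorder DFS from the entry: every block is visited before its
>        unvisited successors, so under the SSA dominance condition,
>        variable definitions are seen before their uses. *)
>     fun dfs (blocks: block vector, entry: label,
>              visit: label * block -> unit): unit =
>       let
>         val seen = Array.array (Vector.length blocks, false)
>         fun go l =
>           if Array.sub (seen, l)
>             then ()
>           else let
>                  val b as Block {succs} = Vector.sub (blocks, l)
>                in
>                  Array.update (seen, l, true);
>                  visit (l, b);
>                  List.app go succs
>                end
>       in
>         go entry
>       end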
>
> However, in terms of a transformation producing output, I find that
> producing CPS is more complex than producing SSA. This is because
> CPS imposes additional constraints, which must be *explicitly*
> expressed in the IL, that SSA does not impose: namely, the syntax
> tree must provide a good enough approximation to the dominator tree
> that a simple tree walk can prove that variable definitions dominate
> uses and that block uses nest within block definitions.
>
> Thinking of it another way, both CPS and SSA require that variable
> definitions dominate uses. The difference is that using CPS as an IL
> requires that all transformations provide a proof of dominance in the
> form of the nesting, while SSA doesn't. Now, if a CPS transformation
> doesn't do too much rewriting, then the partial dominance information
> that it had from the input tree is sufficient for the output tree.
> Hence tree splicing works fine. However, sometimes it is not
> sufficient.
>
> As a concrete example, consider common-subexpression elimination.
> Suppose we have a common subexpression x = e that dominates an
> expression y = e in a function. In CPS, if y = e happens to be within
> the scope of x = e, then we are fine and can rewrite it to y = x. If,
> however, y = e is not within the scope of x, then either we have to do
> massive tree rewriting (essentially making the syntax tree closer to
> the dominator tree) or skip the optimization. Another way out is to
> simply use the syntax tree as an approximation to the dominator tree
> for common-subexpression elimination, but then you miss some
> optimization opportunities. On the other hand, with SSA, you simply
> compute the dominator tree, and can always replace y = e with y = x,
> without having to worry about providing a proof in the output that x
> dominates y (i.e., without putting y in the scope of x).
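>
> In the block notation of the grammar below (a made-up fragment), L1
> dominates L2 --- the only way to reach L2 is through L1 --- so x = e
> dominates y = e, yet y is not within the scope of x:
>
>     fun f (a) =
>       let
>         block L2 () = let val y = + (a, a) in y end
>         block L1 () = let val x = + (a, a) in L2 () end
>       in
>         L1 ()
>       end
>
> Rewriting y = x here requires re-nesting L2 inside the scope of x;
> with SSA, the same rewrite is a local edit justified by the dominator
> tree.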
>
> As another example, when we used to use CPS in MLton, we ran into
> problems with our contification optimization in trying to satisfy the
> CPS condition that block uses occur within the scope of their
> definitions. We were forced to do some pretty messy tree rewriting,
> and still had to disable some cases. When we switched to SSA, we
> didn't have to worry about scoping -- we simply put all the blocks
> together and left it to later passes to do a DFS or compute dominators
> if necessary.
>
> From the perspective of typed ILs, I view CPS as pushing some of the
> work of the IL type checker into each pass, by requiring each
> transformation to produce the scoping information needed to make it
> easy for the type checker to verify that variable definitions
> dominate uses. On the other hand, with SSA, the transformations
> don't have to do any extra work, while the type checker has to do
> more, in that it has to do a dominator computation. So, with CPS, we
> complicate many transformations to make the type checker simple,
> while with SSA, we complicate one type checker to make many passes
> simpler.
>
> Oh well, that was a long explanation. I hope it made some sense. We
> have used CPS in MLton for over three years, from fall 1998 to fall
> 2001, and have used SSA for over a year now. We are quite happy that
> we switched.
>
> I've added a little more detail below. Feel free to skip it.
>
> To be extremely clear on what I mean by CPS and SSA, here are
> simplified grammars for first-order CPS and SSA ILs similar to what
> was/is in MLton. In the grammars, I'll use the following
> metavariables.
>
> c in Const
> l in Label
> x in Var
> f in Func
> p in Prim
>
> Here is the CPS grammar, with comments on the right.
>
> <program> ::= <fun>*
> <fun> ::= fun <f> (<x>*) = <exp> toplevel function declaration
> <exp> ::= let <dec>* in <transfer> end
> <dec> ::= <bind>
> | <block>
> <block> ::= block <l> (<x>*) = <exp> continuation declaration
> <bind> ::= val <x> = <simple> simple expression
> <simple> ::= <c> constant
> | <x> variable
> | <p> (<x>*) primitive application
> <transfer> ::= <f> (<x>*) tail call
> | <l> (<f> (<x>*)) nontail call
> | <l> (<x>*) jump
> | <x>* return
>
> So, a program is a collection of mutually recursive functions.
> Functions take multiple arguments and return multiple values. The
> body expression of a function is in CPS form (it may be a bit hard to
> see that at first). Continuations (labels) are declared with a
> "block" keyword, so chosen because of the similarity with SSA basic
> blocks. The usual lexical scoping rules are in place -- a variable
> must be bound before it is used, and a continuation must be defined
> before it is jumped to or used in a nontail call.
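>
> As a small illustration (made up, not compiler output), here is a
> program in this grammar: f makes a nontail call to g, and the block k
> receives the result and doubles it.
>
>     fun g (x) = let in x end
>
>     fun f (a) =
>       let
>         val b = + (a, a)
>         block k (r) = let val s = + (r, r) in s end
>       in
>         k (g (b))
>       end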
>
> I've tweaked the CPS grammar so that it is very easy to compare with
> the SSA grammar. All we have to do is to replace
>
> <fun> ::= fun <f> (<x>*) = <exp>
> <exp> ::= let <dec>* in <transfer> end
> <dec> ::= <bind>
> | <block>
>
> with
>
> <fun> ::= fun <f> (<x>*) = let <block>* in <l> () end
> <exp> ::= let <bind>* in <transfer> end
>
> and voila, now we have SSA, with the usual dominance condition.
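>
> For example, the little program from above becomes
>
>     fun g (x) =
>       let
>         block start () = let in x end
>       in
>         start ()
>       end
>
>     fun f (a) =
>       let
>         block entry () = let val b = + (a, a) in k (g (b)) end
>         block k (r) = let val s = + (r, r) in s end
>       in
>         entry ()
>       end
>
> The blocks now live in one flat list; their order encodes no scoping,
> and it is the dominance condition on the CFG that guarantees that
> definitions reach uses.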
>
> Hopefully this makes very precise what I mean when I say that the
> difference between CPS and SSA is the nesting of blocks.