cps & contification
Stephen Weeks
MLton@sourcelight.com
Tue, 16 Jan 2001 17:12:03 -0800 (PST)
> I've almost got a dominator approach worked out. I should be able to send mail
> with it later today.
Here's my next attempt. It includes a summary of the previous mail, a
definition of the continuation based analysis, and a definition of a dominator
analysis that uses the results of the continuation based analysis. I believe
that it labels as many functions as possible as contifiable.
Abstractly, a program consists of a main function and a set of tail and nontail
calls. A tail call consists of a caller and a callee. A nontail call consists
of a caller, callee, and jump.
j in Jump
f in Fun
TailCall = Fun x Fun
NonTailCall = Fun x Fun x Jump
p in Program = Fun x P(TailCall) x P (NonTailCall)
An analysis is a map from functions to the continuation that they return to.
RCont = {Uncalled, Unknown} U Jump U Fun
A in Analysis = Fun -> RCont
Here is the intended meaning of A.
A(f) meaning
-------- ------------------------------------------------
Uncalled f is never called during evaluation of the program
j in Jump f always returns to continuation j
g in Fun f always returns to function g
Unknown f returns to many places
An analysis A is safe for program p = (fm, T, N) if
* A(fm) = Unknown
** For all nontail calls (f, g, j) in N,
A(f) = Uncalled
or A(g) in {j, Unknown}
*** For all tail calls (f, g) in T,
A(f) = Uncalled
or A(g) in {f, A(f), Unknown}
Turn RCont into a lattice with the following order
Uncalled <= x
x <= x
x <= Unknown
where x is an arbitrary element of Fun U Jump.
Extend <= pointwise to make A into a lattice.
Define the function C: Analysis -> Analysis by
C(A)(g) =
if g = fm
then fm
else if exists j such that forall (f, g, j') in N. j = j'
and forall (f, g) in T. A(f) = j
then j
else g
Fact: C is monotone.
Let A' = lfp(C). A' is the continuation based analysis.
Fact: A' is safe.
Now define a second analysis, A, based on A'.
1. if A'(f) <> Unknown
then let A(f) = A'(f)
2. if A'(f) = Unknown and there exists a (g, f, j) in N
then let A(f) = Unknown
3. The only functions for which we have not defined A are only called in tail
position. Define a directed graph G = (Node, Edge) where
Node = Fun U {Root}
Edge = {(f, g) | (f, g) in T and A'(g) = Unknown}
U {(Root, f) | A'(f) <> Unknown}
U {(Root, fm)}
Consider the dominator tree of G rooted at Root.
For any f in Fun for which we have not yet defined A(f), define A(f) to be
Unknown if the immediate dominator of f is Root. Otherwise, define A(f) to be
the ancestor of f whose immediate dominator is Root.
Claim: A is safe.
Proof:
* A'(fm) = Unknown and fm's immediate dominator is Root,
hence A(fm) = Unknown.
** Let (f, g, j) be an arbitrary element of N
if A'(g) = Unknown
then A(g) = Unknown (by rule 2 in the def of A)
hence A(g) in {j, Unknown}
else A(g) = A'(g)
and if A'(f) = Uncalled
then A(f) = Uncalled
else A'(g) = j (because A' is safe)
hence A(g) = A'(g)
hence A(g) in {j, Unknown}
*** Let (f, g) be an arbitrary element of T
if A'(g) = Unknown
then (f, g) in Edge
if A(g) = Unknown
then A(g) in {f, A(f), Unknown}
else A(g) = h, the ancestor of g whose immediate
dominator is Root. Since (f, g) in Edge, either
h = f or (h <> f and h dominates f).
if h = f
then A(g) in {f, A(f), Unknown}
else h <> f and h dominates f
hence A'(f) = Unknown
hence A(f) = h
hence A(g) in {f, A(f), Unknown}
else A'(g) <> Unknown
hence A'(f) <> Unknown
and A'(f) = A'(g) (because A' is safe)
hence A(g) = A'(g) = A'(f) = A(f)
hence A(g) in {f, A(f), Unknown}