cps & contification
Matthew Fluet
fluet@CS.Cornell.EDU
Thu, 18 Jan 2001 11:31:17 -0500 (EST)
Don't worry, I didn't find any new bugs. Here's an updated description of
the whole analysis. I updated the proof that any fixed point of C is safe
with the new definition of C. I also added a proof of the claim that
A'(f) = Uncalled iff there does not exist a path from fm to f. Finally, I
ran through all of the problematic contification examples from the past
emails, and they all work out the way we would like.
*****************************************************************************
*****************************************************************************
Abstractly, a program consists of a main function and a set of tail and nontail
calls. A tail call consists of a caller and a callee. A nontail call consists
of a caller, callee, and jump.
j in Jump
f in Fun
TailCall = Fun x Fun
NonTailCall = Fun x Fun x Jump
p in Program = Fun x P(TailCall) x P (NonTailCall)
An analysis is a map from functions to the continuation that they return to.
RCont = {Uncalled, Unknown} U Jump U Fun
A in Analysis = Fun -> RCont
Here is the intended meaning of A.
A(f) meaning
-------- ------------------------------------------------
Uncalled f is never called during evaluation of the program
j in Jump f always returns to continuation j
g in Fun f always returns to function g
Unknown f returns to many places
An analysis A is safe for program p = (fm, T, N) if
* A(fm) = Unknown
** For all nontail calls (f, g, j) in N,
A(f) = Uncalled
or A(g) in {j, Unknown}
*** For all tail calls (f, g) in T,
A(f) = Uncalled
or A(g) in {f, A(f), Unknown}
Turn RCont into a lattice with the following order
Uncalled <= x
x <= x
x <= Unknown
where x is an arbitrary element of Fun U Jump.
Extend <= pointwise to make A into a lattice.
Define the function C: Analysis -> Analysis by
C(A)(g) =
if g = fm
then Unknown
else lub ({j | (f, g, j) in N and A(f) <> Uncalled}
U {A(f) | (f, g) in T})
Claim: C is monotone and continuous.
Proof: Clear.
Claim: if A = C(A) then A is safe.
Proof:
* A(fm) = C(A)(fm)
= if fm = fm
then Unknown
else ...
= Unknown
** Let (F, G, J) be an arbitrary element of N
Suppose A(F) = Uncalled.
Then done.
Suppose A(F) <> Uncalled.
Then A(G) = C(A)(G)
= if G = fm
then Unknown
else lub ({j | (f, g, j) in N and A(f) <> Uncalled}
U {A(f) | (f, g) in T})
= if G = fm
then Unknown
else J \/ lub ({j | (f, g, j) in N and A(f) <> Uncalled}
U {A(f) | (f, g) in T})
in {J, Unknown}
subset {J, Unknown}
*** Let (F, G) be an arbitrary element of T
Suppose A(F) = Uncalled.
Then done.
Suppose A(F) <> Uncalled.
Then A(G) = C(A)(G)
= if G = fm
then Unknown
else lub ({j | (f, g, j) in N and A(f) <> Uncalled}
U {A(f) | (f, g) in T})
= if G = fm
then Unknown
else A(F) \/ lub ({j | (f, g, j) in N and A(f) <> Uncalled}
U {A(f) | (f, g) in T})
in {A(F), Unknown}
subset {F, A(F), Unknown}
Corollary: if A = C(A) then A satisfies:
***' for all tail calls (f, g) in T,
A(f) = Uncalled
or A(g) in {A(f), Unknown}
Proof: By examination of case *** of previous proof.
Let A' = lfp(C).
Fact: A' is safe (because A' = C(A')).
Claim: A'(f) = Uncalled iff
there does not exist a path of calls from fm to f.
Proof:
==> (prove contrapositive: If there exists a path of calls from fm to f,
then A'(f) <> Uncalled.)
Suppose there exists a path of calls from fm to f.
Proceed by induction on the length of the path.
n = 0: Then f = fm.
A'(f) = Unknown (because A' is safe).
Hence A'(f) <> Uncalled.
n > 0: Then there exists g
such that there exists a path of n-1 calls from fm to g
and (g, f) in T or (g, f, j) in N.
By induction hypothesis, A'(g) <> Uncalled.
Suppose (g, f) in T.
A'(f) in {g, A'(g), Unknown} because A' is safe.
Hence A'(f) <> Uncalled.
Suppose (g, f, j) in N.
A'(f) in {j, Unknown} because A' is safe.
Hence A'(f) <> Uncalled.
<==
Suppose there does not exist a path of calls from fm to f.
Let Caller*(f) be the transitive closure of the set of callers of f.
Then forall g in Caller*(f).
there does not exist a path of calls from fm to g
and
forall (h, g, j) in N. h in Caller*(f)
and
forall (h, g) in T. h in Caller*(f).
Let B(g) = if g in Caller*(f) then Uncalled else Unknown.
Suppose g in Caller*(f).
Then C(B)(g) = if g = fm
then Unknown
else lub ({j | (h, g, j) in N and B(h) = Uncalled}
U {A(h) | (h, g) in T})
= lub ({j | (h, g, j) in N and B(h) = Uncalled}
U {A(h) | (h, g) in T})
(because there does not exist a path of calls from fm to g)
= Uncalled
(because forall (h, g, j) in N. h in Caller*(f)
and B(h) = Uncalled
and forall (h, g) in T. h in Caller*(f)
and B(h) = Uncalled)
= B(g)
(because g in Caller*(f)).
Suppose g notin Caller*(f).
Then C(B)(g) <= Unknown
(upper bound)
= B(g)
(because g notin Caller*(f)).
Hence, C(B) <= B.
Therfore, lfb(C) <= B (by Fixed-Point Theorem).
Hence, forall g in Caller*(f). A'(f) = lfb(C)(g) = Uncalled.
Hence, A'(f) = Uncalled.
Now define a directed graph G = (Node, Edge) where
Node = Fun U {Root}
Edge = {(f, g) | (f, g) in T and A'(g) = Unknown}
U {(Root, f) | A'(f) <> Unknown}
U {(Root, f) | A'(f) = Unknown and exists (g, f, j) in N}
U {(Root, fm)}
Claim: For all f in Fun, there is a path from Root to f in G.
Proof:
If A'(f) = <> Unknown, then (Root, f) in Edge (by second clause).
If A'(f) = Unknown,
then there is a path of calls from fm to f
(because A' = lfp(C) and A'(f) <> Uncalled).
If the last call on this path is a nontail call (g, f, j),
then (Root, f) in Edge (by third clause).
Otherwise, consider the longest suffix of this path that consists
of all tail calls: (f0, f1) (f1, f2) ... (fn-1, f)
If forall i, A'(fi) = Unknown,
then there is a path from f0 to f in G (by first clause).
If f0 = fm,
then (Root, f0) in Edge (by fourth clause)
and there is a path from Root to f in G.
If f0 <> fm,
then there exists (g, f0, j) in N
and (Root, f0) in Edge (by third clause)
and there is a path from Root to f in G.
If exists k, A'(fk) <> Unknown and forall i > k, A'(fi) = Unknown
then there is a path from fk+1 to f in G (by first clause)
and (Root, fk) in Edge (by second clause)
and there is a path from Root to f in G.
Let D be the dominator tree of G rooted at Root.
For f in Fun, let parent(f) be the parent of f in D.
Define an analysis, A, based on A' and D as follows:
A(f) =
if A'(f) <> Unknown
then A'(f)
else if parent(f) = Root
then Unknown
else the ancestor g of f in D such that parent(g) = Root
Claim: A is safe.
Proof:
* A(fm) = if A'(fm) <> Unknown
then A'(fm)
else if parent(fm) = Root
then Unknown
else the ancestor g of f in D such that parent(g) = Root
= if parent(fm) = Root
then Unknown
else the ancestor g of f in D such that parent(g) = Root
= Unknown
** Let (f, g, j) be an arbitrary element of N
if A'(g) = Unknown
then A(g) = Unknown because (Root, g) in Edge (by third clause)
hence A(g) in {j, Unknown}
else A(g) = A'(g)
if A'(f) = Uncalled
then A(f) = Uncalled
else A'(g) = j (because A' is safe and A'(f) <> Uncalled)
hence A(g) = A'(g)
hence A(g) in {j, Unknown}
*** Let (f, g) be an arbitrary element of T
if A'(g) = Unknown
then (f, g) in Edge (by first clause)
if parent(g) = Root
then A(g) = Unknown
hence A(g) in {f, A(f), Unknown}
else A(g) = h, the ancestor of g in D with parent(h) = Root.
if h = f
then A(g) = h = f
hence A(g) in {f, A(f), Unknown}
else A'(f) = Unknown (because h dominates f)
hence A(f) = h
hence A(g) = h = A(f)
hence A(g) in {f, A(f), Unknown}
else A'(g) <> Unknown
hence A'(g) in {A'(f)} (by corollary)
hence A(g) = A'(g) = A'(f) = A(f)
hence A(g) in {f, A(f), Unknown}
*****************************************************************************
*****************************************************************************
fun f () = let ... in loop () end
fun loop () = let ... in loop () end
fun g () = ... K1 (f ()) ... K2 (f ()) ...
(g,
{(g, f, K1), (g, f, K2)},
{(f, loop), (loop, loop)})
A'(f) = Unknown
A'(loop) = Unknown
A'(g) = Unknown
Edge =
{(f, loop), (loop, loop)}
U {}
U {(Root, f)}
U {(Root, g)}
A(f) = Unknown
A(loop) = f
A(g) = Unknown
*****************************************************************************
fun f () = ... g () ... g () ...
fun g () = ... f () ...
fun h () = ... K (f ()) ...
(h,
{(h, f, K)},
{(f, g), (g, f)})
A'(f) = K
A'(g) = K
A'(h) = Unknown
Edge =
{} U
{(Root, f), (Root, g)} U
{}
{(Root, h)}
A(f) = K
A(g) = K
A(h) = Unknown
*****************************************************************************
(Note: this is essentially the concat_1 example that is always missed
by the present call-based and continuation-based analyses.)
fun f () = ... g () ... g () ...
fun g () = ... f () ....
fun h () = ... K1 (f ()) ... K2 (f ()) ...
(h,
{(h, K1, f), (h, K2, f)},
{(f, g), (g, f)})
A'(f) = Unknown
A'(g) = Unknown
A'(h) = Unknown
Edge =
{(f, g), (g, f)}
U {}
U {(Root, f)}
U {(Root, h)}
A(f) = Unknown
A(g) = f
A(h) = Unknown
*****************************************************************************
fun f () = ... h () ... g () ...
fun g () = ... h () ... f () ...
fun h () = ...
fun i () = ... K (f ()) ...
(i,
{(i, f, K)},
{(f, h), (f, g), (g, h), (g, f)})
A'(f) = K
A'(g) = K
A'(h) = K
A'(i) = Unknown
Edge =
{}
U {(Root, f), (Root, g), (Root, h)}
U {}
U {(Root, i)}
A(f) = K
A(g) = K
A(h) = K
A(i) = Unknown
*****************************************************************************
fun h () = ... K (i ()) ... L (f ()) ...
fun f () = ... g () ...
fun g () = ...
fun i () = ... f () ...
(h,
{(h, i, K), (h, f, L)},
{(f, g), (i, f)})
A'(h) = Unknown
A'(f) = Unknown
A'(g) = Unknown
A'(i) = K
Edge =
{(f, g), (i, f)}
U {(Root, i)}
U {(Root, f)}
U {(Root, h)}
A(h) = Unknown
A(f) = Unknown
A(g) = f
A(i) = K
*****************************************************************************