cps & contification
Stephen Weeks
MLton@sourcelight.com
Wed, 17 Jan 2001 11:08:36 -0800 (PST)
> Changing C to the following restores safety:
>
> C(A)(g) =
> if g = fm
> then Unknown
> else if exists j such that forall (f, g, j') in N. j = j'
> and forall (f, g) in T. A(f) = j
> then j
> else Unknown
Agreed. Thanks for the bug fixes, and the proof, which (correct me if I'm
wrong) showed that any fixed point of C is safe (i.e. it did not depend on least
fixed point).
> > Now define a second analysis, A, based on A'.
...
> The problem seems to be that we really wanted an edge (Root, f). I don't
> quite understand the whole definition of Edge
...
> It seems to me that we want edges (Root, f) for everything that we know
> must be Unknown. Would the following definition make sense:
>
> Edge = {(f,g) | (f,g) in T and A'(g) = Unknown}
> U {(Root, f) | A'(f) = Unknown and exists (g, f, j) in N}
> U {(Root, fm)}
>
> i.e., roll rule 2 into the definition of G.
I agree with your points. The intention was certainly that the graph be
connected with all nodes reachable from root. The edges you add make sense.
The reason for the edges "{(Root, f) | A'(f) <> Unknown}" is to make the graph
connected when A' has already figured out stuff. For example, consider the
program
( fm,
{(fm, f, K1), (fm, g, K2)},
{(f, h), (g, h)})
For this program,
A'(fm) = Unknown
A'(f) = K1
A'(g) = K2
A'(h) = Unknown
So we want the set of edges to be
{(f, h), (g, h)}
U {(Root, f), (Root, g)}
U {}
U {(Root, fm)}
So the immediate dominator of h is root, and we conclude A(h) = Unknown.
Here is my new proposal for the analysis, including your bugfixes and proof. I
also cleaned up the definition of A and proof of its safety.
--------------------------------------------------------------------------------
Abstractly, a program consists of a main function and a set of tail and nontail
calls. A tail call consists of a caller and a callee. A nontail call consists
of a caller, callee, and jump.
j in Jump
f in Fun
TailCall = Fun x Fun
NonTailCall = Fun x Fun x Jump
p in Program = Fun x P(TailCall) x P (NonTailCall)
An analysis is a map from functions to the continuation that they return to.
RCont = {Uncalled, Unknown} U Jump U Fun
A in Analysis = Fun -> RCont
Here is the intended meaning of A.
A(f) meaning
-------- ------------------------------------------------
Uncalled f is never called during evaluation of the program
j in Jump f always returns to continuation j
g in Fun f always returns to function g
Unknown f returns to many places
An analysis A is safe for program p = (fm, T, N) if
* A(fm) = Unknown
** For all nontail calls (f, g, j) in N,
A(f) = Uncalled
or A(g) in {j, Unknown}
*** For all tail calls (f, g) in T,
A(f) = Uncalled
or A(g) in {f, A(f), Unknown}
Define the function C: Analysis -> Analysis by
C(A)(g) =
if g = fm
then Unknown
else if exists j such that forall (f, g, j') in N. j = j'
and forall (f, g) in T. A(f) = j
then j
else Unknown
Claim: if A = C(A) then A is safe.
Proof:
* A(fm) = C(A)(fm)
= if fm = fm
then Unknown
else ...
= Unknown
** Let (F, G, J) be an arbitrary element of N
Suppose A(F) = Uncalled.
Then done.
Suppose A(F) <> Uncalled.
Then A(G) = C(A)(G)
= if G = fm
then Unknown
else if exists j such that forall (f, g, j') in N. j = j'
and forall (f, g) in T. A(f) = j
then j
else Unknown
in {J, Unknown}
subset {J, Unknown}
*** Let (F, G) be an arbitrary element of T
Suppose A(F) = Uncalled.
Then done.
Suppose A(F) <> Uncalled.
Then A(G) = C(A)(G)
= if G = fm
then Unknown
else if exists j such that forall (f, g, j') in N. j = j'
and forall (f, g) in T. A(f) = j
then j
else Unknown
in {A(F), Unknown}
subset {F, A(F), Unknown}
Turn RCont into a lattice with the following order
Uncalled <= x
x <= x
x <= Unknown
where x is an arbitrary element of Fun U Jump.
Extend <= pointwise to make A into a lattice.
Fact: C is monotone.
Let A' = lfp(C).
Now define a directed graph G = (Node, Edge) where
Node = Fun U {Root}
Edge = {(f, g) | (f, g) in T and A'(g) = Unknown}
U {(Root, f) | A'(f) <> Unknown}
U {(Root, f) | A'(f) = Unknown and exists (g, f, j) in N}
U {(Root, fm)}
Claim: For all f in Fun, there is a path in G from Root to f.
Let D be the dominator tree of G rooted at Root.
For f in Fun, let parent(f) be the parent of f in D.
Define an analysis, A, based on A' and D as follows:
A(f) =
if A'(f) <> Unknown
then A'(f)
else if parent(f) = Root
then Unknown
else the ancestor g of f in D such that parent(g) = Root
Claim: A is safe.
Proof:
First note that since A' is a fixed-point of C, A' is safe. Now, take each
clause of safety in turn.
* A(fm) = if A'(fm) <> Unknown
then A'(fm)
else if parent(fm) = Root
then Unknown
else the ancestor g of f in D such that parent(g) = Root
= if parent(fm) = Root
then Unknown
else the ancestor g of f in D such that parent(g) = Root
= Unknown
** Let (f, g, j) be an arbitrary element of N
if A'(g) = Unknown
then A(g) = Unknown because (Root, g) in Edge
hence A(g) in {j, Unknown}
else A(g) = A'(g)
if A'(f) = Uncalled
then A(f) = Uncalled
else A'(g) = j (because A' is safe)
hence A(g) = A'(g)
hence A(g) in {j, Unknown}
*** Let (f, g) be an arbitrary element of T
if A'(g) = Unknown
then (f, g) in Edge
if parent(g) = Root
then A(g) = Unknown in {f, A(f), Unknown}
else A(g) = h, the ancestor of g in D with
parent(h) = Root. Since (f, g) in Edge,
h dominates f.
if h = f
then A(g) = h = f in {f, A(f), Unknown}
else A'(f) = Unknown (since h dom f)
hence, A(f) = h
hence A(g) = h = A(f)
in {f, A(f), Unknown}
else A'(g) <> Unknown
hence A'(f) <> Unknown
and A'(f) = A'(g) (because A' is safe)
hence A(g) = A'(g) = A'(f) = A(f)
hence A(g) in {f, A(f), Unknown}