SSA simplify passes

Fri, 4 Jan 2002 17:51:12 -0800

> md5 -- if you look at the .ssa of -loop-passes 1, you'll see three
> functions all of the form:
> fun x_148 (x_154) = L_88 ()
>   L_88 ()
>     ()
> which are used in about 45 non-tail calls.  Before the last removeUnused
> call, these functions do somthing, but their results aren't used, so they
> are trimmed down to:
...
> which the shinker reduces to the above (this explains why removeUnused
> didn't eliminate the function argument, although it is clearly unused in
> the final function).
> 
> I think that just eliminating those extraneous non-tail calls sped it up
> quite a bid.

Cool.  Maybe what we need is some count of how many opts the last
shrink did, and if there werw a lot done, we re-run the simplifier.

> matrix-multiply: As best I can make out, something (I don't know what
> pass) establishes the fact that the source matrix is initialized, but
> never updated -- so it replaces all the array subs (and associated
> computation of subscripts) with the constant 1.0.

Constant propagation figures this out.  The reason it didn't with
-loop-passes 1 is that inlining happens after constant propagation, so
the Array2.array constructor saw many different arrays.  After
inlining, each array construction is a different call to Array_array,
it is easy for constant propagation to see that there is an array of
constants.  Because of that, I don't think that's a good benchmark, so
I changed it so that the array is nonconstant.

> nucleic: Looking at the datatypes of the -loop-passes 2 version, my guess
> is too much flattening of tuples.  In particular, something like this
> looks bad:
...

Yuck.  Maybe arg flattening should only happen if a large fraction of
the selects are inevitable.

> btw, how does MLton map a record to a tuple?.

The translation to XML (type-inference/infer.fun) puts record fields
in alphabetical order (of course :-) when translating to a tuple.

> Here are a couple of other random "cleanup"'s I'm trying to track down:
...
> Pulling apart x_1991 just to put it back together again in env_30 seems
> wasteful.  This happens all over the place in a self-compile.

Yeah, eliminating reconstitution of tuples has been on my list for a
while.  Go for it.

> Another random thing I noticed: there seems to be a strange
> interplay between the XML simplifier (particularly inlining) and
> polyvariance. If you look at the SSA of logic.sml you'll see 5
> copies of a function called oc_?, all of which are alpha-equivalent.
> ...

You're right.  The XML simplifier only inlines functions that are
called once.  So, occurs_check is inlined inside of unify_REF.  Hence
oc and ocs are inside unify_REF by the time we get to polyvariance.
unify_REF is "higher-order", and so is potentially duplicated by
polyvariance.  It turns out there are 5 uses, and the size threshold
is met, so there are 5 copies of oc and ocs.

The problem is that polyvariance duplicates functions local to a
higher-order function when the function is duplicated, even if they
don't depend on the higher-order-ness.  The solution as I see it is to
some kind of lambda-lifting/closure conversion to move these local
functions out so that only the code that depends on the
higher-order-ness is duplicated.  There's a whole lot of unexplored
territory in MLton in lifting functions and improving closure
representations.  I don't see any easy fixes.  Just lots of
(interesting) work.