[MLton] cvs commit: world no longer contains a preprocessed basis library

Fri, 12 Dec 2003 12:12:04 -0800

> In order to recover the "tighter" behavior of the old cmcat, we would need
> to generate the graph for the necessary imports, not all the exports.  I
> don't see a very easy way of doing it.  It is easy enough to scan a graph
> to determine what symbols are imported from each imported .cm file.

Could we use the import information to do a DFS and mark only the
needed files?  Then, we could take the (in order, but too large) file
list produced by CM.Graph.graph and filter out unneeded files.
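
To make the idea concrete, here is a rough sketch of the mark-and-filter
step.  It is in Python rather than SML and does not use the real CM.Graph
interface.  The import map, the root set, and the names needed_files and
filter_file_list are all made up for illustration, and it assumes we have
already reduced the symbol-level import information to "file A needs
file B" edges.

```python
from typing import Dict, Iterable, List, Set


def needed_files(imports: Dict[str, Iterable[str]],
                 roots: Iterable[str]) -> Set[str]:
    """Depth-first search from the roots, marking every file reached."""
    marked: Set[str] = set()
    stack = list(roots)
    while stack:
        f = stack.pop()
        if f in marked:
            continue
        marked.add(f)
        # Follow the (hypothetical) file-level import edges.
        stack.extend(imports.get(f, ()))
    return marked


def filter_file_list(ordered: List[str],
                     imports: Dict[str, Iterable[str]],
                     roots: Iterable[str]) -> List[str]:
    """Keep the existing order, dropping files the DFS never marked."""
    keep = needed_files(imports, roots)
    return [f for f in ordered if f in keep]


# Made-up example: the ordered (but too large) list, plus import edges.
ordered = ["a.sml", "b.sml", "unused.sml", "c.sml", "main.sml"]
imports = {
    "main.sml": ["c.sml", "a.sml"],
    "c.sml": ["a.sml"],
    "unused.sml": ["b.sml"],
}
roots = ["main.sml"]

print(filter_file_list(ordered, imports, roots))
```

Since the filter only deletes entries from the list, the dependency order
of the original list is preserved.  Note that b.sml is dropped as well,
because the only file that imports it is itself unneeded.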

> Or, we just let MLton's removeUnused passes eliminate the dead code.  The
> two downsides of that are that (1) we might end up with extra code, (2) as
> we saw before, while CM.make "sources.cm" might succeed, the corresponding
> cmcat "sources.cm" might produce a file list with errors (unbound
> identifiers or type errors, corresponding to code that CM never compiled).

If we can't come up with an easy solution to make the new cmcat as
tight as the old, I don't think it's worth spending more time.  It
would, however, be worth explaining the situation to Matthias and
asking if he has an easy solution.  I'm happy enough with using the
bigger file list and leaving it up to MLton, which seems better than
asking people to use a really old SML/NJ.  We should try a self
compile with the bigger file list and see if it hurts.  As to the
potential errors, I think a note in the user guide is good enough.  It
should be easy enough for people to comment out their broken code, as
I did for MLton.

I think it's more valuable to spend our time thinking about the kind
of dead-code analysis that we want for mlbs.  Since we have been
living with the old cmcat, which does file-level dead-code elimination,
for years without problems, it seems reasonable for a first cut to do a
file-level import dependence analysis on user code that will achieve
the same results as the old cmcat.  Combine that with our
non-semantics-preserving dead code stuff on the basis library.  That
will leave us no worse off than we are now in terms of code, and much
better off for living with our own mlb files.

I know that file-level isn't perfect.  But I don't think it's worth
holding up mlb files until we think of something better.  In any case,
I don't plan to start coding mlbs until January at the earliest, so
there's still lots of time for new ideas.