[MLton] Separate compilation

Matthew Fluet fluet@cs.cornell.edu
Thu, 5 Feb 2004 16:30:14 -0500 (EST)


> I am curious as to what barriers there are currently to
> implementing optional separate compilation.

True separate compilation is pretty much incompatible with MLton's
compilation strategy.  From the MLton User's Guide:

   Because MLton compiles the whole program at once, it can perform
   optimization across module boundaries. As a consequence, MLton can
   reduce or eliminate the run-time penalty that arises with separate
   compilation of SML features such as functors, modules, polymorphism,
   and higher-order functions. MLton takes advantage of having the entire
   program to perform transformations such as: defunctorization,
   monomorphisation, higher-order control-flow analysis, inlining,
   unboxing, argument flattening, redundant-argument removal, constant
   folding, and representation selection. Whole-program compilation is an
   integral part of the design of MLton and is not likely to change.

In particular, the compilation transformations of monomorphisation and
defunctionalization (which is different from defunctorization) are
impossible without having the entire program.  Furthermore, these
transformations are pretty much the first things that MLton does when
compiling, so all the subsequent optimizations depend critically upon them
being performed.
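
As a rough illustration, here is what the two transformations
conceptually do to a tiny program.  This is a hand-written sketch, not
MLton's actual intermediate language, and the names pair_int,
pair_string, lam, and applyLam are made up.  Monomorphisation needs to
see every instantiation of pair, and defunctionalization needs to see
every lambda that can reach apply, which is why both want the whole
program:

```sml
(* Source: a polymorphic function used at two types. *)
fun pair x = (x, x)
val a = pair 1
val b = pair "s"

(* After monomorphisation (sketch): one copy per instantiation that
   occurs anywhere in the program. *)
fun pair_int (x : int) = (x, x)
fun pair_string (x : string) = (x, x)
val a' = pair_int 1
val b' = pair_string "s"

(* Source: a higher-order function applied to two different lambdas. *)
fun apply (f, x) = f x
val k = 5
val r1 = apply (fn n => n + 1, 10)
val r2 = apply (fn n => n * k, 10)

(* After defunctionalization (sketch): each lambda becomes a
   constructor carrying its free variables, and one dispatch function
   replaces the indirect call.  The datatype must list every lambda
   that can flow to the call site. *)
datatype lam = Inc | MulBy of int
fun applyLam (Inc, n) = n + 1
  | applyLam (MulBy m, n) = n * m
val r1' = applyLam (Inc, 10)
val r2' = applyLam (MulBy k, 10)
```

A separately compiled unit that later used pair at a third type, or
passed a new lambda to apply, would invalidate both pieces of generated
code.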

> I am aware of polymorphic
> functions, but it seems like that could be solved by compiling
> polymorphic functions to accept an extra record containing the
> necessary information for each type variable and leaving the types
> unboxed.

Actually, (in general) one needs to leave values boxed in order to
compile polymorphic functions separately, so that every value of
polymorphic type has the same representation.

> there is the issue of functors, but it seems these could be compiled
> straightforwardly into functions taking records and producing
> records. This appears possible even as an SML -> SML
> transformation.

The core/module level distinction is too large to be overcome by an SML ->
SML transformation.  The real issue is that SML records cannot have
polymorphic elements.  So, there is no target for a polymorphic
function returned by a functor.  (Likewise, a record cannot contain a
type, so there is no target for functors defining new types.)
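
To make that concrete, here is a small sketch (MkId and mkId are made-up
names).  The functor result can be used at two types.  The record
returned by the corresponding function cannot, because record fields
have monomorphic types and the result of an application is not
generalised:

```sml
(* A functor whose result contains a polymorphic function. *)
functor MkId (X : sig val tag : string end) =
struct
  val tag = X.tag
  fun id x = x                 (* id : 'a -> 'a *)
end

structure I = MkId (struct val tag = "t" end)
val ok = (I.id 1, I.id "s")    (* fine: I.id is used at int and string *)

(* The attempted encoding: a function from a record to a record. *)
fun mkId {tag : string} = {tag = tag, id = fn x => x}

val n =
  let
    val r = mkId {tag = "t"}   (* an application, so r is not generalised *)
  in
    #id r 1                    (* this use fixes #id r at int -> int *)
    (* (#id r 1, #id r "s") would be rejected by the type checker *)
  end
```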

> This would also eliminate
> the dependence on SML/NJ as compilation wouldn't be so slow for
> development.

I think most of the developers would argue that development without SML/NJ
wouldn't be too bad (assuming a sufficiently fast machine with enough
memory).  At least for me, a fair amount of my development time is spent
fixing type errors, and the new front-end can be quite a bit faster than
SML/NJ in type-checking large programs.

Now, even though MLton's front-end is already fast, it would be possible
to implement "partial separate compilation", which would lex, parse,
elaborate, and type-check a compilation unit and save the resulting
intermediate form to disk.  (Such a technique used to be used for the
basis library, but the front-end is now fast enough that one can
type-check the whole basis in seconds.)