MLton for Win32...
Stephen Weeks
sweeks@intertrust.com
Wed, 12 Jul 2000 15:31:20 -0700 (PDT)
> And I have recompiled a new mlton.c too in this
> distribution. It seems to require 256MB memory +
> 256MB swap space to compile MLton. And I've
> seen that it's about 50.000 lines of SML code.
> The game I was working on a few weeks ago was
> about 5-6.000 lines of SML code and it could
> barely compile on 128MB memory + 128MB swap space.
When I recompile MLton, I always use the max-heap runtime system
argument in the bin/mlton script. Lately, I've been setting it at
max-heap 350m (my machine has 512M RAM), which causes about 35% of
compile time to be spent in GC. You might try setting max-heap 250m
to see if you get better behavior, since frequent GCs don't interact
well with paging.
> However, when compiling MLton I noticed (when
> running top) that during the last parts of
> compilation it took only about 5% CPU resources.
> Do you think this is because of bad memory
> locality of reference - maybe because of
> garbage collection?
Probably.
> Because I fear that
> this might be a problem if I am going to
> write 3D software - which typically also
> uses lots of memory.
Yes, for your stuff to run, we're probably either gonna have to do
some hacks to put memory out of reach of the GC, or write a better
(i.e. generational) GC. The former may be relatively easy -- we'll
have to wait and see. The latter is not really my area of expertise,
and I'd love to find someone else to do it. But if no one does, I may
get around to it someday (I would be surprised if it happens this
year, though).
> But other than that it seems that MLton really
> kicks ass :)
> What compiler technology is used in it? Which
> optimization passes do you have?
Thanks. The main compiler technology is whole-program optimization,
which enables many specific optimizations that can't be done with
separate compilation. There are *many* optimization passes in MLton.
Here is an overview of some of them, in the order in which they occur,
along with where the source code for each optimization lives:
* dead code elimination (core-ml/dead-code.fun)
    Remove parts of the basis library that aren't used.
* defunctorization (elaborate/elaborate.fun)
    Duplicate the body of every functor each time it is applied.
* monomorphisation (xml/monomorphise.fun)
    Duplicate each polymorphic function for every type at which it is
    used.
* globalization (closure-convert/{closure-convert,globalize}.fun)
    Move values that are constants out to the top level so they are only
    evaluated once, and so that they do not appear in closures.
* flow analysis (closure-convert/flow-analysis.fun)
    Figure out which function(s) may be called at each call site. Replace
    unknown calls by known calls or case statements.
* constant folding (cps/shrink.fun, atoms/prim.fun)
    Evaluate primitive applications with known arguments (e.g. 1 + 2) at
    compile time.
* shrink reductions (cps/shrink.fun)
    A whole family of compile-time reductions, like
      o #1(a, b) --> a
      o case C x of C y => e --> let y = x in e
      o inline functions only called in one place
* remove unused functions, globals, and constructors (cps/remove-unused-*.fun)
* inlining (cps/inline.fun)
    Inline small functions.
* constant propagation (cps/constant-propagation.fun)
    This is whole-program constant propagation, even through data
    structures.
* useless elimination (cps/useless.fun)
    Remove useless arguments to functions and useless components of data
    structures.
* convert raises to jumps (cps/raise-to-jump.fun)
    Convert raise statements into jumps where the handler can be
    statically inferred.
* loop invariant removal (cps/loop-invariant.fun)
    Move loop invariants outside of loops.
* flattening
    Flatten out tuple arguments to functions and constructors into n-ary
    functions and constructors.
* redundant argument removal
    Remove arguments to functions that are always the same as each other.
* efficient datatype representation selection
    Choose representations for datatypes that attempt to minimize heap
    allocation.
* gcc optimizations
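
To make the constant folding and shrink reduction entries above a bit
more concrete, here is a minimal sketch in Python over a toy expression
IR. It is not MLton's code: the real passes are written in SML and work
on the CPS IR. The tuple encoding and the function name shrink are
assumptions made purely for illustration, and the toy IR has no side
effects, which is what makes dropping unused tuple components safe here.

```python
# Toy expression IR as nested tuples (hypothetical, not MLton's CPS IR):
#   ("const", n)                  integer literal
#   ("var", name)                 variable reference
#   ("prim", op, left, right)     binary primitive, op in {"+", "-", "*"}
#   ("tuple", [e1, e2, ...])      tuple construction
#   ("select", i, e)              #i e, with i counted from 1 as in SML
#   ("con", name, e)              constructor application C e
#   ("case", e, name, var, body)  case e of C var => body
#   ("let", var, e, body)         let var = e in body
import operator

PRIMS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def shrink(e):
    """Simplify sub-expressions first, then try one rewrite at this node."""
    tag = e[0]
    if tag in ("const", "var"):
        return e
    if tag == "prim":
        _, op, left, right = e
        left, right = shrink(left), shrink(right)
        # Constant folding: both arguments are known at compile time.
        if left[0] == "const" and right[0] == "const":
            return ("const", PRIMS[op](left[1], right[1]))
        return ("prim", op, left, right)
    if tag == "tuple":
        return ("tuple", [shrink(x) for x in e[1]])
    if tag == "select":
        _, i, arg = e
        arg = shrink(arg)
        # #1(a, b) --> a
        if arg[0] == "tuple":
            return arg[1][i - 1]
        return ("select", i, arg)
    if tag == "con":
        return ("con", e[1], shrink(e[2]))
    if tag == "case":
        _, scrutinee, name, var, body = e
        scrutinee, body = shrink(scrutinee), shrink(body)
        # case C x of C y => e --> let y = x in e
        if scrutinee[0] == "con" and scrutinee[1] == name:
            return ("let", var, scrutinee[2], body)
        return ("case", scrutinee, name, var, body)
    if tag == "let":
        _, var, bound, body = e
        return ("let", var, shrink(bound), shrink(body))
    raise ValueError(f"unknown expression tag: {tag!r}")


# The three rewrites from the list, applied to small inputs.
folded = shrink(("prim", "+", ("const", 1), ("const", 2)))
selected = shrink(("select", 1, ("tuple", [("var", "a"), ("var", "b")])))
cased = shrink(("case", ("con", "C", ("var", "x")), "C", "y", ("var", "e")))
```

Because sub-expressions are simplified before the enclosing node is
examined, one reduction can expose another: selecting the first
component of (2 * 3, b) yields the folded constant directly.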
BTW, you may be interested to know that a new release of MLton is
imminent. It fixes all of the bugs that you mention on your web
page. You may want to integrate your stuff with that version.