[MLton] MLton HOL

Stephen Weeks MLton@mlton.org
Thu, 27 May 2004 21:34:23 -0700


> The machine I'm using to compile in is a 4Gb machine, but according to
> top mlton is only using 2Gb, and my efforts to persuade it to use more
> (-fixed-heap 3500m) didn't meet with any success (it stayed at 2Gb).

Yeah, there is a problem due to address space fragmentation and the
fact that the MLton runtime requires a contiguous chunk of memory.
Since there are often some shared libraries around address 1G and
something else (I don't know what) around address 3G, the most that
mmap will give MLton's runtime is about 2G.  It's still sometimes
helpful to run on a machine with 3G or 4G though, just to avoid
paging.
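
To make the fragmentation argument concrete, here is a minimal sketch in
Python. It is not the MLton runtime's code. It assumes a 4G (32-bit)
address space and a made-up layout with a 64M mapping at 1G and another
64M mapping at 3G; the sizes and the function name largest_free_gap are
hypothetical, chosen only for illustration.

```python
GIB = 1 << 30
MIB = 1 << 20

def largest_free_gap(space_size, mapped):
    """Return (start, length) of the largest unmapped contiguous range.

    mapped is an iterable of (start, length) regions already in use.
    """
    best_start, best_len = 0, 0
    cursor = 0  # end of the highest mapped region seen so far
    for start, length in sorted(mapped):
        if start - cursor > best_len:
            best_start, best_len = cursor, start - cursor
        cursor = max(cursor, start + length)
    # Check the gap between the last mapping and the top of the space.
    if space_size - cursor > best_len:
        best_start, best_len = cursor, space_size - cursor
    return best_start, best_len

# Hypothetical layout: one mapping at 1G, another at 3G (sizes assumed).
mapped = [(1 * GIB, 64 * MIB), (3 * GIB, 64 * MIB)]
gap_start, gap_len = largest_free_gap(4 * GIB, mapped)
print(f"largest contiguous gap: {gap_len / GIB:.2f}G at {gap_start / GIB:.2f}G")
```

Under these assumptions the largest contiguous range is the one between
the two mappings, a bit under 2G, even though roughly 3.9G of the
address space is free in total.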

> The full compiler invocation was
> 
> mlton @MLton gc-messages -- -polyvariance false -inline 100 -basis
> 1997 -verbose 3
...
> http://www.cl.cam.ac.uk/~jeh1004/psl1.sml

I tried compiling the code and got the same failure, somewhere in the
backend after the profile pass has finished, after running for about
10 minutes.  I see two problems.  First, there is a time performance
problem in the commonArg pass, which takes 348s; it should take < 5s.
That is probably a simple quadratic blowup caused by a linear-time
lookup where a constant-time one would do, and should be easy to fix.  More
seriously, there is the space problem in the backend.  The Rssa (an
intermediate language, or IL) program size is about 142M, which isn't
so bad.  Given that, I
wouldn't expect there to be a space problem in representing the next
IL down the line.  So, perhaps there is something quadratic going on
there.  I will investigate, but it may take some time.
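
For readers unfamiliar with this kind of problem, the following Python
sketch shows how a linear-time lookup turns a pass quadratic. It is
purely illustrative and is not the commonArg code; the function names
and the association-list versus hash-table setup are assumptions.

```python
def count_probes_linear(n):
    """Look up each of n keys in an association list.

    Returns the total number of key comparisons performed.
    """
    table = [(i, i * i) for i in range(n)]
    probes = 0
    for key in range(n):
        for k, _value in table:  # linear scan per lookup
            probes += 1
            if k == key:
                break
    return probes

def count_probes_hashed(n):
    """Look up each of n keys in a hash table.

    Counts one (expected constant-time) probe per lookup.
    """
    table = {i: i * i for i in range(n)}
    probes = 0
    for key in range(n):
        _value = table[key]
        probes += 1
    return probes

for n in (1_000, 2_000):
    print(n, count_probes_linear(n), count_probes_hashed(n))
```

Doubling n roughly quadruples the work in the linear version and only
doubles it in the hashed version, which is the sort of gap that could
separate 348s from a few seconds on a large program.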

I also compiled without the -polyvariance or -inline flags and got the
same failure.  Looking at the IL sizes that are displayed with
-verbose 3, we can see that these don't change much.  With the flags,
the IL size after polyvariance is 82M instead of 87M and the IL size
after inlining is 69M instead of 74M.  This is actually a good
thing, since it means that these passes, which are important for
performance, aren't introducing significant code bloat.  So, there's
no need to toy with these options.
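
In relative terms, the sizes quoted above (87M vs. 82M after
polyvariance, 74M vs. 69M after inlining) differ by only a few percent.
A small Python sketch of that arithmetic, using only the numbers from
this message (the helper name percent_change is hypothetical):

```python
def percent_change(before_mb, after_mb):
    """Relative size reduction from before_mb to after_mb, in percent."""
    return (before_mb - after_mb) / before_mb * 100

# IL sizes in M, as reported by -verbose 3 in the two runs.
polyvariance = round(percent_change(87, 82), 1)
inlining = round(percent_change(74, 69), 1)
print(f"after polyvariance: {polyvariance}% smaller")
print(f"after inlining:     {inlining}% smaller")
```

Both differences come out at roughly 6 to 7 percent, which is small
next to the size of the space problem in the backend.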