[MLton] memory usage on darwin

Matthew Fluet fluet at tti-c.org
Thu Aug 2 15:59:01 PDT 2007


In "An Experimental Analysis of Self-Adjusting Computation" 
(http://www.mlton.org/References#AcarEtAl06), the authors state:

   We ran our experiments on a 2.7GHz Power Mac G5 with 4GB of memory.  We
   compile our benchmarks with the MLton compiler using
   "-runtime ram-slop 1 gc-summary" options. The "ram-slop 1" option
   directs the run-time system to use all the available memory on the
   system -- MLton, however, can allocate a maximum of about two Gigabytes.

I had assumed that the 2GB limit was a consequence of MLton allocating a 
contiguous heap; that is, with virtual address space fragmentation, one 
might not be able to allocate more than a 2GB heap.

However, in cleaning up the total RAM computation in light of the 
ambiguities in the "sysctl" interface, I was looking at the <sys/sysctl.h> 
header on various Darwin (Mac OS X) and *BSD systems.  On some systems 
(including Darwin), the "hw.physmem" control has type "CTLTYPE_INT", which 
is interpreted as a "signed int".

It seems that on Darwin, when the machine has more than 2GB RAM, the 
"hw.physmem" control returns INT_MIN (-2147483648, the bit pattern 
0x80000000).  MLton was interpreting this value as a "size_t" (an unsigned 
integral type), corresponding to 2147483648.  Interestingly, the "sysctl" 
utility (/usr/sbin/sysctl) also seems to interpret "hw.physmem" as an 
unsigned value:
   istanbul:~ fluet$ sysctl hw.physmem
   hw.physmem: 2147483648

Hence, MLton never saw more than 2GB of physical RAM.  With "ram-slop 1", 
it believed that it had the use of 2GB of RAM and tuned the garbage 
collection strategy to avoid paging under that assumption.
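
To make the signed/unsigned reinterpretation concrete, here is a minimal 
sketch in C.  It is not MLton's runtime code; the helper name 
"as_unsigned32" is hypothetical, and it assumes a 32-bit "size_t" as on a 
32-bit Darwin user space.

```c
#include <stdint.h>

/* Reinterpret a 32-bit signed value (such as a CTLTYPE_INT sysctl
   result) as unsigned, the way a 32-bit size_t would see it.
   The conversion is well defined in C: the value is reduced
   modulo 2^32. */
static uint32_t as_unsigned32(int32_t value) {
    return (uint32_t)value;
}
```

With this helper, as_unsigned32(INT32_MIN) is 2147483648, matching the 
value that the "sysctl" utility printed above.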

So, it turns out that with mlton-20051202, one can use "ram-slop 1.99" in 
this situation to get MLton to believe that it has the use of 4GB (minus 
epsilon) of RAM.  (Don't use "ram-slop 2", else the computation will 
wrap around and MLton will believe that it has the use of 0 bytes of RAM.)
Now, the contiguous heap restriction will kick in, and there will be some 
limit (due to virtual address space fragmentation) on the size of heap 
that MLton will be able to allocate.  But, hopefully, it will be somewhat 
larger than 2GB.  It would be nice to use the "fixed-heap <n>" runtime 
option to try to grab a big heap up front, but there is a check that gives 
an error upon asking for a heap larger than INT_MAX bytes (which, on a 
32-bit system, means an error upon asking for a heap larger 
than 2GB).
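
The "ram-slop" arithmetic can be modeled roughly as follows.  This is a 
sketch under an assumption, namely that the reported RAM is scaled by 
"ram-slop" and the result is kept in a 32-bit unsigned value; the function 
name "usable_ram32" is hypothetical and the real runtime may compute this 
differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: scale the reported RAM by ram-slop, then keep
   only the low 32 bits, as a 32-bit size_t would. */
static uint32_t usable_ram32(uint32_t reported_ram, double ram_slop) {
    assert(ram_slop >= 0.0 && ram_slop < 4294967296.0);
    /* Compute in 64 bits first: converting an out-of-range double
       directly to a 32-bit unsigned type is undefined behavior in C. */
    uint64_t wide = (uint64_t)((double)reported_ram * ram_slop);
    return (uint32_t)wide;  /* truncation modulo 2^32 */
}
```

Under this model, a reported 2147483648 bytes with "ram-slop 1.99" gives 
roughly 4.27e9 bytes (just under 4GB), while "ram-slop 2" gives exactly 
2^32, which truncates to 0.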

In the forthcoming release of MLton, this behavior shouldn't arise.  We're 
now using the "hw.memsize" control on Darwin, which has type CTLTYPE_QUAD, 
which is interpreted as a "uint64_t".  Hence, MLton will see the right 
amount of physical memory.
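
For reference, querying "hw.memsize" could look something like the sketch 
below.  It is an illustration, not the code in the MLton runtime; the 
function name "physical_memory_bytes" is hypothetical, and on non-Darwin 
systems the sketch simply reports failure.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__APPLE__)
#include <sys/types.h>
#include <sys/sysctl.h>
#endif

/* Store the physical memory size in bytes in *out and return 0.
   Return -1 (leaving *out untouched) if the query fails or the
   platform is not Darwin. */
static int physical_memory_bytes(uint64_t *out) {
#if defined(__APPLE__)
    uint64_t mem = 0;
    size_t len = sizeof mem;
    if (sysctlbyname("hw.memsize", &mem, &len, NULL, 0) != 0)
        return -1;
    if (len != sizeof mem)  /* expect a 64-bit (CTLTYPE_QUAD) value */
        return -1;
    *out = mem;
    return 0;
#else
    (void)out;
    return -1;
#endif
}
```

Reading into a "uint64_t" and checking the returned length avoids the 
signed 32-bit truncation that "hw.physmem" suffers from.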

Note, however, that if a machine has more than 4GB of physical memory, but 
the operating system only provides a 32-bit user space (as Darwin 
currently does), then MLton will never believe that it has the use of more 
than 4GB of RAM -- we can't make use of more than the virtual address 
space.
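
The cap described here amounts to taking the minimum of the physical 
memory and the size of the user address space.  A small sketch, with a 
hypothetical function name and the address width passed in as a parameter:

```c
#include <assert.h>
#include <stdint.h>

/* Largest amount of RAM a process could use, given the physical
   memory in bytes and the width of the user address space in bits
   (1..64). */
static uint64_t addressable_ram(uint64_t physical, unsigned address_bits) {
    assert(address_bits >= 1 && address_bits <= 64);
    if (address_bits == 64)
        return physical;  /* 2^64 does not fit in uint64_t; no cap applies */
    uint64_t limit = (uint64_t)1 << address_bits;  /* bytes of address space */
    return physical < limit ? physical : limit;
}
```

For a machine with 8GB of RAM and a 32-bit user space, this yields 4GB.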


