[MLton] memory usage on darwin
Matthew Fluet
fluet at tti-c.org
Thu Aug 2 15:59:01 PDT 2007
In "An Experimental Analysis of Self-Adjusting Computation"
(http://www.mlton.org/References#AcarEtAl06), the authors state:
    We ran our experiments on a 2.7GHz Power Mac G5 with 4GB of memory. We
    compile our benchmarks with the MLton compiler using
    "-runtime ram-slop 1 gc-summary" options. The "ram-slop 1" option
    directs the run-time system to use all the available memory on the
    system -- MLton, however, can allocate a maximum of about two Gigabytes.
I had assumed that the 2GB limit was a consequence of MLton allocating a
contiguous heap; that is, with virtual address space fragmentation, one
might not be able to allocate more than a 2GB heap.
However, in cleaning up the total RAM computation in light of the
ambiguities in the "sysctl" interface, I was looking at the <sys/sysctl.h>
header on various Darwin (Mac OS X) and *BSD systems. On some systems
(including Darwin), the "hw.physmem" control has type "CTLTYPE_INT", which
is interpreted as a "signed int".
It seems that on Darwin, when the machine has more than 2GB RAM, the
"hw.physmem" control returns a value that is negative when read as a
"signed int" (-2147483648, the bit pattern 0x80000000). MLton was
interpreting this value as a "size_t" (an unsigned integral type),
corresponding to 2147483648.
Interestingly, the "sysctl" utility (/usr/sbin/sysctl) also seems to
interpret "hw.physmem" as an unsigned value:
    istanbul:~ fluet$ sysctl hw.physmem
    hw.physmem: 2147483648
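The signed versus unsigned reading can be illustrated with a few lines of C. This is only a sketch, not code from the MLton runtime; the function name "as_unsigned_32" is made up for illustration, and it assumes a 32-bit "int" and a 32-bit "size_t", as on the machines discussed here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: read a 32-bit signed value the way a 32-bit
 * unsigned type (such as a 32-bit size_t) would see it.  Conversion
 * to an unsigned type is defined by C as reduction modulo 2^32. */
uint32_t as_unsigned_32(int32_t value) {
    return (uint32_t)value;
}
```

With this helper, INT32_MIN (bit pattern 0x80000000) reads as 2147483648, the value that the "sysctl" utility prints, whereas -1 would read as 4294967295.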
Hence, MLton really didn't see more than 2GB of physical RAM. With
"ram-slop 1", it believed that it had the use of 2GB of RAM, and it tuned
the garbage collection strategy to avoid paging under that assumption.
So, it turns out that with mlton-20051202, one can use "ram-slop 1.99" in
this situation to get MLton to believe that it has the use of 4GB (minus
epsilon) of RAM. (Don't use "ram-slop 2", or the computation will
wrap around and MLton will believe that it has the use of 0 bytes of RAM.)
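The arithmetic behind the "ram-slop 1.99" trick can be sketched as follows. This is a model of the wrap-around, not the runtime's actual code: the function name "usable_ram_32" is hypothetical, and the assumption is that the scaled value ends up in a 32-bit unsigned quantity that keeps only the low 32 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: scale the reported RAM by ram-slop and keep only
 * the low 32 bits, as a 32-bit size_t would.  The conversion goes
 * through uint64_t so that the double-to-integer step stays in range;
 * narrowing to uint32_t is then reduction modulo 2^32. */
uint32_t usable_ram_32(uint32_t total_ram, double ram_slop) {
    assert(ram_slop >= 0.0);
    double scaled = (double)total_ram * ram_slop;
    return (uint32_t)(uint64_t)scaled;
}
```

Under this model, a reported total of 2147483648 bytes gives 2147483648 with ram-slop 1, roughly 4GB (4273492459 bytes) with ram-slop 1.99, and 0 with ram-slop 2.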
Now, the contiguous heap restriction will kick in, and there will be some
limit (due to virtual address space fragmentation) on the size of heap
that MLton will be able to allocate. But, hopefully, it will be somewhat
larger than 2GB. It would be nice to use the "fixed-heap <n>" runtime
option to try to grab a big heap up front, but there is a check that gives
an error upon asking for a heap larger than INT_MAX bytes (which, on a
32-bit system, means an error upon asking for a heap larger than 2GB).
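The effect of the INT_MAX check can be sketched like this. The function name "fixed_heap_request_ok" is hypothetical and the real check in the runtime may be written differently; the sketch only assumes that requests above INT_MAX bytes are rejected and that "int" is 32 bits wide.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical model of the check: a requested heap size is accepted
 * only if it does not exceed INT_MAX bytes.  With a 32-bit int,
 * INT_MAX is 2147483647, one byte short of 2GB. */
int fixed_heap_request_ok(uint64_t requested_bytes) {
    return requested_bytes <= (uint64_t)INT_MAX;
}
```

So a request for exactly 2GB (2147483648 bytes) is already rejected under this model, while 2147483647 bytes is accepted.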
In the forthcoming release of MLton, this behavior shouldn't arise. We're
now using the "hw.memsize" control on Darwin. It has type CTLTYPE_QUAD,
which is interpreted as a "uint64_t", so MLton will see the right amount
of physical memory.
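For reference, reading "hw.memsize" into a 64-bit value might look like the sketch below. It is not the code in the MLton runtime; the function name "total_ram_bytes" is made up, and the non-Darwin branch (using sysconf) is an assumption added only so that the sketch compiles elsewhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#ifdef __APPLE__
#include <sys/types.h>
#include <sys/sysctl.h>
#else
#include <unistd.h>
#endif

/* Hypothetical helper: physical RAM in bytes, or 0 on failure. */
uint64_t total_ram_bytes(void) {
#ifdef __APPLE__
    /* hw.memsize is a 64-bit quantity, so it does not overflow
     * on machines with more than 2GB of RAM. */
    uint64_t mem = 0;
    size_t len = sizeof mem;
    if (sysctlbyname("hw.memsize", &mem, &len, NULL, 0) != 0
        || len != sizeof mem) {
        return 0;
    }
    return mem;
#else
    /* Assumed fallback for other systems: page count times page size. */
    long pages = sysconf(_SC_PHYS_PAGES);
    long page_size = sysconf(_SC_PAGESIZE);
    if (pages <= 0 || page_size <= 0) {
        return 0;
    }
    return (uint64_t)pages * (uint64_t)page_size;
#endif
}
```

Checking the returned length against sizeof(uint64_t) guards against the control having a different width than expected.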
Note, however, that if a machine has more than 4GB of physical memory, but
the operating system only provides a 32-bit user space (as Darwin
currently does), then MLton will never believe that it has the use of more
than 4GB of RAM -- we can't make use of more than the virtual address
space.
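The cap imposed by the address space can be modelled as a simple clamp. The function name "clamp_to_address_space" and the "pointer_bits" parameter are hypothetical; the sketch assumes the usable amount is bounded by the largest value an address of that width can represent.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: usable RAM is the physical amount, capped at the
 * largest byte count representable with pointer_bits-wide addresses. */
uint64_t clamp_to_address_space(uint64_t physical_bytes,
                                unsigned pointer_bits) {
    assert(pointer_bits >= 1 && pointer_bits <= 64);
    if (pointer_bits == 64) {
        return physical_bytes;  /* no cap below the uint64_t range */
    }
    uint64_t limit = (UINT64_C(1) << pointer_bits) - 1;
    return physical_bytes < limit ? physical_bytes : limit;
}
```

With 8GB of physical memory and a 32-bit user space, this gives 4GB minus one byte; with a 64-bit user space, the full 8GB.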