[MLton] Extra GC pressure

Matthew Fluet fluet at tti-c.org
Thu Feb 22 07:11:34 PST 2007

Wesley W. Terpstra wrote:
> At the moment, the GC only cleans up memory when the ML heap is full or 
> explicitly requested. Memory usage from C data structures is not counted 
> towards when the RAM is exhausted. How hard would it be to add 
> MLton.GC.{add,sub}Memory that kept track of additional memory pressure 
> in the GC?

Well, MLton's tracking of available memory is a lot less sophisticated 
than you may realize.  Essentially, at the beginning of the program, we 
calculate the physical memory (under most *nix variants with 
sysconf(_SC_PAGESIZE) and sysconf(_SC_PHYS_PAGES)), then multiply by 
ram-slop (defaults to 0.5, but configurable via @MLton ram-slop dd --). 
For the duration of the execution, MLton 'believes' that this is the 
available ram.
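
A minimal sketch of that startup calculation, assuming a platform where
sysconf(_SC_PHYS_PAGES) is available (it is common on Linux but is not
required by POSIX).  The function name believed_ram is hypothetical and
is not taken from the MLton runtime.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: the number of bytes the runtime would treat as
 * available RAM, i.e. physical memory scaled by ram_slop.
 * Returns 0 if the system query fails. */
size_t believed_ram(double ram_slop)
{
    assert(ram_slop > 0.0);

    long page_size  = sysconf(_SC_PAGESIZE);
    long phys_pages = sysconf(_SC_PHYS_PAGES);
    if (page_size <= 0 || phys_pages <= 0)
        return 0;

    /* Multiply in floating point so the product cannot overflow a long. */
    double total = (double)page_size * (double)phys_pages;
    return (size_t)(total * ram_slop);
}
```

Note that the value is computed once; nothing in this sketch notices
memory that C code allocates later with malloc.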

Now, an mmap to (re)allocate the heap might subsequently fail, in which 
case we gradually reduce the requested heapSize until it succeeds.  But, in 
general, MLton will have allocated a contiguous heap approximately equal 
to ram-slop * totalRam, and GC when this heap is full.  (For many 
programs that don't need to invoke the generational or mark-sweep 
collector, 'filling' this heap means filling half of it, and using the 
other half as the semi-space in a copying collection.)
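
The back-off on a failed mmap could look roughly like the following.
This is a sketch under stated assumptions, not the runtime's actual
code: the name alloc_heap, the minimum-size parameter, and the choice
to shrink the request by one eighth per attempt are all illustrative.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: try to map one contiguous region of `desired`
 * bytes, shrinking the request after each failure.  On success returns
 * the mapping and stores its size in *actual; on failure returns NULL
 * and stores 0. */
void *alloc_heap(size_t desired, size_t minimum, size_t *actual)
{
    assert(minimum > 0 && minimum <= desired);

    size_t size = desired;
    for (;;) {
        void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p != MAP_FAILED) {
            *actual = size;
            return p;
        }
        /* Shrink by 1/8 (an arbitrary step) and retry. */
        size_t next = size - size / 8;
        if (next < minimum || next == size)
            break;
        size = next;
    }
    *actual = 0;
    return NULL;
}
```

The caller has to read the size back through *actual, since the heap
that was obtained may be smaller than the one that was requested.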

The real point being that MLton uses a contiguous heap, so we want to 
allocate that heap early and keep reusing it; it is best to grab the 
heap before a lot of C code gets a chance to fragment the virtual memory 
space with individual mallocs.

While it wouldn't be hard to add MLton.GC.{add,sub}Memory, I'm not sure 
how we could best act on that.  We could artificially manipulate the 
limit pointer, thereby triggering a GC earlier.  But, you'd need to be 
careful about rolling the limit pointer beyond the frontier or beyond 
the true limit.
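
To make the clamping concern concrete, here is a sketch of pulling the
limit pointer in by the current amount of external pressure.  The struct
heap_view and the function apply_pressure are hypothetical; they are a
simplified stand-in and do not mirror MLton's real GC state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of the allocation area. */
struct heap_view {
    char *frontier;    /* next free byte */
    char *limit;       /* allocation past this point triggers a GC */
    char *true_limit;  /* real end of the usable allocation area */
};

/* Recompute limit from the total external pressure, keeping it inside
 * [frontier, true_limit] so it never rolls past either bound. */
void apply_pressure(struct heap_view *h, size_t extra_bytes)
{
    assert(h->frontier <= h->true_limit);

    size_t room = (size_t)(h->true_limit - h->frontier);
    if (extra_bytes >= room)
        h->limit = h->frontier;               /* GC at the next limit check */
    else
        h->limit = h->true_limit - extra_bytes;
}
```

Because the limit is recomputed from the total pressure each time,
reducing the pressure moves the limit back out towards the true limit.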

> The idea being that if you have an SML object that is backed by a C 
> object (for example SQLite, GTK, etc), you can indicate to the GC how 
> much memory (approximately) the C-side is consuming. When you free that 
> memory, you tell the GC that the memory pressure has been reduced. The 
> goal of this is to help trigger a GC earlier to clean up SML objects 
> (and thus their attached C state).
> A better interface would probably be MLton.GC.alloc <n> which returns an 
> opaque type. When that type is GC'd, the <n> bytes are additionally 
> subtracted from the extra memory pressure counter.

I can imagine the situation you describe, but have you encountered it in 
