[MLton] More on parallelism

Tue Dec 5 08:34:23 PST 2006

>> (Talk about parallel implementation)
>
> I can't say much about the garbage collection, my background being in OS. 
> However, it's probably a better idea to use a 1:1 (aka N:N) thread model for 
> the following reasons:
>
> - Presumably, the pthread library (or other OS-specific library) will also be 
> doing some sort of multiplexing, though maybe not.  The M:N model has fallen 
> out of favor, and if I'm not mistaken, even Solaris (the originator of it) 
> has abandoned it for 1:1.  What we don't want is both the thread library and 
> the ML system doing M:N...

I don't think that it is as bad as you fear.  It also really depends on 
your application scenario.  CML encourages _lots_ of threads and can 
support them by having a very lightweight representation of a suspended ML 
execution.

Think of it this way: ML encourages writing lots of little 
(anonymous) functions, while C encourages writing a few big (named) 
functions.  At the next level up, CML encourages using lots of little 
threads, while Pthreads encourages using a few big threads.

I know I'm oversimplifying here, but I believe the gist is correct.

> - The C stacks will be allocated per call to pthread_create (or whatever). 
> There needs to be one C stack per MLton thread.  Now, pthreads allocates C 
> stacks with malloc, so that could be an issue too.

Well, a MLton thread (whether in the single-threaded or multi-threaded 
case) never uses the C stack except for making FFI calls.  The entire 
state of the ML thread is stored on the ML heap.

My main point here is that we already know how to multiplex multiple ML 
threads onto one OS-level thread.  Typical CML applications will require 
many more threads than are comfortably supported by OS-level threads.
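
To make the multiplexing point concrete, here is a rough sketch (in Python, 
not ML, and not how MLton or CML actually implement it).  Generators stand 
in for suspended ML threads: each one is just a heap object, and a 
round-robin loop runs all of them on the single calling OS thread.  The 
names worker and run_round_robin are made up for illustration.

```python
from collections import deque


def worker(ident, steps, log):
    """A lightweight thread; each `yield` is a voluntary switch point."""
    for step in range(steps):
        log.append((ident, step))
        yield  # suspend: the generator object holds the entire thread state


def run_round_robin(threads):
    """Run every lightweight thread to completion on the calling OS thread.

    Returns the number of times a thread was suspended and requeued.
    """
    ready = deque(threads)
    switches = 0
    while ready:
        thread = ready.popleft()
        try:
            next(thread)          # resume until the next yield
        except StopIteration:
            continue              # thread finished; drop it
        ready.append(thread)      # still alive: back of the queue
        switches += 1
    return switches


if __name__ == "__main__":
    log = []
    # Ten thousand threads is routine here; it would be uncomfortable
    # with one OS-level thread each.
    threads = [worker(i, 3, log) for i in range(10_000)]
    switches = run_round_robin(threads)
    print(len(log), "steps run,", switches, "context switches")
```

The cost of a suspended thread in this sketch is one small heap object, 
which is the property that lets a CML-style program create threads freely.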

> As for garbage collection, how viable is this?  One thread does garbage 
> collection.  Most of the time it waits on a condition variable.  All other 
> threads share one huge heap.  When they run out of space, they signal the 
> garbage collector.  The real challenge is that every thread would have to be 
> stopped before the collector could run, I would think.  Again, I'm not very 
> familiar with garbage collection algorithms.

This is essentially the right way to handle a single-threaded garbage 
collector that collects a heap mutated by multiple ML threads.  The way 
you stop the other threads is to force them to enter the garbage collector, 
by setting their limit (i.e., the amount of free space they have in the 
heap) to 0.
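
As a rough illustration of the limit trick (a deterministic, single-threaded 
simulation in Python, not MLton's runtime), the sketch below gives each 
mutator a frontier and a limit.  Zeroing the limit makes the ordinary 
allocation check fail, so every thread falls into the collector at its next 
allocation.  The class and method names (Mutator, World, request_gc, and so 
on) are hypothetical, and a real implementation would also need proper 
synchronization between OS-level threads.

```python
class Mutator:
    """Per-thread allocation state (hypothetical names)."""

    def __init__(self, name, chunk_size):
        self.name = name
        self.chunk_size = chunk_size
        self.frontier = 0          # bytes already used in this thread's chunk
        self.limit = chunk_size    # bytes usable before a collection is needed
        self.parked = False        # True once the thread has entered the GC


class World:
    def __init__(self, names, chunk_size):
        self.mutators = [Mutator(n, chunk_size) for n in names]
        self.collections = 0

    def request_gc(self):
        # Stop-the-world request: with limit == 0, the next allocation
        # of any positive size fails its limit check.
        for m in self.mutators:
            m.limit = 0

    def alloc(self, m, nbytes):
        """Return True if the allocation succeeded, False if m parked."""
        if m.frontier + nbytes > m.limit:   # the ordinary limit check
            self.enter_gc(m)
            return False
        m.frontier += nbytes
        return True

    def enter_gc(self, m):
        m.parked = True
        self.request_gc()                   # first thread in stops the rest
        if all(t.parked for t in self.mutators):
            self.collect()

    def collect(self):
        # Stand-in for the real collector: pretend everything was garbage,
        # then hand every thread a fresh chunk and let it run again.
        self.collections += 1
        for t in self.mutators:
            t.frontier = 0
            t.limit = t.chunk_size
            t.parked = False


if __name__ == "__main__":
    world = World(["a", "b", "c"], chunk_size=64)
    a, b, c = world.mutators
    world.alloc(a, 48)           # fits
    world.alloc(a, 32)           # does not fit: a parks and zeroes all limits
    world.alloc(b, 8)            # b's limit is now 0, so b parks too
    world.alloc(c, 8)            # last thread parks; the collection runs
    print("collections:", world.collections)
```

Note that no thread is interrupted asynchronously here: each one stops only 
at a point where it was going to check its limit anyway.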