[MLton] More on parallelism
Matthew Fluet
fluet at cs.cornell.edu
Tue Dec 5 08:34:23 PST 2006
>> (Talk about parallel implementation)
>
> I can't say much about the garbage collection, my background being in OS.
> However, it's probably a better idea to do a 1:1 (aka N:N) thread model for
> the following reasons:
>
> - Presumably, the pthread library (or other OS-specific library) will also be
> doing some sort of multiplexing, though maybe not. The M:N model has fallen
> out of favor, and if I'm not mistaken, even Solaris (the originator of it)
> has abandoned it for 1:1. What we don't want is both the thread library and
> the ML system doing M:N...
I don't think that it is as bad as you fear. It also really depends on
your application scenario. CML encourages _lots_ of threads and can
support them by having a very lightweight representation of a suspended ML
execution.
Think of it this way: ML encourages writing lots of little (anonymous)
functions, while C encourages writing a few big (named) functions. At
the next level up, CML encourages using lots of little threads, while
Pthreads encourages using a few big threads.
I know I'm oversimplifying here, but I believe the gist is correct.
> - The C stacks will be allocated per call to pthread_create (or whatever).
> There needs to be one C stack per MLton thread. Now, pthreads allocates C
> stacks with malloc, so that could be an issue too.
Well, a MLton thread (whether in the single-threaded or multi-threaded
case) never uses the C stack except for making FFI calls. The entire
state of the ML thread is stored on the ML heap.
My main point here is that we already know how to multiplex multiple ML
threads onto one OS-level thread. Typical CML applications will require
many more threads than are comfortably supported by OS-level threads.
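To make the multiplexing point concrete, here is a minimal sketch in
Python rather than ML. It is not how MLton or CML is implemented;
generators merely stand in for suspended ML threads, and the names
worker and run_round_robin are made up for illustration. Each suspended
thread is a small heap object, so ten thousand of them run on the one
OS thread that calls the scheduler.

```python
from collections import deque

def worker(name, steps, log):
    """A lightweight thread; each yield is a point where it gives up the processor."""
    for i in range(steps):
        log.append((name, i))
        yield  # suspend: the saved state lives in the generator object, not on an OS stack

def run_round_robin(threads):
    """Multiplex every lightweight thread over the single calling OS thread."""
    ready = deque(threads)
    while ready:
        t = ready.popleft()
        try:
            next(t)          # run the thread until its next yield
            ready.append(t)  # still alive, so put it back on the ready queue
        except StopIteration:
            pass             # the thread finished; drop it

log = []
run_round_robin([worker(f"t{n}", 2, log) for n in range(10000)])
```

Creating the same number of OS-level threads would need a stack per
thread, which is the cost the lightweight representation avoids.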
> As for garbage collection, how viable is this? One thread does garbage
> collection. Most of the time it waits on a condition variable. All other
> threads share one huge heap. When they run out of space, they signal the
> garbage collector. The real challenge is that every thread would have to be
> stopped before the collector could run, I would think. Again, I'm not very
> familiar with garbage collection algorithms.
This is essentially the right way to handle a single-threaded garbage
collector that collects a heap mutated by multiple ML threads. The way
you stop other threads is to force them to enter the garbage collector,
by setting their limit (i.e., the amount of free space they have in the
heap) to 0.
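The limit trick can be sketched as a small single-threaded simulation,
again in Python. This is an assumption-laden illustration, not MLton's
runtime code: MLThread, request_collection, and collect are hypothetical
names, and real mutators would run concurrently. The idea shown is that
the check every allocation already performs doubles as the stop signal.

```python
class MLThread:
    """Hypothetical mutator; `limit` is the free space it may still allocate."""
    def __init__(self, name, limit):
        self.name = name
        self.limit = limit
        self.in_collector = False

    def allocate(self, nbytes):
        # The limit check that every (nonzero) allocation performs anyway.
        if nbytes > self.limit:
            self.in_collector = True  # out of space: enter the collector and wait
            return False
        self.limit -= nbytes
        return True

def request_collection(threads):
    # With a limit of 0, each thread's next allocation fails its check,
    # so the thread enters the collector without any extra signalling.
    for t in threads:
        t.limit = 0

def collect(threads, fresh_limit):
    # Collecting is only safe once every mutator has stopped.
    assert all(t.in_collector for t in threads)
    for t in threads:
        t.limit = fresh_limit  # hand out space again and let the thread resume
        t.in_collector = False

threads = [MLThread("a", 64), MLThread("b", 64)]
ok_before = threads[0].allocate(16)          # fits within the limit
request_collection(threads)
results = [t.allocate(8) for t in threads]   # every allocation now fails
all_stopped = all(t.in_collector for t in threads)
collect(threads, 64)
ok_after = threads[0].allocate(8)            # allocation works again
```

In this sketch a thread only notices the request at its next
allocation, so a thread that never allocates would never stop; a real
runtime has to account for that case.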