[MLton] Multicore CPU's and MLton

John Skaller skaller@users.sourceforge.net
Wed, 06 Jul 2005 22:43:56 +1000

On Wed, 2005-07-06 at 00:15 -0700, Stephen Weeks wrote:
> > It is highly application dependent whether processes or threads are
> > better, in fact complex applications may need BOTH.
> Agreed.  I didn't mean to rule out concurrent processes + threads
> within a process.  The question was whether there was a need for
> *concurrent* threads within a process, given that one can already take
> advantage of multi-core (or SMP) using concurrent processes.  I can't
> tell for sure, but the telco application you describe seems to be of
> the concurrent process + non-concurrent thread variety.  And MLton
> supports that model fine.

Depends which part of the application we're talking about :)

The call handling basically works just fine with a large
number of completely independent processes -- it makes
no difference if they're concurrent or not.

The test system used a 12-CPU Solaris box, with 11
worker threads running concurrently, one per CPU, plus
a master CPU which did everything else: I/O, load balancing,
dispatching events, etc.

The call handling part would work fine with processes
instead of threads. However, it is easier for load balancing
and dispatch to use threads: incoming events from the
telephony switch (which come in over a TCP/IP connection)
get unpacked in memory by the master, and it is faster
to just give the chosen 'thread' representing a phone
call a pointer to the data than to have to transmit it.

So the shared memory is important for performance.
The data comes in encoded as ASN.1 .. which can
be quite expensive to decode .. we don't want to
do it twice.
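
A minimal sketch of that decode-once, hand-over-a-pointer dispatch,
written in Python purely for illustration. Everything named here is
an assumption and not part of the original system: decode_event,
worker, run_master, the "call_id" field, and the use of JSON as a
stand-in for the expensive ASN.1 decode. CPython threads also do not
run in parallel the way the Solaris threads did, so this shows only
the data flow, not the performance.

```python
import json
import queue
import threading

NUM_WORKERS = 11  # one per worker CPU, as in the setup described above


def decode_event(raw: bytes) -> dict:
    # Hypothetical stand-in for the expensive ASN.1 decode.
    # JSON is used only so the sketch runs without extra libraries.
    return json.loads(raw)


def worker(inbox: queue.Queue, handled: list, lock: threading.Lock) -> None:
    # Each worker receives references to already-decoded events.
    while True:
        event = inbox.get()
        if event is None:  # sentinel: master is shutting down
            break
        with lock:
            handled.append(event)  # same object the master decoded


def run_master(raw_events: list) -> tuple:
    inboxes = [queue.Queue() for _ in range(NUM_WORKERS)]
    handled = []
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(q, handled, lock))
        for q in inboxes
    ]
    for t in threads:
        t.start()

    decoded = []
    for raw in raw_events:
        event = decode_event(raw)  # decoded exactly once, by the master
        decoded.append(event)
        # Trivial dispatch by call id; only a reference crosses the queue,
        # the event is never re-encoded or copied.
        inboxes[event["call_id"] % NUM_WORKERS].put(event)

    for q in inboxes:
        q.put(None)
    for t in threads:
        t.join()
    return decoded, handled
```

With separate processes, the put() above would have to serialise the
event and the receiver would have to decode it again, which is the
double cost the shared-memory design avoids.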

Hope that makes sense.

> Again, I'm not clear if you're arguing for concurrent threads or just
> threads.

Neither -- I'm suggesting YMMV .. :)

> OK, here you seem to be arguing for concurrent threads.

Yes, I'm saying *sometimes* they're useful: a large
complex application won't be just

(A) a collection of processes communicating with messages

or

(B) a collection of threads sharing all of memory

Neither of these is a complex application, by definition:
these are just simplistic and extreme threading models.
A typical nasty application (like telco stuff) is likely
to have a complex threading model.

So having tools available, such as

(A) a sequential collector

and

(B) a concurrent collector

will be useful -- not to choose one or the other for
the whole application, but rather to try to structure
the data and control model so you manually manage
things at interface boundaries .. and let the collectors
handle things within a specific extent.

For example: the 11 worker CPUs could each run 50K microthreads
using a standard collector in a single address space, whilst
the master uses a concurrent collector for its own internal

Or, you could run a separate collector for each microthread,
or all three together .. :)

I think the bottom line is: don't reject the idea of a
concurrent collector just because it would be slow:
if a design calls for part of the application to be
concurrent with shared memory and this needs locking,
why should a concurrent collector be slower than
a manual way of doing that? And even if it is,
couldn't it be a lot more reliable and easy to work
with than manual memory management?

John Skaller <skaller at users dot sourceforge dot net>
Download Felix: http://felix.sf.net
