[MLton-user] Calling into SML from C
Matthew Fluet
fluet at tti-c.org
Sat Apr 14 12:26:01 PDT 2007
>> What constitutes "lots and lots of memory" and "lots of time to be
>> spent in GC"?
>
> Several hundred MB for a program that does no explicit heap allocation,
> and, like, 60% time in the GC.
I see approximately the same behavior (with 50% GC time):
fenrir:~/tmp/export fluet$ cat export.sml
val e = _export "f": (int * real * char -> char) -> unit;
val _ = e (fn (i, r, _) => #"g")
val g = _import "g": unit -> unit;
val _ = g ()
val _ = print "success\n"
fenrir:~/tmp/export fluet$ cat ffi-export.c
#include <stdio.h>
#include "export.h"
void g () {
  Char8 c;
  fprintf (stderr, "g starting\n");
  for (int i = 0; i < 1000000; i++)
    c = f (i, 17.15, 'a');
  fprintf (stderr, "g done char = %c\n", c);
}
fenrir:~/tmp/export fluet$ mlton -export-header export.h -default-ann 'allowFFI true' export.sml ffi-export.c
fenrir:~/tmp/export fluet$ ./export @MLton gc-summary --
g starting
g done char = g
success
GC type        time ms   number            bytes        bytes/sec
-------------  -------  -------  ---------------  ---------------
copying            226   10,417      107,211,864      474,388,778
mark-compact         0        0                0                -
minor                0        0                0                -
total GC time: 3,729 ms (47.7%)
max pause: 0 ms
total allocated: 867,845,084 bytes
max live: 10,392 bytes
max semispace: 94,208 bytes
max stack size: 360 bytes
marked cards: 0
minor scanned: 0 bytes
While there is a lot of allocation and garbage collection going on, the
most relevant statistic is the max live, which shows that there was
never more than about 10K of live data. This means that there is no
space leak; the data that is being allocated is short-lived.
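As a rough back-of-the-envelope figure (assuming the C loop accounts for
essentially all of the allocation): 867,845,084 bytes over 1,000,000
calls is on the order of 870 bytes allocated per exported call.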
>> And there will almost certainly be some allocation done in calling any
>> exported ML function. So, I would expect that if you write a loop in
>> C that does nothing but repeatedly call an exported ML function, then
>> you will see memory being allocated (and subsequently GCed). Do you
>> have example code that demonstrates the situation you are seeing?
>
> This is probably what I'm seeing. The program is essentially a loop
> around an exported function that takes a bunch of integers and returns
> unit. Is there any way to reduce this effect?
There doesn't appear to be a simple solution. When an exported ML
function is called from C, it is run in its own ML thread. The
allocation overhead you are seeing arises from some of the
thread-switching code.
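That said, one way to reduce the effect is to do more work per crossing
of the C/ML boundary, for example by exporting a function that takes an
iteration count and runs the loop on the SML side. A rough, untested
sketch along the lines of the example above (the name "f_many" and the
batched interface are my own invention, not anything MLton provides):
val e = _export "f_many": (int -> char) -> unit;
val _ = e (fn n =>
   let
      (* do the trivial per-iteration work here, n times *)
      fun loop (i, c) = if i >= n then c else loop (i + 1, #"g")
   in
      loop (0, #"g")
   end)
val g = _import "g": unit -> unit;
val _ = g ()
val _ = print "success\n"
On the C side, the loop in g then collapses to a single call,
c = f_many (1000000); each exported call still pays the thread-switch
and allocation cost, but now there is only one such call.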
I will also point out that 50% GC time for a program that is doing
absolutely no computation isn't something to be particularly worried
about. A real program will spend much more time in computation than in GC.