[MLton] Representation of strings in the FFI

Matthew Fluet fluet@cs.cornell.edu
Wed, 27 Apr 2005 09:08:47 -0400 (EDT)


> >> It is quite possible.  The exported string is both a char* and a
> >> pointer to a MLton array, from which one can extract the length using
> >> GC_arrayNumElements.
> > 
> > Ok - the layout of arrays is described in runtime/gc.h. The length
> > and other GC info are to the left of the contents of the array (or
> > string). I take it I should use GC_arrayNumElementsp at the C level.
> 
> Here is a small example:
> 
>    val e2 = _export "bar" : int array -> int array;
>    val _  = e2 (fn (x) => Array.fromList [1, 2, 3]);

Yes, but you need to be very careful.  If your exported function takes or
returns ML heap-allocated objects (such as strings, arrays, or vectors),
then be aware that any ML garbage collection (whether in this exported
function or in another exported function called after this one returns)
invalidates the C-side pointers to those objects: the garbage collector
may have moved the object, and/or it may not believe that the object is
live.
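
One defensive pattern that follows from this is to copy the contents into
C-owned memory before control can return to ML, and to use only the copy
afterwards.  The following is a minimal sketch, not MLton API: the helper
name copy_ml_ints is hypothetical, and the element count is passed in as a
parameter so that the code compiles without the MLton runtime headers (in
real FFI code it would presumably come from GC_arrayNumElements, as
mentioned above).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of MLton): copy n ints out of an
 * ML-owned array into memory owned by C.  The caller supplies n; this
 * sketch assumes it was read from the ML array before calling. */
int *copy_ml_ints(const int *ml_elems, size_t n)
{
    /* Request at least one byte so a zero-length copy can still
     * return a distinct, freeable pointer. */
    int *copy = malloc(n ? n * sizeof *copy : 1);
    if (copy == NULL)
        return NULL;            /* out of memory */
    if (n > 0)
        memcpy(copy, ml_elems, n * sizeof *copy);
    return copy;                /* caller releases it with free() */
}
```

The copy is unaffected by later garbage collections because the ML
collector never sees it; the price is that changes made to the copy are
not visible on the ML side.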

There are really no (good) facilities for constructing ML heap values from
the C side -- the ones in gc.h are sufficient for the compiler itself, but
there is a lot of shared knowledge between the GC and the compiler.  And,
there are no facilities whatsoever for registering pointers with the
runtime as garbage collection roots.

> the generated C code is:
> 
>    Pointer MLton_FFI_Pointer_array[1];
>    Pointer *MLton_FFI_Pointer = &MLton_FFI_Pointer_array;
>    Int MLton_FFI_op;
> 
>    Pointer bar (Pointer x0) {
>            MLton_FFI_op = 1;
>            MLton_FFI_Pointer_array[0] = x0;
>            MLton_callFromC ();
>            return MLton_FFI_Pointer_array[0];
>    }
> 
> Suppose I at the C level want to produce an MLton array and
> then call bar with it. Is GC_arrayAllocate the best way of allocating
> the array?

GC_arrayAllocate will simply allocate the space; you will still need to go
through and initialize all the elements.  You certainly cannot just hand
off a malloc-ed region of memory to an ML function -- that will completely
confuse the garbage collector.
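
The initialization step itself is just a loop over the element storage.
The following is a minimal sketch under stated assumptions: the helper
name init_int_elems is hypothetical, the elements are assumed to be C
ints, and the pointer is assumed to refer to the first element of storage
that was allocated by the ML runtime rather than by malloc.

```c
#include <stddef.h>

/* Hypothetical helper (not part of MLton): write a value into every
 * slot of freshly allocated, still uninitialized element storage. */
void init_int_elems(int *elems, size_t n, int value)
{
    size_t i;
    for (i = 0; i < n; i++)
        elems[i] = value;       /* every slot must be written once */
}
```

Nothing here makes the storage safe to hand to ML; it only shows the
element-by-element fill that has to happen after allocation.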