[MLton] implement _address and _symbol
Matthew Fluet
fluet@cs.cornell.edu
Mon, 18 Jul 2005 10:12:03 -0400 (EDT)
> It seems there was consensus on these points:
> FFI that uses MLton.Pointer.t should be pointer-type transparent
Yes.
> Ok. However, there seems to be a contradiction here wrt _import *.
>
> '_import *: int -> int;' right now gives MLton.Pointer.t -> int -> int.
The above is not allowed by the current implementation. You must make the
pointer type explicit in the annotation.
> '_import *: MLton.Pointer.t -> int -> int;' ?
This is the correct annotation for the current implementation. And there
is no need to change it.
> That would break compatibility.
There is no compatibility issue, since we currently implement the desired
annotation.
> Ditto for _symbol *. It seems the right types are:
>
> _symbol "x": int; ==> (unit -> int) * (int -> unit)
> _symbol *: int; ==> MLton.Pointer.t -> (unit -> int) * (int -> unit)
>
> However, where does the pointer get specified?
We seem to have settled on
_symbol *: ptrTy, cbTy; ==> (ptrTy -> cbTy) * (ptrTy * cbTy -> unit)
> In fact, all of the ': ....;' syntax seems bogus to me.
> Where's the point in specifying all of this?
You are correct that it is not a proper ML type annotation, in the sense
that it does not specify the type of the resulting expression. Rather, it is a
type annotation that conveys just enough to nail down the type of the
expression. As I said before, the FFI primitives are not polymorphic
primitives, they are a family of primitives. The annotation selects which
member of the family.
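
To make the "family of primitives" point concrete, here is a minimal C-side sketch of what a `_symbol *: ptrTy, cbTy;` getter/setter pair might amount to for two choices of cbTy. The macro and the names `fetch_int32`, `store_int32`, `fetch_real64`, `store_real64` are hypothetical illustrations, not MLton runtime functions, and ptrTy is modelled as `void *`. The point is only that each C base type needs its own load and store, so the annotation picks one pair out of the family rather than instantiating a single polymorphic primitive.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical illustration, not MLton runtime code: one getter/setter
 * pair per C base type, mirroring
 *   (ptrTy -> cbTy) * (ptrTy * cbTy -> unit)
 * with ptrTy modelled as void*. */
#define DEFINE_SYMBOL_ACCESSORS(name, ctype)                          \
    ctype fetch_##name(void *p) { assert(p != NULL); return *(ctype *)p; } \
    void store_##name(void *p, ctype v) { assert(p != NULL); *(ctype *)p = v; }

/* Two members of the family; the type annotation would select one. */
DEFINE_SYMBOL_ACCESSORS(int32, int32_t)
DEFINE_SYMBOL_ACCESSORS(real64, double)

/* A C global standing in for the symbol "x" discussed in the thread. */
int32_t x = 0;
```

Calling `store_int32(&x, 42)` and then `fetch_int32(&x)` reads back the stored value; using the `real64` pair on the same address would reinterpret the bytes, which is why the annotation has to name cbTy explicitly.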
This isn't a real suggestion, but one could imagine the following syntax:
_symbol[cbTy] "symbol";
_symbol[ptrTy,cbTy] *;
which makes it a little more clear that the type annotation is selecting a
particular primitive, which contributes to the type of the resulting
expression, but does not equal it. Likewise:
_address[ptrTy] "symbol";
_import[cfTy] "symbol"; or _import[argTy,resTy] "symbol";
_import[ptrTy,cfTy] *; or _import[ptrTy,argTy,resTy] *;
_export[cfTy] "symbol"; or _export[ptrTy,argTy] "symbol";
You can see my bias peeking through: knowing the implementation, I know
that the more explicit "or" alternatives are easier to implement.
Recalling that originally the only FFI primitive was _import of
C-functions, it becomes clear why adopting the ML-style type annotation
made sense -- since in that (one) case, the annotation is the type of the
resulting expression.
> Another frightening aspect no one has brought up: what about pointers?
> val set : int vector -> unit = _store "x"
>
> This is extremely frightening (to me) since it seems the exported pointer
> can never be assumed to contain valid information. For _import this works,
> because you don't use the GC during the C function call.
That's not actually true. You can call a C function, which calls an
_export-ed ML function, during whose execution a GC may occur, so any ML
pointers that the C function holds are not necessarily valid once control
returns to it. It is a (minor, as in relatively easily fixed) deficiency of the
runtime system that there is no way to register ML pointers with the
runtime to be treated as roots and updated at a GC.
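
The hazard and the missing facility can be sketched with a toy model in C. This is not how MLton's collector works, and none of the names below (`toy_alloc`, `toy_register_root`, `toy_gc`) exist in the MLton runtime. It assumes a two-space heap holding a single int object, where a collection moves the object and fixes up only those pointers whose locations were registered as roots.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: a two-space "heap" holding a single int object.
 * None of these names exist in the MLton runtime. */
#define MAX_ROOTS 8

static int space_a[1], space_b[1];
static int *live = space_a;        /* space where the object currently lives */
static int **roots[MAX_ROOTS];     /* addresses of registered C-side pointers */
static size_t n_roots = 0;

/* Allocate the single object in the current space. */
int *toy_alloc(int value) { *live = value; return live; }

/* Register the address of a C-side pointer so the collector can update it. */
void toy_register_root(int **slot) {
    assert(n_roots < MAX_ROOTS);
    roots[n_roots++] = slot;
}

/* "Collect": copy the object to the other space, poison the old copy,
 * and update every registered root that pointed at the old location. */
void toy_gc(void) {
    int *from = live;
    int *to = (live == space_a) ? space_b : space_a;
    *to = *from;
    *from = -1;                    /* poison so stale reads are visible */
    for (size_t i = 0; i < n_roots; i++)
        if (*roots[i] == from) *roots[i] = to;
    live = to;
}
```

If C code keeps two copies of the pointer returned by `toy_alloc(7)` and registers only one of them, then after `toy_gc()` the registered copy still reads 7 while the unregistered copy reads the poison value -1. The unregistered copy is in the position of any ML pointer held across a callback into ML.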
> And what about
> val get : unit -> int vector = _fetch "x"
> Where does the length information come from?
The supposition is that the pointer in the symbol "x" is a (pointer to a)
ML vector. As above, with GCs occurring, it may be difficult to ensure
that the pointer is valid.
> I just compiled foo.sml:
> val ex = _export "test": int vector -> unit;
> fun out x = print (Int.toString x ^ "\n")
> fun app x = Vector.app out x
> val () = ex app
> ... this actually works, yikes.
>
> I can only assume that the programmer is required to only pass back SML
> arrays to SML functions; never arrays coming from C. After the C call
> which set the symbol, on return to SML the GC might run. Thus, _fetch
> doesn't make sense either.
>
> So, _fetch/_store of heap types should fail to compile, right?
Not necessarily, but possibly.
Bear in mind, this is an interface to *C*! The programmer is leaving a
type safe language, and so they had better know what is going on.
> (* These generate deprecated warnings (with suggested change): *)
> val somefnptr : MLton.Pointer.t -> int -> int = _import *: int -> int;
> val somevalptr : MLton.Pointer.t -> int = _import *: int;
Neither somefnptr nor somevalptr is currently accepted by the compiler.
> Comments?
I still prefer _symbol over _fetch/_store.
I don't mind that a 'define'-ed _symbol is not initialized; this is *C*
and that behavior is allowed. Furthermore, you might be defining a
symbol so that the C code can set it, and there is no need to initialize
it.
I don't think that type-inference is necessary; I think the current
annotations are fine. Also, whatever decision is reached wrt
type-inference, it would certainly make more sense to first implement
the new FFI with annotation before tackling inference as well.