[MLton] cvs commit: Improved FFI.

Matthew Fluet fluet@cs.cornell.edu
Sun, 24 Jul 2005 11:42:28 -0400 (EDT)


> >   Incorporated Wesley's patch for improved FFI.  After the discussion on
> >   the list regarding C pointers, I went ahead and eliminated _address
> >   in favor of _symbol, which provides the address, getter, and setter.
> >   So, the FFI system now looks like:
> >   
> >   _symbol "symbol" [define] : ptrTy, cbTy;  ==> ptrTy * (unit -> cbTy) * (cbTy -> unit)
> >   _symbol * : ptrTy, cbTy;  ==> (ptrTy -> cbTy) * (ptrTy * cbTy -> unit)
> 
> I think this was a mistake.
> 
> There is now no way to take the address of a function.
> Which means you can't pass a MLton function to a C method that takes
> a function pointer. 

This isn't entirely true; as you note, you can fib in one form or another.

> However, I think this is worse than not having a type with address
> in the first place. I added it so we could encourage people to 
> provide the extra information against future problems. The new API
> encourages people to provide wrong information...

I don't seriously object to  _address "symbol": ptrTy, cTy;

where cTy includes both C base types and C function types (though, even
then, you have additional complications, since a C function type should, I
believe, carry its calling convention -- if a function pointer doesn't
need to be the same size as an int pointer, why would a cdecl function
pointer need to be the same size as a stdcall function pointer?).  But,
the reality is that MLton expects all C pointers to be of the same size
(and equal to 32 bits).
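
To make that assumption concrete, here is a minimal C sketch of the kind of
check it amounts to.  This is not MLton code; the names int_fn,
pointer_sizes_agree, pointer_fits_in_bits, and check_ffi_pointer_assumption
are made up for illustration.  It only compares sizes, which is all the
"one representation for every pointer" assumption requires.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical function-pointer type, used only for the size comparison. */
typedef int (*int_fn)(int);

/* Returns 1 if object pointers and function pointers have the same size
 * on this platform, 0 otherwise.  The C standard does not guarantee it. */
int pointer_sizes_agree(void) {
    return sizeof(void *) == sizeof(int *)
        && sizeof(void *) == sizeof(int_fn);
}

/* Returns 1 if an object pointer fits in the given number of bits. */
int pointer_fits_in_bits(size_t bits) {
    return sizeof(void *) * CHAR_BIT <= bits;
}

/* Aborts at runtime if the uniform-size assumption does not hold here. */
void check_ffi_pointer_assumption(void) {
    assert(pointer_sizes_agree());
}
```

On a platform where pointer_sizes_agree() returns 1 and
pointer_fits_in_bits(32) returns 1, a single 32-bit pointer representation
is at least large enough; where either returns 0, it would not be.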

It is all well and good to argue that you can't make that assumption in C,
but until I hear an argument that doesn't boil down to "you can't make any
assumption about anything in C", I don't see any other simple recourse.
Every discussion about implementation-dependent details of C ultimately
ends with "do whatever works on the platform(s) you are interested in (but
don't be surprised if it doesn't work on other platforms)."  And this
representation certainly works for pointers to all C base types on all
architectures where MLton runs and for pointers to C functions on
architectures where people have been using the MLton FFI in non-trivial
ways.

Don't get me wrong -- nobody objects to producing more standards-conforming
C code, but I don't see the benefit justifying the cost.  The cost is
certainly high, and the benefit seems simply to be the (dubious) claim that
such C code would be less susceptible to being treated differently by
future versions of gcc.  Of course, the gazillion lines of existing C code
are the best defence against future versions of gcc breaking backwards
compatibility.

There are other ancillary benefits, I admit.  In theory, more
standards-conforming code might make porting to other platforms mildly
easier.  It may also make it possible to use another C compiler instead of
gcc.  But, we haven't had any insurmountable problems porting to other
platforms (admittedly, the signed/unsigned calling convention of PPC and
the _start symbol of HPPA were minor nits, and likely the x86_64 port will
uncover yet more) and there has been no reason to try another C compiler.

So, all that being said, adding a (simple) _address would be very easy
in the current setup; it would simply need to abide by the current
limitations.