[MLton] Callback functions: how?

Wed Feb 14 12:14:56 PST 2007

On Feb 14, 2007, at 8:36 PM, Matthew Fluet wrote:
> Wesley W. Terpstra wrote:
>> As I understand it, the code
>>> val fns : (int -> unit) list ref = ref [ fn _ => () ]
>>> fun register f = fns := f :: !fns
>>> fun runAll x = List.app (fn f => f x) (!fns)
>> will prevent flow-analysis and the runAll method will have a loop  
>> over a giant switch statement that could call all possible  
>> functions of type 'int -> unit'. Is this correct? Or can MLton  
>> recognize that only methods passed to 'register' need to be in the  
>> switch statement?
>
> I believe that MLton does something in between the two.  That is,  
> flow-analysis is (purposefully) conservative on functions that  
> escape into mutable objects, but it does distinguish between  
> escaping and non-escaping functions.  So, you won't get a dispatch  
> among all possible functions of type 'int -> unit'.  On the other  
> hand, if you had 'register1' and 'register2' that added to  
> different refs, then you would get a dispatch among the set of  
> functions passed to either 'register1' or 'register2'.

To help MLton out, I could create a locally defined type and add it  
to the callbacks input. eg:
> local
>   datatype secret = SECRET
> in
>   fun register f = fns := (fn (SECRET, x) => f x) :: !fns
> end
Would this be enough to ensure that it caught exactly the right methods?

> There are ways of improving the precision, but they add analysis  
> time and didn't demonstrate a lot of improvement; see the
> Cejtin/Jagannathan/Weeks ESOP 2000 paper.

That's fine. This is by nature specific to MLton, so if I can make  
the precision exact by using a compiler-specific trick, that's ok.

>> This idiom appears in enough C libraries that we should really  
>> have a good solution for this in the FFI section of the wiki. If a  
>> good solution is relatively complex, perhaps we should offer a  
>> small library (I'll volunteer to write it). For my scenario, it's  
>> quite important that this be as fast as possible---callbacks are  
>> invoked inside a tight loop, and the callbacks themselves are very  
>> simple.
>
> It's hard to come up with a general library, since the type of the  
> callback functions often changes from C library to C library.

I know there are other callback techniques. However, all C libraries  
developed in the last while use this idiom. A good solution where we  
can provide it seems fine to me.

> To complicate matters, some C callbacks don't give you a void* to  
> hang callback specific data; I think John Reppy has mentioned that  
> some OpenGL bindings are of that form.  For that, we'd like to  
> eventually support an "indirect export" mechanism:
>   _export * : (int -> unit) -> MLton.Pointer.t;

I remember we talked about this before. However, I wonder if it's  
truly necessary. If there is no user argument, the library probably  
intends for you to hook exactly one function. In which case, you can  
use the _export*_address trick explicitly for the one function. A  
single ref cell will hold the SML method to call. Easy.
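
A minimal sketch of what I mean by the single ref cell, assuming a callback of type int -> unit. The names hook, setHook and trampoline and the C symbol "sml_hook" are made up for illustration. The export itself is left as a comment, since it needs MLton with allowFFI enabled:

```sml
(* The one SML function currently hooked; a no-op until someone sets it. *)
val hook : (int -> unit) ref = ref (fn _ => ())

(* Replace the hooked function. *)
fun setHook (f : int -> unit) : unit = hook := f

(* The single function that would be exported to C. It only dereferences
   the cell, so the exported symbol never changes when the hook does. *)
fun trampoline (x : int) : unit = !hook x

(* MLton-only step, not part of the portable sketch (symbol name assumed):
     val export = _export "sml_hook": (int -> unit) -> unit;
     val () = export trampoline
*)
```

The C library is handed the address of the one exported symbol, and changing the callback later is just an assignment to the ref.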

>> Also, what's the best known way to implement GrowingVector? I've  
>> been using
>>> datatype 'a used = FREE of int | USED of 'a
>>> type 'a t = { free: int, buf: 'a used array }
>> where buf doubles in size when free = ~1. This isn't such a big  
>> deal, since 'registercb' is rarely invoked compared to 'runOne'.  
>> However, I'd like to know a better solution.
>
> That's pretty close to the resizeable-array.fun implementation in  
> the MLton sources.

I imagine you don't have a free-list, though. ;-)
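
For reference, here is roughly what the free-list version looks like. This is a sketch rather than the code I actually use: it wraps the two fields in refs so the table can be updated in place (the type quoted above has no refs), and the names new, grow, alloc, release and get are only illustrative.

```sml
(* Slot table with an embedded free list. FREE n links to the next free
   slot; ~1 marks the end of the list. *)
datatype 'a used = FREE of int | USED of 'a
type 'a t = { free : int ref, buf : 'a used array ref }

fun new () : 'a t = { free = ref ~1, buf = ref (Array.fromList []) }

(* Called only when the free list is empty, so every old slot is USED.
   Doubles the buffer (minimum size 1) and chains the new slots together. *)
fun grow (t : 'a t) : unit =
  let
    val { free, buf } = t
    val old = !buf
    val n = Array.length old
    val m = if n = 0 then 1 else 2 * n
    fun init i =
      if i < n then Array.sub (old, i)
      else if i + 1 < m then FREE (i + 1)
      else FREE ~1
  in
    buf := Array.tabulate (m, init);
    free := n
  end

(* Store x and return its slot index; the index is the key given to C. *)
fun alloc (t : 'a t) (x : 'a) : int =
  let
    val { free, buf } = t
    val () = if !free = ~1 then grow t else ()
    val i = !free
  in
    case Array.sub (!buf, i) of
      FREE next => (free := next; Array.update (!buf, i, USED x); i)
    | USED _ => raise Fail "corrupt free list"
  end

(* Release slot i and push it onto the front of the free list. *)
fun release (t : 'a t) (i : int) : unit =
  let
    val { free, buf } = t
  in
    Array.update (!buf, i, FREE (!free));
    free := i
  end

(* Look up slot i; raises Fail if the slot is not in use. *)
fun get (t : 'a t) (i : int) : 'a =
  case Array.sub (!buf, i) of
    USED x => x
  | FREE _ => raise Fail "slot not in use"
```

With this, 'registercb' is alloc, unregistering is release, and 'runOne' is get followed by a call. Lookup is a single array index, which is the part that matters in the tight loop.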

Vesa Karvonen wrote:
> Basically, there is a single exported ML callback function and a
> callback cache for each imported C function that takes callbacks.
> The key generated by the cache is given to the C side as the
> context pointer (uarg in your snippet).
This sounds pretty much the same as my solution, except that my hash  
function is (fn x => x) and I don't need a collision policy.