[MLton] Callback functions: how?

Wed Feb 14 11:36:54 PST 2007

Wesley W. Terpstra wrote:
> What's the most efficient way to implement callback functions in MLton?

Good question.

> As I understand it, the code
>> val fns : (int -> unit) list ref = ref [ fn _ => () ]
>> fun register f = fns := f :: !fns
>> val runAll x = List.app (fn f => f x) (!fns)
> will prevent flow-analysis and the runAll method will have a loop over a 
> giant switch statement that could call all possible functions of type 
> 'int -> unit'. Is this correct? Or can MLton recognize that only methods 
> passed to 'register' need to be in the switch statement?

I believe that MLton does something in between the two.  That is, 
flow-analysis is (purposefully) conservative on functions that escape 
into mutable objects, but it does distinguish between escaping and 
non-escaping functions.  So, you won't get a dispatch among all possible 
functions of type 'int -> unit'.  On the other hand, if you had 
'register1' and 'register2' that added to different refs, then you would 
get a dispatch among the set of functions passed to either 'register1' 
or 'register2'.
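
A minimal sketch of that 'register1'/'register2' situation, in plain SML.
The names fns1, fns2, runAll1, runAll2, total and count are hypothetical and
exist only for illustration. The comments restate the behaviour described
above; they are not a guarantee about what any particular MLton version
generates.

```sml
(* Two independent registries that hold functions of the same type. *)
val fns1 : (int -> unit) list ref = ref []
val fns2 : (int -> unit) list ref = ref []

fun register1 f = fns1 := f :: !fns1
fun register2 f = fns2 := f :: !fns2

(* Per the description above, the call 'f x' in either loop may be compiled
   as a dispatch over the functions passed to register1 or register2,
   because both sets escape into refs at type int -> unit. *)
fun runAll1 x = List.app (fn f => f x) (!fns1)
fun runAll2 x = List.app (fn f => f x) (!fns2)

(* Hypothetical callbacks, one per registry. *)
val total = ref 0
val count = ref 0
val () = register1 (fn x => total := !total + x)
val () = register2 (fn _ => count := !count + 1)

val () = runAll1 5   (* runs only the callback stored in fns1 *)
val () = runAll2 5   (* runs only the callback stored in fns2 *)
```

At run time the two registries stay separate; the merging only affects how
the call site is compiled, not which functions are actually invoked.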

There are ways of improving the precision, but they add analysis time 
and did not demonstrate much improvement; see the 
Cejtin/Jagannathan/Weeks ESOP 2000 paper.

> I'm trying to wrap the standard C idiom of 'void registercb(const char* 
> name, void (*cb)(void* uarg, ...), void* uarg);'. My best idea so far is 
> to use the 'uarg' as a word that is the index into some SML-side vector 
> of callback functions. eg:
>> local
>>   val fns = GrowingVector.empty
>>   fun runOne (id, x) = GrowingVector.sub (fns, Word.toInt id) x
>>   val () = _export "mlton_lib_ufnhook": (word * ... -> unit) -> unit;
>>            runOne
>>   val runOne_addr = _address "mlton_lib_ufnhook" : MLton.Pointer.t;
>>   val Cregistercb =
>>     _import "registercb" : string * MLton.Pointer.t * word -> unit;
>> in
>>   fun registercb (name, f) =
>>     Cregistercb (name, runOne_addr, Word.fromInt (GrowingVector.insert f))
>> end
> Is there a better way? Can MLton (still) recognize that only 
> registercb'd methods need be in the switch?

I think that this is the best solution currently known.  As noted above, 
MLton will only merge escaping functions of the same type.  If the only 
way for a function of type 'word * int -> unit' to escape is to be 
passed to 'registercb', then you'll get a switch of exactly those.  But, 
if you have different ways of registering functions with the same type, 
you'll get a switch over all of them.
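
For reference, the SML side of the key-lookup idea can be sketched without
any FFI, so that it runs under any SML implementation. This is only a sketch
under assumptions: the callback type is fixed at int -> unit, the table is
append-only, and the names table, next, register and runOne are hypothetical.
In a real binding, runOne would be the single function handed to _export and
the returned word would travel through the C side as 'uarg'.

```sml
(* Callback table keyed by a word that stands in for the C-side 'uarg'. *)
type callback = int -> unit

val table : callback array ref = ref (Array.array (4, fn _ => ()))
val next = ref 0

(* Store f and return its key; double the table when it is full. *)
fun register (f : callback) : word =
  let
    val n = !next
    val () =
      if n < Array.length (!table) then ()
      else
        let
          val old = !table
          val bigger = Array.array (2 * Array.length old, fn _ => ())
        in
          Array.copy {src = old, dst = bigger, di = 0};
          table := bigger
        end
  in
    Array.update (!table, n, f);
    next := n + 1;
    Word.fromInt n
  end

(* The single entry point: look the callback up by key and apply it. *)
fun runOne (id : word, x : int) : unit =
  Array.sub (!table, Word.toInt id) x
```

Since 'register' is the only place where a callback escapes, this keeps to
the single-registration-path case described above.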

> This idiom appears in enough C libraries that we should really have a 
> good solution for this in the FFI section of the wiki. If a good 
> solution is relatively complex, perhaps we should offer a small library 
> (I'll volunteer to write it). For my scenario, it's quite important that 
> this be as fast as possible---callbacks are invoked inside a tight loop, 
> and the callbacks themselves are very simple.

It's hard to come up with a general library, since the types of the 
callback functions often change from C library to C library.

To complicate matters, some C callback interfaces don't give you a void* 
on which to hang callback-specific data; I think John Reppy has mentioned 
that some OpenGL bindings are of that form.  For that, we'd like to 
eventually support an "indirect export" mechanism:

   _export * : (int -> unit) -> MLton.Pointer.t;

which, upon each execution, would allocate a new code stub (not in the 
ML heap, since we can't move the code pointer after giving it to C) for 
calling the provided ML function.  Under the hood, it would work very 
much like your solution, using some sort of key lookup, but the key 
would be written directly into the code stub.

> Also, what's the best known way to implement GrowingVector? I've been using
>> datatype 'a used = FREE of int | USED of 'a
>> type 'a t = { free: int, buf: 'a used array }
> where buf doubles in size when free = ~1. This isn't such a big deal, 
> since 'registercb' is rarely invoked compared to 'runOne'. However, I'd 
> like to know a better solution.

That's pretty close to the resizeable-array.fun implementation in the 
MLton sources.
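
A rough sketch of the free-list design described in the question, in plain
SML. It is not the resizeable-array.fun code; the ref-based record and the
names slot, chain, empty, grow, insert, sub and remove are assumptions made
for illustration. Free slots form a linked list through their FREE indices,
and the buffer doubles when the list runs out (free = ~1).

```sml
datatype 'a slot = FREE of int | USED of 'a

(* 'free' is the index of the first free slot, or ~1 when none is left. *)
type 'a t = {free : int ref, buf : 'a slot array ref}

(* Slot i of an n-slot buffer points at slot i + 1; the last one ends the list. *)
fun chain n i = FREE (if i = n - 1 then ~1 else i + 1)

fun empty () : 'a t = {free = ref 0, buf = ref (Array.tabulate (4, chain 4))}

(* Double the buffer, keep the old slots, and chain up the new ones. *)
fun grow ({free, buf} : 'a t) =
  let
    val old = !buf
    val n = Array.length old
    val bigger =
      Array.tabulate
        (2 * n, fn i => if i < n then Array.sub (old, i) else chain (2 * n) i)
  in
    buf := bigger;
    free := n
  end

(* Take the first free slot, growing first if there is none; return its index. *)
fun insert (v : 'a t, x : 'a) : int =
  let
    val {free, buf} = v
    val () = if !free = ~1 then grow v else ()
    val i = !free
  in
    case Array.sub (!buf, i) of
        FREE nxt => (free := nxt; Array.update (!buf, i, USED x); i)
      | USED _ => raise Fail "corrupt free list"
  end

fun sub ({buf, ...} : 'a t, i) =
  case Array.sub (!buf, i) of
      USED x => x
    | FREE _ => raise Subscript

(* Put slot i back at the head of the free list. *)
fun remove ({free, buf} : 'a t, i) =
  case Array.sub (!buf, i) of
      USED _ => (Array.update (!buf, i, FREE (!free)); free := i)
    | FREE _ => raise Subscript
```

Lookup stays a single array read plus a constructor check, which matters
more here than the cost of insert, given that registration is rare.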