profiling

Fri, 18 Jan 2002 17:45:31 -0500 (EST)

I finished updating the profiling information in the docs.
I think the MLton.Profile structure is nice, but we can probably do
better.  Here's an ambitious signature:

signature MLTON_PROFILE =
   sig
      val profile: bool

      type data

      val equals: data * data -> bool
      val get: unit -> data
      val new: unit -> data
      val reset: data -> unit
      val set: data -> unit
      val write: data * string -> unit
   end

Here's the vision: data is really an int array.  We calculate the size of
the array by externing the card variable in prof.c.  (I thought we could
actually calculate it just using nullary _ffi's, but the semantics of 
_ffi "x": int is to fetch the integer at label x, not to fetch the address
of label x.  Hacking a new _addr variant could accomplish this, but I
don't see much other use for it.)  Now, equals, reset and write can be
pure ML functions, just working on the array.  Atomicity is tricky, but I
don't think it's a big deal.

That leaves get and set.  Obviously, we need to have the timer handler be
a C function, not an ML function, because we don't want to wait for the ML
signal handler to be invoked.  That means we need to register the array
with the profiler -- scary, because the array is going to move around
during GCs.  One solution would be to ensure that the array the profiler
touches is global; hence a pointer to it will be in GC_state.globals.  If
we can get the address of that pointer (???) and pass that to the
profiler startup code, then we should always be able to get to the array
via a dereference.  (globals[n] will always point to the array, so
globals[n][pc] should be the slot we want to update.  We pass
&(globals[n]) to the profiler as buffP, so (*buffP)[pc] is the slot to
update.)
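
To make the double indirection concrete, here is a rough C sketch of the
idea, not the real prof.c.  The globals table, CARD, PROFILE_SLOT,
profiler_start, profiler_tick and fake_gc_move are all made-up stand-ins:
globals plays the role of GC_state.globals, and fake_gc_move imitates a
copying collector that relocates the array and fixes up the global slot.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef unsigned int uint;

/* Hypothetical sizes; the real count would come from card in prof.c. */
enum { CARD = 8, NUM_GLOBALS = 4, PROFILE_SLOT = 2 };

/* Stand-in for GC_state.globals: slots the collector rewrites on a move. */
uint *globals[NUM_GLOBALS];

/* The profiler keeps the address of the slot, never the array itself. */
static uint **buffP = NULL;

void profiler_start(uint **slot)
{
    assert(slot != NULL && *slot != NULL);
    buffP = slot;
}

/* What the C timer handler would do for a sample at index pc. */
void profiler_tick(size_t pc)
{
    (*buffP)[pc]++;
}

/* Imitate a copying GC: move the array, then update the global slot. */
void fake_gc_move(void)
{
    uint *to = malloc(CARD * sizeof *to);
    if (to == NULL)
        abort();
    memcpy(to, globals[PROFILE_SLOT], CARD * sizeof *to);
    free(globals[PROFILE_SLOT]);
    globals[PROFILE_SLOT] = to;
}
```

Calling profiler_start(&globals[PROFILE_SLOT]) once, then ticking before
and after fake_gc_move, lands both counts in the relocated array, because
the handler re-reads the slot on every sample.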

Better would be to globalize an int array ref, but _ffi's can't have that
type.  Although, I don't see any obvious problems with:

_ffi "name": ty;
ty ::= u | t * ... * t -> u
t ::= u | t array | t ref | t vector
u ::= bool | char | int | real | string | unit | word | word8

(We don't currently allow recursion in t's.)

Then, I think we could give prof.c a function

static uint **buffP = NULL;
void setBuff(uint ***p)
{  buffP = *p; }

Now, in the basis library implementation of MLton.Profile we have (I'm
mixing in some stuff that would be in primitive.sml; and some of these
should be predicated on the profile boolean):

val getCard = _ffi "MLton_Profile_getCard": unit -> int;
val card = getCard ()

fun new () = Array.array(card, 0)
val dataR = ref (new ())
val dataRR = ref dataR
val setBuff = _ffi "MLton_Profile_setBuff": int array ref ref -> unit;
val _ = setBuff dataRR

val set = fn data => dataR := data
val get = fn () => !dataR

(Only trickery I see here is that we'd have to prevent localRef from
deglobalizing dataRR, but we could do that just by looking for
the MLton_Profile_setBuff FFI primitive.)
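
For what it's worth, here is a small C model of why set needs no call
back into the profiler.  It assumes (and this is only an assumption about
the representation) that a ref cell is a one-word object holding a pointer
to its contents.  The struct names, set_buff, tick, data_set and data_get
are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned int uint;

/* Assumed layout of dataR : int array ref (one pointer field). */
typedef struct { uint *contents; } array_ref;

/* Assumed layout of dataRR : int array ref ref. */
typedef struct { array_ref *contents; } array_ref_ref;

static uint **buffP = NULL;

/* Like setBuff above: keep the address of the inner cell's field. */
void set_buff(array_ref_ref *rr)
{
    assert(rr != NULL && rr->contents != NULL);
    buffP = &rr->contents->contents;
}

/* Timer handler body: bump the slot in whatever array is current. */
void tick(size_t pc)
{
    (*buffP)[pc]++;
}

/* ML-side set and get are just a write and a read of the inner cell. */
void data_set(array_ref *r, uint *data) { r->contents = data; }
uint *data_get(const array_ref *r) { return r->contents; }
```

After set_buff runs once, data_set alone redirects later ticks to the new
array; the old array keeps whatever counts it had.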

Anyways, maybe this needs a little more thought.  What really bugs me
about the current interface is that I can't easily profile a function that
is called multiple times.  What I really want to be able to write is a
function like:

val profBatch: string * ('a -> 'b) -> (('a -> 'b) * (unit -> unit))

So I can sum up over all executions of a function.  Right now, I'd have to
settle for:

val profBatch: string * ('a -> 'b) -> ('a -> 'b) =
   fn (name, f) =>
   let
      val r = ref 0
   in
      fn a =>
      let
        val _ = r := !r + 1
        val _ = MLton.Profile.reset ()
        val b = f a
        (* one output file per call: name1, name2, ... *)
        val _ = MLton.Profile.write (name ^ (Int.toString (!r)))
      in
        b
      end
   end

and run mlprof on all of the mlmon.out* files.