[MLton] Question on profile.fun

Stephen Weeks MLton@mlton.org
Wed, 1 Jun 2005 11:24:56 -0700


> You believe that the move of a constant integer to a known slot in
> the gc state at transitions in the profile graph is too intrusive?

Yes.  The point is that the move happens all the time, not just at
(SSA) nontail calls, and not just at (SSA) basic block entries.
Furthermore, to implement this portably within MLton, the right place
to put it is at the Machine IL, which means it will interfere with
codegen optimizations too.  I bet it'll hurt more than 50% on some
benchmarks.  That's a lot of skew.  I'm already annoyed by the skew
that we get with -profile time as it is, 20-30% on some benchmarks
IIRC, although it would be worth rerunning to see where we are today.

To me, the approach is at odds with the principle of not interfering
with what you're measuring.  It seems well worth a few lines of
platform-specific code to get more reliable time profiling results.
In fact, I feel more comfortable with the reliability of time
profiling across platforms with the less intrusive approach, even
though the code is platform specific.  We don't have to prove that
we're not interfering on every platform.

In any case, you're welcome to try the experiment.  It would be
interesting to know.

> It would seem to simplify GC_handleSigProf down to:
...
> which would appear to reduce the time handling a profile signal.

The time to handle the signal doesn't matter (well, within reason: you
can't walk the whole control stack).  The handler runs only 100 times
per second.  That's why it makes more sense to put work there rather
than every few instructions.
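
A back-of-the-envelope sketch of that trade-off, in Python.  Only the
100 signals per second comes from the discussion above; the handler
cost, the transition rate, and the per-store cost are made-up numbers
for illustration, and the helper name is hypothetical.  It also ignores
any codegen optimizations lost to the extra stores, so if anything it
understates the cost of the store-based approach.

```python
def per_second_overhead(events_per_second: float, cost_per_event_ns: float) -> float:
    """Fraction of each second of run time spent on profiling work."""
    return events_per_second * cost_per_event_ns * 1e-9


# Rate from the message above: the profile signal fires 100 times per second.
SIGNALS_PER_SECOND = 100

# Everything below is an assumption, chosen only to show the orders of magnitude.
HANDLER_COST_NS = 10_000        # assumed: a deliberately slow 10 microsecond handler
TRANSITIONS_PER_SECOND = 200e6  # assumed: one store every few instructions
STORE_COST_NS = 1               # assumed: 1 ns per store to the gc state slot

handler_overhead = per_second_overhead(SIGNALS_PER_SECOND, HANDLER_COST_NS)
store_overhead = per_second_overhead(TRANSITIONS_PER_SECOND, STORE_COST_NS)

print(f"work in signal handler: {handler_overhead:.1%} of run time")
print(f"store at transitions:   {store_overhead:.1%} of run time")
```

Under these assumptions the handler accounts for roughly 0.1% of run
time and the stores for roughly 20%, so even a handler that is many
times slower still perturbs the program far less than a cheap store on
the hot path.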