[MLton] Question on profile.fun
Matthew Fluet
fluet@cs.cornell.edu
Wed, 1 Jun 2005 14:37:15 -0400 (EDT)
> > You believe that the move of a constant integer to a known slot in
> > the gc state at transitions in the profile graph is too intrusive?
>
> Yes. The point is that it happens all the time, not just at (SSA)
> nontail calls, and not just at (SSA) basic block entries.
> Furthermore, to implement this portably within MLton, the right place
> to put it is at the Machine IL, which means it will interfere with
> codegen optimizations too. I bet it'll hurt more than 50% on some
> benchmarks. That's a lot of skew. I'm already annoyed by the skew
> that we get with -profile time as it is, 20-30% on some benchmarks
> IIRC, although it would be worth rerunning to see where we are today.
Well, I added the -profile mark option, which essentially adds the time
profiling labels but doesn't install the signal handler (or produce an
mlmon.out file). I ran that through the benchmarks, and I believe that
it's still 20-30% overhead. So, I think the overhead comes more from
failing to optimize in the presence of profiling statements than from
gathering the profile data itself.