space profiling

Stephen Weeks MLton@sourcelight.com
Mon, 27 Aug 2001 15:32:56 -0700


> No comment on my suggestion for heap profiling?

I just took a look.  I am trying to sort through the many issues in my head.
Here are some thoughts.

* I like your idea for inserting instructions so that the code generator doesn't
  have to know anything about how profiling works or how whatever statements get
  inserted muck with registers.  I think that is the way to go and should be
  applicable to any scheme, if needed.

* I like the idea of separating the constraints on in-memory representation of
  profiling information from the constraints on file format.

* I like the idea of using the same sparse, unsorted, file format both for space
  and time. 

* I agree that for time profiling, we should stick with an array of 32 bit bins,
  one bin per byte of code.

* I think that for space profiling, we should use an array of 96 bit bins, one
  bin *per allocation point*, not per byte of executable.  The bin stores the 32
  bit code address and the 64 bit counter.  Each allocation point is assigned a
  unique integer.  With this approach, we might even be able to use Matthew's
  idea for one increment instruction to do the bump (the table and the unique
  integer are compile/link time constants -- the only unknown at runtime is
  the allocation amount, and then only in the case of arrays).  The number of
  allocation points, even for a self-compile, is small enough to make this
  approach fine.  I just measured and there are 55,651 allocations and 1,922
  array allocations in a self compile.

* I am not willing to give up a run time factor of four for space profiling, and
  probably not even a factor of 2.  I would like to continue brainstorming until
  we find an approach that we are all happy with that has hopefully no more than
  a 10% or 20% hit.  Maybe what I suggested above meets this?