I forgot to mention that the profiling code uses roughly 4 bytes of memory for each byte of code space. So if you try to profile mlton itself, which has about 4 meg of code, it will allocate about 16 meg for the table of bins. The default max-heap calculation doesn't take this allocation into account, so on a small machine you will probably start thrashing. I don't think that this is a big deal, but ...
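
To make the arithmetic concrete, here is a rough back-of-the-envelope sketch. It only restates the 4-bytes-per-code-byte figure from above. The names (`BYTES_PER_CODE_BYTE`, `bin_table_bytes`) are made up for illustration and are not taken from the MLton runtime, and "meg" is assumed to mean 2^20 bytes.

```python
# Rough estimate only: the 4x factor is "more or less", per the note above.
# All names here are hypothetical, not MLton's.
BYTES_PER_CODE_BYTE = 4
MEG = 1024 * 1024  # assumption: "meg" means 2**20 bytes


def bin_table_bytes(code_bytes: int) -> int:
    """Approximate size of the profiling bin table for a given code size."""
    return code_bytes * BYTES_PER_CODE_BYTE


if __name__ == "__main__":
    code_size = 4 * MEG  # roughly the size of mlton's own code
    table = bin_table_bytes(code_size)
    print(f"{table // MEG} meg for the table of bins")
```

So when profiling on a small machine, it is probably worth budgeting about four times the code size on top of whatever the heap is allowed to grow to.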