Type-Preserving Garbage Collectors
Daniel Wang
danwang@CS.Princeton.EDU
25 Jan 2001 17:34:18 -0500
"Stephen Weeks" <sweeks@intertrust.com> writes:
> Hi Dan. Henry mentioned your POPL talk to me, so I read through your paper --
> it was very interesting. Your paper mentions possible problems with code bloat
> due to specialized gc code per type. Also, Henry mentioned to me the possible
> problem of icache pressure in the gc code with many different object types.
> Just for fun, I instrumented MLton's runtime to count the number of different
> object headers seen. I was curious to know for a large program like a self
> compile how many there are.
>
> Here are the measurements for the CPS syntax tree that is produced after all
> optimization during a self compile. The header column gives the object header
> and the count column gives the number of objects with that header.
>
> header count
> -------- ------
> 00000001 302302
> 00020000 3
> 00008000 13035
> 80000001 859615
> 80000002 943930
> 80000003 285951
> 80000004 1437
> 80000005 5026
> 80000006 820
> 80000007 2503
> 80000008 1
> 8000000a 1002
> 80008000 90164
> 80008001 312513
> 80008002 530275
> 80008003 555661
> 80008004 45095
> 80008005 269
> 8000800a 387
> 80010000 1717
> 80010001 67396
> 80010002 4134
> 80018000 1
> 80038000 1
>
> The upshot is that there aren't that many distinct headers. There may be more
> object types, because the object type has to recursively describe the object
> types that its pointers point to. But my guess is that isn't too bad either,
> especially if you make the only the distinctions that are necessary for the gc,
> i.e. between pointers and nonpointers.
Hey, this is nice fodder for my thesis... thanks... There are some tricks
one can play to reduce the number of distinct object types so, i think
you're right that this is perhaps less of an issue than it seems in practice...
> Anyways, I was curious if you had made any progress in porting your stuff to a
> later version of MLton and making it work on larger programs.
Haven't had time to try.. I tried a while, back and noticed that most of the
changes were reording of code in different places... but the one change that
confused me was the removal of polymorphic primitive operators. My code
relied on them to do something dirty, and I didn't have time to think about
how to fix it. I'm still trying to spend most of my time cranking out TeX
rather than new numbers... but when I get back into SML hacking mode
... I'll let you know
> Also, I have one minor correction for the paper. MLton uses a standard
> *breadth* first Cheney copying collector, not *depth* first as it says in the
> paper.
Really, I must have misunderstood the code at some point... I'll make sure
hmmm... the BFS/DFS difference might explain some more cases where my safe
collector is faster, becuase of locality issues...