[MLton] gc question
Matthew Fluet
fluet at tti-c.org
Thu Jan 18 23:22:11 PST 2007
> I am looking into implementing the Cheng-Blelloch garbage collector for
> MLton (as part of my grad school work).
Great.
> I am confused about how register allocation and spilling in a codegen
> work with the stack frame layouts used by the gc, and with stack limit
> checking. The frame layouts don't include space for spilled registers,
> but this is OK since nothing in a register is live at gc calls, so the
> spilled registers don't need to be traced--is that right?
Correct.
> How do the
> stack limit checks account for the stack space used for spilled
> registers?
There is no need to, since spilled registers don't get spilled to the
(ML) stack. The input to the codegens is a Machine IL program. Such a
program has effectively two forms of function local operands: stack
slots and (pseudo-)registers. The Machine IL register allocator
partitions all of the temporaries in a Machine IL function into these
two sorts of operands; as you note above, stack slots correspond to
temporaries that are live over a gc point, while pseudo-regs correspond
to temporaries that are never live over a gc point.
The C codegen sees the stack operands (on the ML stack) as memory
locations, so read/write/move with these operands is a memory
read/write/move. The C codegen sees the pseudo-reg operands as local
variables of the C function which implements a particular Machine IL
function. It is certainly the case that we may have more pseudo-regs
live at a program point than there are physical registers. The C
compiler simply spills to the C stack, so the space for the spilled
registers is added to the size of the C function frame. Since we use a
trampoline mechanism for switching between C functions, there is only
ever one C function on the C stack at a time and we can effectively
handle arbitrary amount of spilling to the C stack.
The native x86 codegen could use the same trick: spill to the top of the
C stack and recover the spill space at every inter-Machine IL function
transfer. We're guaranteed that no spilled pseudo-reg is live at such a
transfer. However, the native codegen uses a slightly different trick,
which is to note that a single threaded execution means that only one
Machine IL function is ever executed at a time, so all Machine IL
functions can share the same global spill space. The size of this spill
space is bounded by the number of pseudo-regs in any Machine IL function
(plus a few extra slots to handle primitives that need a couple of extra
temporaries that may need to be spilled). We'll need to go to spilling
to the top of the C stack when we support multiple ML threads executing
at the same time.
> The Cheng-Blelloch collector is incremental and uses a write barrier on
> heap writes. I'm trying to see in what MLton pass it would be best to
> add the write barrier; it seems like the Machine IL has all the writes
> explicitly, and perhaps the heap writes could be determined from the
> operands here (or perhaps more information is needed from earlier
> passes). Any thoughts on this?
The Machine IL or the RSSA IL is definitely the place to insert the
write barrier. There is a form of write barrier for the card marking
when the generational collector is being used; this is inserted in the
SSA-to-RSSA pass. As with card marking, you may benefit from MLton's
accurate type information which ensures that you know (at compile time)
whether or not a word sized data element is a pointer or not.
More information about the MLton
mailing list