I'm implementing mark-and-sweep garbage collection in a simple scripting language API I'm working on and have been reading about various implementations of garbage collection. API's such as Lua use mark-and-sweep with white, gray and black lists.
The problem is, I can't seem to find explanation of why there are such lists and why they are put into these specific colours.
In my current, trivial implementation, I simply use "dead" or "alive" states. In the sweep, the dead objects are deleted. I'm using the native heap so I'm not doing any moving in my GC.
I'm writing it in C.
// Performs a full garbage collection
void GorCollect(GorContext *ctx)
{
Value *v, *end;
Collectable *c, *n;
// mark stack references
end = ctx->stack + ctx->stackTop + 1;
v = ctx->stack;
while(v != end)
{
if (gvisgc(v) && v->v.gc) // mark if a collectable obj
Mark(v->v.gc);
v = v++;
}
// mark global references
if (ctx->global)
Mark((Collectable *)ctx->global); // ctx->global is a collectable obj
// perform sweep
c = ctx->gchead; // full list of collectable objs
ctx->gchead = 0;
while(c) {
n = c->next;
// destroy unmarked collectable
if (!c->marked)
FreeCollectable(ctx, c);
// rebuild gc list (singly-linked)
else
{
c->marked = 0;
if (!ctx->gchead)
c->next = 0;
else
c->next = ctx->gchead;
ctx->gchead = c;
}
c = n;
}
}
Gray means "live but not scanned": not all the descendants of a gray block have been marked as black yet. The gray color is necessary in an incremental garbage-collector. It helps a mark-and-sweep GC continue what it was doing the next time it gets a chance to do a bit of marking.
If your GC is non-incremental, then it may look to you like you don't necessarily need the gray color: you can simply always recurse on all the children of any live block you encounter. However, another more subtle issue is that this naive non-incremental recursive algorithm may overflow the stack. A gray color also helps by allowing to store information about what to visit next in the heap, instead of the stack.
Even if you use the gray color for this purpose, it doesn't prevent you from keeping a buffer of blocks that remain to be visited for efficiency. The only difference with the naive recursive implementation is that you stop updating the buffer when it's already full, and you scan the heap linearly for gray objects if the buffer becomes empty after having been full.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With