This is the mail archive of the
guile@sourceware.cygnus.com
mailing list for the Guile project.
Re: GC and threads (was Re: generational GC)
> Every traversal algorithm divides nodes into three classes:
> 1) Nodes we've visited and dealt with. Call these "black" nodes.
> 2) Nodes we've not visited, or seen references to. Call these "white"
> nodes. At the end of the traversal, any nodes that are still white
> are garbage.
> 3) Nodes we've found pointers to, but haven't yet visited. Call these
> "grey" nodes. For example, if you're using a mark stack, the nodes
> on the stack are grey.
>
> I don't understand the distinction between grey and black here. In
> what sense can we have found a pointer but not "visited" the
> corresponding node?
I'm not sure of the best way to describe what "grey" means. But every
algorithm for traversing a graph has some set of waiting nodes. A
recursive traversal has the stack; a breadth-first traversal has the
queue. The classic Cheney copying algorithm has those cells that have
been copied to to-space but not scanned. In each algorithm, the
waiting cells are the grey cells.
> Is this perhaps an implementation idea to avoid marking loops?
No. That's dealt with using mark bits, or address range tests for tospace.
> In this terminology, your strategy would be described as "allocate
> black." That is, new nodes are assumed to be live, and are left alone
> until the next GC cycle begins, at which point everything is colored
> white.
>
> The problem with allocating black is that the new cells can't be
> collected until the end of the *next* GC cycle, even though they're
> the most likely to die quickly.
>
> Right. So the basic tradeoff here is
>
> (memory loss by not collecting the set of cells that are most likely
> to have died young)
>
> versus
>
> (performance advantages of doing GC asynchronously on a dedicated
> thread rather than "in-line" on program threads).
If you're assuming that one must allocate black if one GC's on a
dedicated thread, that's not true. You can GC in a dedicated thread,
but still allocate white sometimes.
> That would be very cool, but, if GC is asynchronous, I don't see how
> it can be implemented efficiently, i.e. without requiring a mutex
> acquisition inside SCM_NEWCELL.
I'm not convinced that grabbing a mutex is that much more expensive
than grabbing thread-specific data. Either one involves some kind of
call to a thread library routine.