This is the mail archive of the guile@sourceware.cygnus.com mailing list for the Guile project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]

Re: GC and threads (was Re: generational GC)

To: Neil Jerram <neil at ossau dot uklinux dot net>
Subject: Re: GC and threads (was Re: generational GC)
From: Jim Blandy <jimb at red-bean dot com>
Date: 24 Mar 2000 10:41:16 -0500
Cc: mlivshin at bigfoot dot com, guile at sourceware dot cygnus dot com, Greg dot Harvey at thezone dot net
References: <s3og87v1fm.fsf@verisity.com> <200003221448.OAA15850@ossau> <m3n1npqkih.fsf@savonarola.red-bean.com> <200003241055.KAA00777@ossau>


>    Every traversal algorithm divides nodes into three classes:
>    1) Nodes we've visited and dealt with.  Call these "black" nodes.
>    2) Nodes we've not visited, or seen references to.  Call these "white" 
>       nodes.  At the end of the traversal, any nodes that are still white
>       are garbage.
>    3) Nodes we've found pointers to, but haven't yet visited.  Call these
>       "grey" nodes.  For example, if you're using a mark stack, the nodes
>       on the stack are grey.
> 
> I don't understand the distinction between grey and black here.  In
> what sense can we have found a pointer but not "visited" the
> corresponding node?

I'm not sure of the best way to describe what "grey" means.  But every
algorithm for traversing a graph has some set of waiting nodes.  A
recursive traversal has the stack; a breadth-first traversal has the
queue.  The classic Cheney copying algorithm has those cells that have
been copied to to-space but not scanned.  In each algorithm, the
waiting cells are the grey cells.

> Is this perhaps an implementation idea to avoid marking loops?

No.  That's dealt with using mark bits, or address range tests for tospace.

>    In this terminology, your strategy would be described as "allocate
>    black."  That is, new nodes are assumed to be live, and are left alone
>    until the next GC cycle begins, at which point everything is colored
>    white.
> 
>    The problem with allocating black is that the new cells can't be
>    collected until the end of the *next* GC cycle, even though they're
>    the most likely to die quickly.
> 
> Right.  So the basic tradeoff here is
> 
> (memory loss by not collecting the set of cells that are most likely
> to have died young)
> 
> versus
> 
> (performance advantages of doing GC asynchronously on a dedicated
> thread rather than "in-line" on program threads).

If you're assuming that one must allocate black if one GC's on a
dedicated thread, that's not true.  You can GC in a dedicated thread,
but still allocate white sometimes.


> That would be very cool, but, if GC is asynchronous, I don't see how
> it can be implemented efficiently, i.e. without requiring a mutex
> acquisition inside SCM_NEWCELL.

I'm not convinced that grabbing a mutex is that much more expensive
than grabbing thread-specific data.  Either one involves some kind of
call to a thread library routine.

References:
- generational GC
  - From: Michael Livshin
- GC and threads (was Re: generational GC)
  - From: Neil Jerram
- Re: GC and threads (was Re: generational GC)
  - From: Jim Blandy
- Re: GC and threads (was Re: generational GC)
  - From: Neil Jerram

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]