This is the mail archive of the guile@cygnus.com mailing list for the guile project.
Jim Blandy <jimb@red-bean.com> writes:

> > > This will likely depend upon memory protection services in the host
> > > os, but any os that doesn't provide this probably isn't running
> > > guile in the first place.
> >
> > This is bad news for us, if it catches on.
> >
> > We are putting guile to work in our router product, and that has no
> > fancy memory protection. Don't forget embedded systems, even if we are
> > only a few (one?) currently.
>
> Here's an excerpt from a message I just sent Greg Harvey. I have some
> tentative problems with using memory protection:
>
> -----
>
> I actually think that memory protection will be much slower than doing
> it all in software. Here are the steps involved in handling a page
> fault:
>
> 1) The kernel gets the page fault, puts together a signal context,
>    and invokes the handler.
> 2) The handler does the bookkeeping which is the real point of the
>    exercise.
> 3) Now the handler has to actually do the write. In order for
>    that write to succeed without causing yet another page fault,
>    you have to somehow:
>    a) unprotect the page
>    b) do the write
>    c) re-protect the page, to catch the next write
>
> So you're talking signal handling overhead, and at least two system
> calls. Notice also that the handler can't really find the SCM object
> being modified; it only gets the address being written to, and
> probably the value being written to it. But you can't find the cell
> at the head of the vector, given the address of some word in the
> malloc'ed array of elements.
>
> Suppose we tell people that they must call SCM_WRITE_BARRIER (obj,
> new) after changing any field in OBJ to point to NEW. So, for
> example:
>
>     SCM_VELTS (vector)[5] = some_object;
>     scm_write_barrier (vector, some_object);
>
> The thing is, they *know* when they write the code that they're going
> to do a write, so they might as well call the write barrier directly.
> Then you've got function call overhead (which is less than signal
> dispatch overhead) and bookkeeping. No system calls. And you also
> get the information you actually want: the object being modified. So
> you can check the generations of the objects, and if vector is older
> than some_object, then you stick vector in a list of objects to be
> used as roots when collecting younger generations.
>
> The only weirdness here is: what if someone doesn't know they're
> mutating an object? What if they've just got a pointer to an SCM, and
> they're storing something in it? What if they've just got a void *,
> and they're copying bytes from another void * into it? I think the
> answer here is that whoever gave them that pointer must have had the
> real SCM, and it's the SCM holder's job to make sure the write
> barrier gets called appropriately. Thus:
>
>     frob (SCM_CDRLOC (pair));
>     scm_write_barrier (pair, SCM_NEWEST);
>
> SCM_NEWEST would be a magic value which tells the write barrier, "I
> don't know what I stored; I may or may not have pointers to younger
> objects now."

Here are the relevant bits from my response (dueling emails ;).

[on the performance]

Actually, I've been thinking that the best way to deal with this on the
heap level is to only trap on one fault, set a dirty bit, and unprotect
the memory (I'm not even sure it's possible to get the value we were
trying to write without getting really gross; I'd just assumed it was
possible when I wrote the notes). It does mean scanning through the
info on that heap when collecting, but since the new heap segments are
going to be very small, it's a short romp through an array with bit
checking, really. It still loses on user objects, though. In truth, as
I've started actual work on the gc, I'm much less happy with using a
write barrier, and it might not be necessary (see below).
[later, on the write barrier in general]

The thing I don't like about this is that it starts putting the same
sort of requirements on C authors as keeping track of roots by hand,
and it requires the programmer to know and understand the details of
the gc intimately. In general, I feel the gc should be non-obtrusive
if there's another way.

I'm starting to lean more and more towards doing a full scan (which
would be going in as an option anyway, for systems where memory
protection doesn't exist or doesn't work the way it should),
particularly since my work so far with the current gc showed a
dramatic speed increase just from reducing the size of heap segments.
(The reported gc-time is 2.5 to 5 times smaller on startup; the
improvement shrinks once you get into loading a config file, since at
that point it ends up with as much sweeping work as the current
collector, though it never lost the initial benefits on the stress
tests I threw at it. It used more memory, though.)

If you toss in sweeping only a small portion of the existing segments
on most gcs, it seems less and less likely to me that the additional
memory and overhead of keeping track of the intergenerational pointers
will be a great advantage over scanning the set of live objects,
because the sweep phase really seems to be where we can make up the
most ground. In fact, with the improvements I'm making to the current
gc, it might not even make a lot of sense to go completely
generational at all; instead, age the segments according to how many
times an object has survived collection there, to give the gc a few
more hints on which segments to sweep.

-- Greg