This is the mail archive of the guile@cygnus.com mailing list for the guile project.



unexec -> shared lib


Here's a plan I'm in the process of devising for combining the best of
shared libraries and unexec.  Comments/suggestions/flames are
welcome.  My system uses ELF binaries, so the discussion is admittedly
ELF-biased.

As emacs is born of temacs, so shall liburf.so spring from libturf.so.
(urf stands for "unexecked, relocatable fur"; fur stands for
"flexible, ultranifty rfu"; rfu stands for "really fast urf")

Our program which loads and dumps libturf.so shall be called, umm,
Turf ("[T]ool for making [URF]s").  Turf requires of libturf.so an
entry point: a function which will initialize the structures to be
frozen in liburf.so.  For the sake of discussion, let's call the entry
point urf_init().

So Turf starts up, dlopens libturf.so, and calls urf_init().  Our
library has initialized itself.  Problem: libturf.so's uninitialized
and initialized data segments are over here in memory, and all the
malloc'ed stuff is way over there.  Well, okay, we take note of their
locations and sizes and dump each region to a file.  Call the files
data1 and heap1, respectively.
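
The dumping step can be sketched roughly as below.  This is only a
sketch: dump_region() is a made-up helper name, and it assumes the
start address and length of the region have already been worked out
somehow (how to find them is not covered here).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper (not part of any existing tool): write len bytes
   starting at start to the file named path.
   Returns 0 on success, -1 on failure. */
int dump_region(const char *path, const void *start, size_t len)
{
    assert(path != NULL);

    FILE *f = fopen(path, "wb");
    if (f == NULL)
        return -1;

    size_t written = fwrite(start, 1, len, f);
    int close_status = fclose(f);

    return (written == len && close_status == 0) ? 0 : -1;
}
```

Turf would call this once for the data segments (giving data1) and
once for the heap (giving heap1).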

Now comes the beauteous part.  Turf runs again, this time loading
libturf.so at a *different address* (it could load another shared
library before libturf.so, for example).  It also moves the heap by
(for simplicity's sake) the same offset by malloc'ing (or, more
likely, sbrk'ing) a chunk of never-to-be-used memory.  Then it calls
urf_init() and dumps as before, this time naming the files data2 and
heap2.
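
Moving the heap might look something like the following.  It is a
sketch under assumptions: shift_heap() is a hypothetical name, the
system is assumed to provide sbrk() (as Linux/glibc does), and the
allocator is assumed to take its memory from the program break.  An
allocator that gets memory some other way (mmap, say) would not be
shifted by this.

```c
#define _DEFAULT_SOURCE  /* assumption: glibc; exposes sbrk() */
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: raise the program break by delta bytes so that
   later break-based allocations land delta bytes higher than they did
   in Run 1.  The reserved gap is never used.
   Returns 0 on success, -1 on failure. */
int shift_heap(intptr_t delta)
{
    assert(delta > 0);

    void *old_break = sbrk(delta);
    return (old_break == (void *)-1) ? -1 : 0;
}
```

Run 2 would call shift_heap(delta) before calling urf_init(), with the
same delta that separates the two load addresses of libturf.so.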

(For this to work, urf_init() must behave identically on every run
apart from memory addresses: it should not store anything which might
change from one run to the next, such as the time of day.)

(No doubt the difference between the two load addresses must be chosen
with care.  Emacs, for example, seems to allocate some things on 64K
boundaries; at any rate, I'm having less trouble with emacs when delta
is a multiple of 64K than when it isn't.)

Now, if all goes according to plan, data1 and data2 have the same file
size, as do heap1 and heap2.  What's more, the contents of data1
differ from those of data2 only in very specific ways, and likewise
heap1 and heap2.  The only differences are the pointers.  Wherever
sizeof(char *) bytes in data1 represent a pointer to something, the
corresponding sizeof(char *) bytes in data2 will contain the same
value shifted by the delta separating the load addresses of libturf.so
in Run 1 and Run 2.  Similarly for heap1 and heap2.
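
The comparison described above can be sketched as a word-by-word scan
of the two dumps.  The function name and signature are invented for
illustration, and the sketch assumes that pointers sit at
pointer-aligned offsets within the dump and that both regions moved by
the same delta.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Compare two dumps of equal length, one pointer-sized word at a time.
   Equal words are treated as plain data.  Words where the Run 2 value
   equals the Run 1 value plus delta are treated as pointers, and their
   byte offsets are stored in offsets[].  Any other mismatch means the
   two runs were not comparable.
   Returns the number of pointers found, or -1 on an unexplained
   mismatch or if offsets[] is too small. */
long find_pointer_offsets(const unsigned char *run1,
                          const unsigned char *run2,
                          size_t len, uintptr_t delta,
                          size_t *offsets, size_t max_offsets)
{
    size_t found = 0;

    assert(len % sizeof(uintptr_t) == 0);  /* assumption: whole words */

    for (size_t off = 0; off < len; off += sizeof(uintptr_t)) {
        uintptr_t w1, w2;
        memcpy(&w1, run1 + off, sizeof w1);
        memcpy(&w2, run2 + off, sizeof w2);

        if (w1 == w2)
            continue;                      /* ordinary data */
        if (w2 - w1 != delta || found == max_offsets)
            return -1;                     /* not a clean pointer shift */
        offsets[found++] = off;
    }
    return (long)found;
}
```

The same scan would be run once over data1/data2 and once over
heap1/heap2.  A return value of -1 is a hint that urf_init() stored
something run-dependent, or that the program transforms its pointers
before storing them.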

(This plan falls apart if programs do really wicked things like
hashing or bit-rotating the value of a pointer.  Storing flags in the
low two bits, however, should not upset things: as long as delta is a
multiple of four, a tagged pointer still shifts by exactly delta.)
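
A small illustration of why low-bit tags survive and rotation does
not.  Both function names are made up for this example; the tagging
case assumes 4-byte-aligned addresses and a delta that is a multiple
of four.

```c
#include <assert.h>
#include <stdint.h>

/* Difference between the words stored in Run 2 and Run 1 for one
   pointer, when the program ORs a 2-bit tag into the low bits. */
uintptr_t tagged_difference(uintptr_t addr, uintptr_t delta,
                            uintptr_t tag)
{
    assert((addr & 3) == 0 && (delta & 3) == 0 && tag <= 3);

    uintptr_t stored1 = addr | tag;            /* Run 1 */
    uintptr_t stored2 = (addr + delta) | tag;  /* Run 2 */
    return stored2 - stored1;                  /* always delta */
}

/* The same difference when the program rotates the pointer left by
   8 bits before storing it. */
uintptr_t rotated_difference(uintptr_t addr, uintptr_t delta)
{
    const unsigned bits = (unsigned)(sizeof(uintptr_t) * 8);
    uintptr_t moved = addr + delta;

    uintptr_t stored1 = (addr << 8) | (addr >> (bits - 8));
    uintptr_t stored2 = (moved << 8) | (moved >> (bits - 8));
    return stored2 - stored1;                  /* generally not delta */
}
```

With addr = 0x400000 and delta = 0x10000, the tagged difference is
0x10000 whatever the tag, while the rotated difference is 0x1000000,
so the dump comparison would not recognize the rotated word as a
pointer.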

Then you make an object file out of libturf.so+data1+heap1, with a
relocation at every location found to hold a pointer.  (This part
needs some fleshing out...)
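
This is not how the object file would actually be written (that part
is still open), but the effect the relocations are meant to have at
load time can be sketched as follows.  The name rebase_image() and the
idea of keeping a plain array of byte offsets are assumptions made for
illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration: given a dumped image that was recorded
   while loaded at old_base, adjust every pointer-holding word (listed
   by byte offset in offsets[]) so the image is valid at new_base. */
void rebase_image(unsigned char *image,
                  const size_t *offsets, size_t count,
                  uintptr_t old_base, uintptr_t new_base)
{
    /* Unsigned arithmetic wraps, so this also works when
       new_base is below old_base. */
    uintptr_t slide = new_base - old_base;

    assert(image != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        uintptr_t word;
        memcpy(&word, image + offsets[i], sizeof word);
        word += slide;
        memcpy(image + offsets[i], &word, sizeof word);
    }
}
```

In the real thing the dynamic linker would do this work, driven by
relocation entries in liburf.so, rather than a hand-written loop.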

I'll let folks know when I get anything resembling this to work.  It
may be a while, though, for I expect to encounter dragons along the
way.

Also, if anyone has done this or something along these lines, I'd love
to hear about it!

-John