Avoid collisions between parallel installations of Cygwin

Charles Wilson cygwin@cwilson.fastmail.fm
Wed Oct 21 16:11:00 GMT 2009

Corinna wrote:
> On Oct 21 05:59, Eric Blake wrote:
> Nice.  I created a simple shell script which can do the job.  It even
> allows to replace the cygwin1.dll in /bin on the fly.  My script name of
> choice is "enable-unique-object-names":
>   $ enable-unique-object-names
>   Usage: enable-unique-object-names [show|yes|no] [path-to-cygwin-DLL]
>   $ enable-unique-object-names show
>   Default to /bin/cygwin1.dll
>   yes
>   $ enable-unique-object-names no
>   $ enable-unique-object-names show
>   Default to /bin/cygwin1.dll
>   no
> Do you think this is sufficient for now?

Unfortunately, I don't.  First, because of alignment issues, it's almost
a given that the offset from the start of the string to the data you
want to change will be different in each build of the DLL, if it's just
a global data object.  So at minimum, the script must check the DLL
version (or even the build date/id) of the DLL and make sure they match;
it should not attempt to modify an unrecognized version...
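
A minimal sketch of that guard, assuming a hypothetical table of known
builds. The build ids ("build-A", "build-B"), the offsets, and the
one-byte flag are placeholders for illustration, not values from any
real cygwin1.dll:

```python
# Hypothetical table: build id -> byte offset of the flag in that build.
# Ids and offsets are illustrative placeholders only.
KNOWN_OFFSETS = {"build-A": 8, "build-B": 12}


def set_flag(image, build_id, enabled):
    """Return a patched copy of the DLL image, or refuse an unknown build."""
    if build_id not in KNOWN_OFFSETS:
        raise ValueError("unrecognized build %r; refusing to modify" % build_id)
    offset = KNOWN_OFFSETS[build_id]
    patched = bytearray(image)
    patched[offset] = 1 if enabled else 0
    return bytes(patched)
```

How the build id is read from the DLL is left open here; the point is
only that an unrecognized build is rejected rather than patched at a
guessed offset.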

>  In theory, if we use a specific
> layout of the datastructure using unambiguous magic entry strings, there
> wouldn't be a reason to use the .rsrc section at all.

...unless the magic ID strings themselves were also each part of the
struct.  In THAT case, I suppose the offsets between beginning of the
string and the data would depend only on the length of the string, the
packing options (-ms-struct?, pragma pack?) used during the compile, and
the specific alignment requirements of the data -- none of which would

> The DLL could
> simply read the values from a global datastructure in the .data section.
> That would speed up the code in init_installation_root() a lot since it
> won't need this FindResource/LoadResource/LockResource stuff.

Hmm.  Well, that IS a benefit -- can we quantify how much of one?

Using a global data struct for configuration settings that are fixed at
compile time, but theoretically mutable by an outside manipulation tool,
just seems...wrong, to me somehow.  It seems that a resource section is
exactly the Right Place for this -- even if we put a big blinking
warning comment in the source code about "Don't Even Think About Using
BeginUpdateResource/UpdateResource/EndUpdateResource with this data".

**IF** the speed benefit of global data struct vs. resource section is
not that large, THEN perhaps the best choice is a combined approach:
  1) the data is persistently stored in the .rsrc section
  2) cygwin, on initial load in a particular process tree, retrieves the
data and populates a global [*] struct. THAT struct is what is used by
all the rest of the cygwin functions -- so there's only one
FindResource/LoadResource/LockResource episode per process tree.
  3) An external tool can be used to manipulate the .rsrc section in
exactly the way you describe -- by brute force, as it were -- IF we take
care of a few items (below).

[*] how do you access a "global" data struct if the names of global
items are keyed by the value of that data struct?  Well, this need not
be a TRULY global data object.  Instead, it's just in the process's
memory space, and gets "copied" down to direct children in the same way
other data items are.  Each independently started cygwin process --
those launched by windows or double-clicking or whatever -- would have
to go thru the FindResource/LoadResource/LockResource  dance to populate
the initial version of that struct. So
  a) how often does that happen, compared to *cygwin* processes
launching other cygwin processes?
  b) how MUCH slower is it? Obviously loading initialized data from the
.rdata section in a DLL is almost instantaneous, so it will be somewhat
slower. Is this quantified, in some ordinary use case?
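
The combined approach, with the per-process-tree copy described in the
footnote, can be sketched roughly as follows. This is a model of the
idea only: `read_resource` is a hypothetical stand-in for the
FindResource/LoadResource/LockResource sequence, and the class and
method names are invented for illustration:

```python
class InstallationSettings:
    """Settings read from the resource once, then served from memory."""

    def __init__(self, read_resource, inherited=None):
        # read_resource: hypothetical callable standing in for the
        # FindResource/LoadResource/LockResource sequence.
        self._read_resource = read_resource
        # A child started by another cygwin process receives a copy here.
        self._cache = dict(inherited) if inherited is not None else None

    def _ensure_loaded(self):
        # Only an independently started process ever reaches the resource.
        if self._cache is None:
            self._cache = dict(self._read_resource())

    def get(self, key):
        self._ensure_loaded()
        return self._cache[key]

    def for_child(self):
        """Build the copy handed down to a direct child process."""
        self._ensure_loaded()
        return InstallationSettings(self._read_resource, inherited=self._cache)
```

Counting calls to `read_resource` in a model like this is one way to
get a feel for question a): the resource is read once per tree, no
matter how many children are spawned.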

The big concern about brute force manipulation of binary (packed?) data
structs is keeping track of the offsets, and the data sizes of each
element -- whether they are in the .rsrc section or anywhere else. Using
explicit unique string keys bloats the structure unnecessarily if the
number or kind of data stored grows. This isn't a concern for the
Windows on-disk registry which uses that scheme, but ours requires
swapped-in RAM storage.
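
For illustration, the string-key scheme might look like the sketch
below. The marker string and the layout (a one-byte flag packed
directly after the marker) are assumptions made up for this example:

```python
# Hypothetical marker; the real key string, if any, is not specified here.
MAGIC = b"CYGWIN_UNIQUE_OBJECT_NAMES:"


def find_flag_offset(image):
    """Locate the flag byte, assumed to sit directly after the marker."""
    start = image.find(MAGIC)
    if start < 0:
        raise ValueError("marker not found")
    if image.find(MAGIC, start + 1) >= 0:
        raise ValueError("marker is ambiguous; refusing to guess")
    return start + len(MAGIC)


def read_flag(image):
    return image[find_flag_offset(image)] != 0


def write_flag(image, enabled):
    """Return a patched copy of the image with the flag byte updated."""
    offset = find_flag_offset(image)
    patched = bytearray(image)
    patched[offset] = 1 if enabled else 0
    return bytes(patched)
```

Note the overhead: each one-byte setting stored this way carries its
whole marker string along with it, which is the bloat described above.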

I hesitate to suggest Yet Another Addition before 1.7.1 -- and cgf is
gonna kill me when he gets back and reads this, whether we take action
on it or not -- but there is a solution.  Cygwin will never directly
support RPC, but there's no reason it can't implement the XDR primitives
-- especially now that SUN has changed the license from (open but
non-copyleft and arguably GPL-incompatible) to BSD.

As documented in the libtirpc rpm.spec file from Fedora 11:
* Tue May 19 2009 Tom "spot" Callaway <xxx@redhat.com> 0.1.10-7
- Replace the Sun RPC license with the BSD license, with the explicit
  permission of Sun Microsystems
Taking *SUN'S* code and modifying it -- to change the license to BSD --
requires direct permission from Sun for each and every instance (it was
not a global license grant like the old UCalBerk BSD advert clause
thing).  BUT, since Red Hat has already done that legwork for Fedora's
libtirpc, I can take THAT code from libtirpc under ITS now BSD license,
and do whatever I want with it commensurate with BSD...

Using XDR, we can explicitly define the binary format of the data in the
.rsrc section. As far as windows is concerned, the "resource" is a big
ugly binary blob of length N. But, we can parse that blob inside cygwin
using XDR primitives, and our external tool can parse/read/write it just
as easily.
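
To give a feel for what such a blob could look like, here is a small
sketch of the XDR wire rules (4-byte big-endian integers, booleans as
integers, strings as a length followed by the bytes, zero-padded to a
multiple of 4). The record layout (version, flag, root path) is a
hypothetical example, not a proposed format, and the sketch uses
Python's struct module rather than any real XDR library:

```python
import struct


def xdr_uint(value):
    return struct.pack(">I", value)          # 4 bytes, big-endian


def xdr_bool(value):
    return struct.pack(">I", 1 if value else 0)


def xdr_string(text):
    data = text.encode("ascii")
    pad = (4 - len(data) % 4) % 4            # pad to a 4-byte boundary
    return struct.pack(">I", len(data)) + data + b"\x00" * pad


def encode_settings(version, unique_names, root):
    """Hypothetical record: version, flag, installation root."""
    return xdr_uint(version) + xdr_bool(unique_names) + xdr_string(root)


def decode_settings(blob):
    version, flag, length = struct.unpack_from(">III", blob, 0)
    root = blob[12:12 + length].decode("ascii")
    return version, flag != 0, root
```

Because every field has a defined size and byte order, a native tool
and the DLL can agree on the blob without sharing compiler, packing
options, or alignment rules.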

I've taken the XDR stuff from Fedora 11's libtirpc, and created a
distribution that
  a) builds independently only the XDR bits
  b) compiles cleanly for mingw (important if used by cygcheck), msvc,
cygwin, solaris, linux32, and linux64
  c) has a large test suite that exercises every function, and
demonstrates how to use the XDR primitives to store complicated data
structures.  I believe this is unique among all XDR implementations.
Most build it as part of an RPC solution -- and then only explicitly
test RPC, figuring that if RPC works, then the underlying XDR must as
well.  The LGPL'ed re-implementation, portable XDR, had no tests at all
last time I looked.

I've had this since May, but had been holding off until after 1.7.1 to
even suggest that cygwin incorporate this new functionality (and it
needn't take my implementation; "inside" the cygwin kernel or newlib you
could easily just use the libtirpc version almost directly, but it
doesn't play nicely if a 'long' is 64bits AFAICT). It would take a
little -- not much -- work to integrate the core code of either
implementation into cygwin or newlib, but it's not impossible.  And
using a documented interchange format is PRECISELY the Right Thing To Do
when talking about the interchange and manipulation of binary data blobs
(e.g. between cygwin1.dll and a native w32 manipulation tool, such as

But having said all of that...

This is really turning into the "little old lady who swallowed the
fly".  We have a new function, that globally affects cygwin and must be
set VERY early in the process load sequence.  We want to be able to turn
it off, if necessary -- but leave it on by default.  We've considered an
external resource file, and .rsrc sections -- and at least to me, the
.rsrc section (or at least, in-the-binary) has significant advantages.

We'd LIKE to be able to "switch" its mode without recompiling, and this
last bit is the tail wagging the dog here.  Sure, a general approach to
.rsrc-style resources in the cygwin1.dll is nice -- but is it necessary?
 Sure, XDR would be great to have in the kernel -- but is it wise, this
"close" to the release, to send the XDR cat after the Mutable mouse
after the .rsrc spider after the disable-unique-object-names fly?

