This is the mail archive of the
mailing list for the Cygwin project.
RE: Regression for OCaml introduced by rebase 4.4.4
- From: David Allsopp <David dot Allsopp at cl dot cam dot ac dot uk>
- To: "cygwin at cygwin dot com" <cygwin at cygwin dot com>
- Date: Fri, 9 Feb 2018 11:29:05 +0000
- Subject: RE: Regression for OCaml introduced by rebase 4.4.4
- Authentication-results: sourceware.org; auth=none
- References: <firstname.lastname@example.org> <20180208151549.GA32555@calimero.vinschen.de>
Corinna Vinschen wrote:
> On Feb 8 11:47, David Allsopp wrote:
> > TL;DR flexlink-compiled DLLs (i.e. ocaml libraries) are broken by the
> > 0x200000000 base address requirement added in rebase 4.4.4. Possible
> > fixes for this at the bottom.
> > [...]
> > $ ocaml
> > OCaml version 4.04.2
> > # #load "unix.cma";;
> > Cannot load required shared library dllunix.
> > Reason: /usr/lib/ocaml/stublibs/dllunix.so: flexdll error: cannot
> > relocate RELOC_REL32, target is too far: 0xfffffffc013d8b5f
> > This is a known problem and fundamental limitation of flexdll (there
> > is no RELOC_REL64 in COFF).
> Apart from that, not only Cygwin DLLs but also the Windows system DLLs
> are all loaded and relocated to the area beyond 0x1:80000000, so
> relocation beyond the 32 bit address space is no generic problem in
> Windows. Why isn't that possible in FlexDLL? I don't understand this.
> To me this looks like a bug in FlexDLL, not a requirement to let certain
> DLLs slip through the cracks.
There's a fuller explanation of what flexdll does, and why, here: https://github.com/alainfrisch/flexdll/blob/master/README.md. I believe it's not unrelated to some of the black magic going on in Cygwin's autoload.cc, but (at least at the moment) without quite as much self-modifying code.
FlexDLL is "solving" the problem of allowing a dynamically loaded library to refer to symbols in the main application (or in previously dynamically loaded libraries, without loading them a second time, as I believe the Windows loader does). FlexDLL does this by deferring COFF relocations to runtime, and it achieves that by sitting in front of both the linker when the DLL is constructed and an application's main (or dllmain). For normal linking, since PE limits code size to 2GB, there is no need for a RELOC_REL64 relocation type. However, because we're actually resolving the symbols dynamically, on 64-bit the DLL may have been loaded too far from the executable (or other DLL) image it's resolving to. (For actual Windows resolution of DLL symbols, you'd be using the stub code generated either by the linker or by __declspec(dllimport), which would similarly be guaranteed to be within the range of RELOC_REL32 because the stub itself is static.)
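To make the range limit concrete, here is a minimal Python sketch of the arithmetic behind a 32-bit PC-relative relocation. It is not flexdll's actual code: the function names and the example addresses are hypothetical, and it assumes the displacement is measured from the end of the 4-byte field, as for an x86-64 rel32 operand.

```python
INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1


def rel32_displacement(site, target):
    """Distance from the end of the 4-byte relocation field to the target.

    Assumption: the field is the last part of the instruction, so the
    displacement is relative to site + 4.
    """
    return target - (site + 4)


def fits_rel32(site, target):
    """True if the displacement can be stored in a signed 32-bit field."""
    return INT32_MIN <= rel32_displacement(site, target) <= INT32_MAX


def as_signed64(value):
    """Reinterpret an unsigned 64-bit value as a signed one."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >= (1 << 63) else value


# Hypothetical addresses, chosen only for illustration.
exe_symbol = 0x100402000  # symbol inside the executable image
near_site = 0x100801000   # relocation site in a DLL loaded close by
far_site = 0x400001000    # relocation site in a DLL rebased far above

print(fits_rel32(near_site, exe_symbol))
print(fits_rel32(far_site, exe_symbol))
print(hex(as_signed64(0xfffffffc013d8b5f)))
```

Read as a signed 64-bit number, the value in the error message above is a displacement of roughly -16GB, far outside the +/-2GB that a 32-bit field can hold.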
When this was originally encountered for 64-bit MSVC (this was all added before Cygwin64 existed), the solution at the time was to keep the preferred base addresses low, but what's really required is that everything sits within a single 2GB window somewhere in the address space.
I guess one can argue over whether that's a bug or a limitation, but the problem we face is that we can engineer it so that our DLLs and executables are within a 2GB range (having looked again at this in even more detail, we could just as readily do this with addresses > 0x200000000), but we still run the risk of rebase messing up the DLLs.
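The "within a 2GB window" condition can be sketched as a simple check over image ranges. This is only an illustration in Python: the function name, the base addresses and the sizes are all hypothetical, and neither rebase nor flexdll is claimed to perform such a check.

```python
TWO_GB = 1 << 31


def within_2gb_window(images):
    """images is a list of (base, size) pairs, one per loaded image.

    If the span from the lowest base to the highest end is at most 2GB,
    every 32-bit relative displacement between the images fits.
    """
    lowest = min(base for base, _ in images)
    highest = max(base + size for base, size in images)
    return highest - lowest <= TWO_GB


# Hypothetical layouts, for illustration only.
all_low = [(0x100400000, 0x100000), (0x100800000, 0x200000)]
all_high = [(0x200400000, 0x100000), (0x200800000, 0x200000)]
mixed = [(0x100400000, 0x100000), (0x400000000, 0x200000)]

print(within_2gb_window(all_low))
print(within_2gb_window(all_high))
print(within_2gb_window(mixed))
```

The first two layouts pass regardless of where the window sits; only the mixed one, with the executable left low and a DLL moved high, fails.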
However, we'll scratch our heads some more on possible alternative solutions, since having a flag for DLLs which says "keep us within a 2GB range somewhere" sounds even less likely to get merged than my previous suggestion.