This is the mail archive of the binutils@sources.redhat.com mailing list for the binutils project.



Loadable modules for XEmacs on Windows


I am working on loadable module support for XEmacs, and am having a
problem with Windows support that has led me to plead for help from the
experts.  Pardon me for the length of this message, but I think I need
to tell you exactly what I have done and why for it all to make sense.
It might also help to know that I'm a Unix programmer making his first
foray into the world of Windows.  That probably explains a lot.

Loadable modules are pieces of C code that extend XEmacs in ways that
are unavailable in Lisp, often by making some existing C library
visible at the Lisp level.  In the Unix world, we use --export-dynamic
to make all symbols in the XEmacs executable visible to the modules.
We do it that way because the binary must be able to run with zero
modules; they provide extra functionality, not required functionality.
Hence, XEmacs provides the basics (e.g., the Lisp engine) used by the
modules, so modules need to resolve symbols against the executable.
The executable never resolves symbols against a module at link time;
that direction is always handled with an explicit lookup, e.g., dlsym
or GetProcAddress.

We don't really want to expose ALL of the internals of XEmacs to the
modules.  There is a specific API (under construction) to be used for
that purpose.  It would be better if the modules could only see that
specific set of symbols.  In the Unix world, we don't worry about this
because of the various executable formats.  If we only supported ELF
targets, we could use the VERSION trick described in the ld
documentation.  However, we have no strategy for dealing with non-ELF
targets.  In the Windows world, I want to try limiting us to that
specific list, if possible.  (If it turns out not to be possible, oh
well, but we really should try.)
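
For the ELF-only case mentioned above, the approach could look roughly
like the following linker version script.  This is a minimal sketch,
not something taken from the XEmacs tree: the file name xemacs.map and
the second symbol name are hypothetical placeholders for whatever the
module API ends up containing.

```
/* xemacs.map: hypothetical file name and symbol list (sketch only) */
{
  global:
    Qnil;                         /* a symbol modules are known to need */
    hypothetical_module_api_func; /* placeholder for the real API */
  local:
    *;                            /* hide everything else */
};
```

It would presumably be passed on the link line of the executable as
-Wl,--version-script=xemacs.map alongside --export-dynamic.  It does
nothing for non-ELF targets, which is exactly the limitation described
above.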

Finally, whatever approach we take should let us keep as much in common
as possible between the various Windows targets (native, Cygwin, Mingw,
etc.), and perturb the existing code as little as possible.  I decided
to try tackling Cygwin first.  I read the Cygwin DLL documentation, the
dlltool man page, and much of the ld and ldint info files.

I decided to build a small sample first, to get the hang of the
techniques involved before tackling the XEmacs giant.  The idea is to
have a small executable with a function or two to be used as a base,
and a small DLL that has a function to be looked up by the executable
and that in turn calls one or two functions in the executable.  The
structure is as follows:

test.c:
  main: opens the DLL with LoadLibrary, finds and executes a function
        named "dllfunc" in that library.  Also calls stupid_mutator,
        just to have a dependency on test2.c.
  reverse_string: a function that does what the name implies

test2.c:
  stupid_function: a function that returns the value of a global
                   variable 
  stupid_mutator: a function that changes the same global variable

my.c:
  Contains a function dllfunc that does little but call reverse_string
  and stupid_function.
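
To make the structure concrete, here is a minimal sketch of the
functions involved, collapsed into a single listing.  The signatures
are assumptions made for illustration (the tarball linked below has
the real code), and main, with its LoadLibrary call, is left out so
that the listing stays portable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* --- test.c (executable side); signature is an assumption --- */

/* Reverse a NUL-terminated string in place and return it. */
char *reverse_string(char *s)
{
    size_t len, i;

    assert(s != NULL);
    len = strlen(s);
    for (i = 0; i < len / 2; i++) {
        char tmp = s[i];
        s[i] = s[len - 1 - i];
        s[len - 1 - i] = tmp;
    }
    return s;
}

/* --- test2.c (executable side) --- */

static int stupid_value = 0;   /* the global variable */

/* Return the current value of the global. */
int stupid_function(void)
{
    return stupid_value;
}

/* Change the global. */
void stupid_mutator(int v)
{
    stupid_value = v;
}

/* --- my.c (module side) --- */

/* Does little but call back into the executable's functions. */
int dllfunc(char *buf)
{
    reverse_string(buf);
    return stupid_function();
}
```

In the real layout, dllfunc lives in my.dll and its two callees live in
test.exe, so the two calls inside dllfunc are the references that the
import library has to resolve.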

Given the desiderata above, it made sense to me to avoid explicitly
marking things with __declspec(dllexport) and use a DEF file instead.
That way, I don't have to make extensive changes to the XEmacs code.
The sample code is available at
<URL:http://www.ittc.ku.edu/~james/xemacs/dlltest.tar> if anybody wants
to see it.
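
For readers who do not want to fetch the tarball, a DEF file that
exports the two functions used by the module would look roughly like
this.  It is a sketch of the general form only; the actual test.def may
contain additional statements.

```
; test.def (sketch): symbols the executable offers to modules
EXPORTS
    reverse_string
    stupid_function
```

Only the names listed under EXPORTS become visible to the module, which
is the kind of restricted symbol list wanted for XEmacs.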

Attempt #1: The test code that didn't quite work

The test.def says that it exports reverse_string and stupid_function.
Here are the relevant rules from the Makefile:

libtest-import.a libtest-export.a: test.def
	dlltool --input-def $< --output-lib libtest-import.a --output-exp libtest-export.a

test.exe: test.o test2.o libtest-export.a
	gcc -o $@ test.o test2.o libtest-export.a

my.dll: my.o libtest-import.a
	gcc -o $@ -shared my.o libtest-import.a

This attempt compiles and links cleanly, even with all warnings turned on.
However, when it runs, LoadLibrary fails with error code 126.

Attempt #2: The test code that worked

After scratching my head for a while, I used the tried and true approach
of making a random change to the Makefile, namely this:

libtest-import.a: test.def
	dlltool --input-def $< --output-lib $@

libtest-export.a: test.def
	dlltool --input-def $< --output-exp $@

And that works!  Now the executable starts up properly, calls the DLL
function, which calls the executable functions again.  We're in
business!  But why on earth does this work when #1 does not?

Attempt #3: The XEmacs code that failed to link

I then tried to implement approach #2 for XEmacs.  The actual patch
(against current CVS, but it will probably apply to 21.5.14) is here:
<URL:http://www.ittc.ku.edu/~james/xemacs/xemacs1.patch>.  The XEmacs
executable builds and seems to work fine.  However, we've got a little
problem with linking the PostgreSQL module.  We get a bunch of messages
that look like this:

Info: resolving _Qnil by linking to __imp__Qnil (auto-import)

What's this about auto-import?  I didn't ask for that, did I?  Oh, okay,
I see in the middle of the "`ld' and WIN32 (cygwin/mingw)" page that
auto-import is the default for Cygwin/Mingw platforms.  I missed that
the first time around somehow.  But then we see lines that look like
this:

fu000002.o(.idata$3+0xc): undefined reference to `_xemacs_import_a_iname'

and then a bunch of lines that look like this:

nmth000000.o(.idata$4+0x0): undefined reference to `__nm__Qnil'

None of these .o files are part of XEmacs, so it looks like dlltool is
generating an import library containing references to symbols that the
library itself does not define.

Attempt #4: The XEmacs code with --whole-archive

Or maybe the import library does define those symbols but, being an
archive, we didn't pluck out all the pieces we really need.  Let's try
wrapping --whole-archive ... --no-whole-archive around the import
library, as suggested by <URL:http://cygwin.com/cygwin-ug-net/dll.html>.
Nope, same result.

Attempt #5: Test code, but where is the implib?

The ld docs say that the *_iname symbols are internal to the structure
of the DLL.  I don't see the __nm__* symbols mentioned, but a search
finds <URL:http://sources.redhat.com/ml/binutils/2001-08/msg00081.html>
(see part 2.3).  This says that my problem is that I should not be using
dlltool to create the import library with auto-import turned on.  So why
did attempt #2 work?  And why didn't the docs warn me that I shouldn't
be using dlltool?  Even the Cygwin docs mention the use of dlltool.  The
ld man page itself mentions dlltool in several places.  But, okay, let's
see if we can get rid of dlltool.

I see that ld has an --out-implib option to output an import library.
But I don't see an option to have it create an export library on the fly
for the object it is creating.  However, the docs do say that you can
include a .DEF file on the link line, so I guess that must be what I'm
looking for.  The docs are kind of unclear about that, though.  Let's
give this a shot with our test code first.  We'll change the Makefile
like so:

test.exe libtest-import.a: test.def test.o test2.o
	gcc -o $@ $^ -Wl,--out-implib,libtest-import.a

my.dll: my.o test.exe
	gcc -o $@ -shared my.o libtest-import.a

Linking test.exe works fine, but no libtest-import.a is emitted.  A
little experimentation shows that ld creates the import library when it
is asked to create a DLL, but silently fails to create one when it is
asked to create an executable.  I found this behavior kind of puzzling,
so I downloaded the binutils sources and looked through them to try to
find out what is going on.  Cygwin is currently on binutils-2.13, but I
downloaded the latest sources anyway, just in case this has been
addressed already.  In binutils-2.14/ld/emultempl/pe.em, the function
gld_${EMULATION_NAME}_finish() contains this code:

#ifdef DLL_SUPPORT
  if (link_info.shared
#if !defined(TARGET_IS_shpe) && !defined(TARGET_IS_mipspe)
    || (!link_info.relocateable && pe_def_file->num_exports != 0)
#endif
    )
    {
      pe_dll_fill_sections (output_bfd, &link_info);
      if (pe_implib_filename)
	pe_dll_generate_implib (pe_def_file, pe_implib_filename);
    }

So, sure enough, unless link_info.shared is set (you are linking a
shared object), the import library request is not even glanced at.  You
get no warning message and no import library.  Is there some technical
reason why an import library cannot be generated when linking an
executable, or is this just due to somebody's lack of imagination?  Who
knows, but I need a solution that works on today's Cygwin, not
tomorrow's where this has been addressed.

Attempt #6: The test code with a useless DLL

So let's create both the actual executable, and a fake DLL that is only
used to trick ld into creating an import library.  Change the Makefile
like this:

test.exe: test.def test.o test2.o
	gcc -o $@ $^

libtest-import.a: test.def test.o test2.o
	gcc -shared -o useless.dll $^ -Wl,--out-implib,libtest-import.a
	rm -f useless.dll

my.dll: my.o libtest-import.a
	gcc -o $@ -shared my.o libtest-import.a

Running the resulting application causes Windows to pop up a box that
says:

-------------------------------------------------------------------------------
| test.exe - Application Error                                              X |
|                                                                             |
| X  The application failed to initialize properly (0xc0000018).  Click on    |
|    OK to terminate the application.                                         |
|                                                                             |
|                               OK                                            |
-------------------------------------------------------------------------------

Running it under GDB says:

warning: LDR: LdrRelocateImageWithBias() failed 0xc0000018
LDR: OldBase     : 00000000
LDR: NewBase     : 00010000
LDR: Diff        : 0x77f5bcd40028fadc
LDR: NextOffset  : 00000000
LDR: *NextOffset : 0x0
LDR: SizeOfBlock : 0x10000


Program received signal SIGSEGV, Segmentation fault.

Program received signal SIGSEGV, Segmentation fault.

Program exited with code 0200.
You can't do that without a process to debug.


Well, that obviously didn't work!  So what have I learned so far?

1) If you use dlltool to generate both an export library and an import
   library, don't generate them both at the same time.  Use different
   steps.  I don't know why.  The dlltool code itself contains an
   example at the top showing them both generated together.  It doesn't
   work for my example.

2) Auto-import and dlltool don't mix, even though the docs don't say
   so, which is a problem since auto-import is the default.

3) ld refuses to generate an import library when producing an
   executable.  You could use dlltool to generate the import library,
   but see #2.

4) You can't trick ld into generating an import library for you by
   asking it to create an unneeded DLL, because the resulting executable
   contains relocations that cannot be fixed up at runtime.

Which leaves me with no workable solution, apparently.  Now what?
Should I try turning off auto-import?
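
For concreteness, turning off auto-import would presumably mean passing
ld's --disable-auto-import option when linking the module, roughly as
in the rule below.  This is an untested sketch based on the test
Makefile above.  Without auto-import, data symbols such as Qnil would
then need an explicit __declspec(dllimport) declaration on the module
side, which is the kind of source change I was hoping to avoid.

```
# Untested sketch: link the module with auto-import disabled
my.dll: my.o libtest-import.a
	gcc -o $@ -shared my.o libtest-import.a -Wl,--disable-auto-import
```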

Regards,
-- 
Jerry James
http://www.ittc.ku.edu/~james/

