This is the mail archive of the
cygwin-apps
mailing list for the Cygwin project.
Re: [patch/rebase] Add a rebase database to keep track of DLL addresses
On 7/23/2011 6:11 AM, Corinna Vinschen wrote:
> There's definitely a bug in the Mingw code, though. I reviewed the
> patch, comments inline.
:-(
>> I don't know why, but when I added O_BINARY everything was copacetic.
>
> LF->CRLF conversion?
Well, I guess -- but there aren't (or shouldn't be) any LFs *or* CRs in
the filenames of DLLs. That's what has me stumped.
read(fd, array[i].name, array[i].name_size)
returned success (or, at least, a non-negative value), but nothing was
stored in array[i].name. That's one reason I switched to calloc for the
name strings; otherwise I was getting random junk (and possible segvs)
when printing the names.
>> When I say quick-n-dirty, I mean: lots of duplicated and only slightly
>> modified code from rebase.c. There's room for code consolidation, so
>> this bit could be put off until later.
>
> Right. Especially all the db file information should go into a new
> header file.
Ack -- but that's for another patch. the Q is, add quick-n-dirty now,
then consolidate and reorganize later -- or reorganize first, and then
(or simultaneous with) adding the new developer's utility. Either way,
doesn't matter to me.
>> +override CXX_LDFLAGS+=@EXTRA_CXX_LDFLAG_OVERRIDES@
>
> Why is CXX_LDFLAGS necessary? I see what you do but I can't imagine the
> msys compiler doesn't know -static-libstdc++.
That's precisely why. msys compiler is 3.4.x which supports only
-static-libgcc.
> I'll ignore the rebase-dump stuff for now.
Ack.
>> + mingw|msys )
>> + # Also exclude the commands we're using below, which is not ideal
>> + /bin/ps -s | /bin/sed -e '1d' | /bin/awk '{print $NF}' |\
>> + /bin/sed -e '/\/bin\/ps$/d' -e '/\/bin\/ps\.exe$/d' \
>> + -e '/\/bin\/sed$/d' -e '/\/bin\/sed\.exe$/d' \
>> + -e '/\/bin\/awk$/d' -e '/\/bin\/awk\.exe$/d' \
>> + -e '/\/bin\/sort$/d' -e '/\/bin\/sort\.exe$/d' \
>> + -e '/\/bin\/uniq$/d' -e '/\/bin\/uniq\.exe$/d' \
>> + -e '/\/bin\/dash$/d' -e '/\/bin\/dash\.exe$/d' |\
>
> What's the reason to do that? I don't see how that should be necessary.
> All the tools are used before peflags is called, so there's no problem
> to change them as well, just as on Cygwin.
At one point, rebaseall had similar code, on cygwin, but then was later
changed to
grep -E -q -i -v '/d?ash(.exe)?$' /proc/[0-9]*/exename
This "new" procedure omitted all the awk/sort/uniq stuff was because it
wasn't USING those tools anymore. Then, the "new" code was copied into
peflagsall.
Since, on mingw|msys, I needed the "old" code in rebaseall (msys doesn't
have proc/*/exename), I simply duplicated the "old" code in peflagsall, too.
However, I think this big sed expression *IS* needed, in both cases.
Here's why:
The point here is to cause the script (rebaseall OR peflagsall) to give
up if ANY cygwin/msys process is active other than [d]ash. The reason
is, in each case, we're going to try to write to every DLL (or every
EXE) in the installation tree (except [d]ash, and
cygwin-1.dll/msys-1.0.dll, and a few others). Since we can't write to
in-use objects, we have to make sure that there AREN'T any in-use
objects -- e.g. no other cygwin/msys processes *except* those that we
"know" about.
Since the procedure, on msys, to detect processes requires at minimum
ps.exe, awk, and dash, we have to 'sed-out' those three applications --
and sed.exe itself -- to avoid 'false positives'. (uniq and sort are
cosmetic, but if we use 'em, then we need to sed them out too).
>> Index: rebase.c
>> ===================================================================
>> RCS file: /cvs/cygwin-apps/rebase/rebase.c,v
>> retrieving revision 1.6
>> diff -u -p -r1.6 rebase.c
>> --- rebase.c 21 Jul 2011 19:10:04 -0000 1.6
>> +++ rebase.c 22 Jul 2011 23:59:09 -0000
>> @@ -50,6 +56,10 @@ FILE *file_list_fopen (const char *file_
>> char *file_list_fgets (char *buf, int size, FILE *file);
>> int file_list_fclose (FILE *file);
>> void version ();
>> +#if defined(__MSYS__)
>> +/* MSYS has no strtoull */
>> +unsigned long long strtoull(const char *, char **, int);
>> +#endif
>
> Does it have strtoll? It should have since the function is available
> in Cygwin since October 2001, which means it was available in Cygwin
> 1.3.4 already. Msys has been forked after that, afaics.
>
> So, if we have strtoll, you could simply use that and cast the result to
> uint64_t, rather than to paste some external strtoull implementation.
No, it doesn't have strtoll:
objdump -p /c/MinGW/msys/1.0/bin/msys-1.0.dll | grep strtoll
Anyway, here's the fork date for msys:
Wed Sep 12 01:03:36 2001 Christopher Faylor <...>
The strto[u]ll stuff was exported from cygwin two weeks later:
Mon Oct 1 14:25:00 2001 Charles Wilson <...>
having been added to newlib the same day:
2001-10-01 Charles Wilson <...>
I'll look into adding that patch to msys-$next, but given how long it
takes for new msys to propagate, we'll still need a workaround for some
time.
>> -#ifdef __CYGWIN__
>> +#if defined(__MSYS__)
>> + if (machine == IMAGE_FILE_MACHINE_I386)
>> + {
>> + GetImageInfos64 ("/bin/msys-1.0.dll", NULL,
>> + &cygwin_dll_image_base, &cygwin_dll_image_size);
>> + /* See cygwin code, below */
>> + cygwin_dll_image_base -= 3 * ALLOCATION_SLOT;
>> + cygwin_dll_image_size += 3 * ALLOCATION_SLOT + 8 * ALLOCATION_SLOT;
>
> This is not correct for msys. Msys has been forked off from Cygwin 1.3.
> Back in 1.3 days, the shared memory areas were not allocated in front of
> the DLL. Rather, the default address for the shared memory areas was
> 0xa000000, bottom up. I also doubt that it's really necessary to add the
> slop factor to the end.
OK, that's easy enough to correct.
>> @@ -308,7 +331,8 @@ save_image_info ()
>> hdr.offset = offset;
>> hdr.down_flag = down_flag;
>> hdr.count = img_info_size;
>> - if (write (fd, &hdr, sizeof hdr) < 0)
>> + errno = 0;
>> + if (write (fd, &hdr, sizeof (hdr)) < 0)
>
> Why are you setting errno to 0, if it's only printed if write fails,
> which sets errno anyway?
Leftover from debugging. I was getting a failure and was trying to
figure out why (there was other debugging code mixed in there, which got
deleted). I'll take it out.
>> @@ -480,7 +505,7 @@ load_image_info ()
>> for (i = 0; i < img_info_size; ++i)
>> {
>> img_info_list[i].name = (char *)
>> - malloc (img_info_list[i].name_size);
>> + calloc (img_info_list[i].name_size, sizeof(char));
>
> There's no reason to use calloc, it's overwritten by the subsequent
> read() anyway. This is differnt from the calloc for img_info_list,
> which will only get partially filled by the read call.
No, that's not what happened on mingw.
I was getting a "successful" return (e.g. non-negative return value)
from read(), but *nothing* was written into the .name buffer. So, later
on, when I tried to print the value, I just got random garbage until a
'\0' happened to be found. Sometimes this was just a few chars, but
other times it was beyond the .name_size. If this happened at the last
entry in the array, then it was possible to get a segv.
> What about defining the name of the DLL like this:
>
> #if defined (__MSYS__)
> #define CYGWIN_DLL "/usr/bin/msys-1.0.dll"
> #elif defined (__CYGWIN__)
> #define CYGWIN_DLL "/usr/bin/cygwin1.dll"
> #endif
>
> and just use it subsequently, rather than adding more #if's?
OK.
>> +#else
>> + {
>> + /* Borrow cygwin code for extracting module path, but use ANSI */
>> + char exepath[PATH_MAX];
>
> PATH_MAX? How big is that in native Win32? If it's equivalent to
> MAX_PATH, you don't have to worry about long path prefixes.
limits.h:
#define PATH_MAX 259
stdlib.h:
#define MAX_PATH (260)
>> + char* p = NULL;
>> + char* p2 = NULL;
>> + size_t sz = 0;
>> +
>> + if (!GetModuleFileNameA (NULL, exepath, PATH_MAX))
>> + fprintf (stderr, "%s: can't determine rebase installation path\n",
>> + progname);
>> + p = exepath;
>> + if (strncmp (p, "\\\\?\\", 4)) /* No long path prefix. */
>> + {
>> + if (!strncasecmp (p, "\\\\", 2)) /* UNC */
>> + {
>> + p = strcpy (p, "\\??\\UN");
>
> Hang on. You're adding the native NT path prefix for DOS devices?
> What's that supposed to accomplish, given that the subsequent code
> uses msvcrt functions, which work with Win32 paths?
>
>> + GetModuleFileNameA (NULL, p, PATH_MAX - 6);
>> + *p = 'C';
>
> That...
>
>
>> + }
>> + else
>> + {
>> + p = strcpy (p, "\\??\\");
>
> Same NT path prefix weirdness.
>
>> + GetModuleFileNameA (NULL, p, PATH_MAX - 4);
>
> ...and that won't work. You replaced calls to wcpcpy with calls to
> strcpy. Either use stpcpy, or add strlen(p) to p.
>
> Fortunately the strcpy's are wrong, so the path is just what
> GetModuleFileNameA returned, a plain Win32 path, which is what you need.
> So, in fact you should just use the path returned by GetModuleFileNameA
> and...
>
>> + /* strip off exename and trailing slash */
>
> ..yes, exactly.
OK. All that stuff with NT native paths is so much black magic to me,
which is why I copied it with minimal changes (only what I *thought*
were necessary to switch from wchar to char). Shoulda looked up the
defn for wcpcpy...
>> +#if defined(__MSYS__)
>> +/* implementation adapted from google andriod bionic libc's
>
> s/andriod/android
>
> However, as mentioned above, I'd remove this code and just create a
> small wrapper around the strtoll function.
....except we don't yet have that on msys, either.
>> +ProcessResult=0
>> +case $Platform in
>> + mingw|msys )
>> + # Also exclude the commands we're using below, which is not ideal
>> + /bin/ps -s | /bin/sed -e '1d' | /bin/awk '{print $NF}' |\
>> + /bin/sed -e '/\/bin\/ps$/d' -e '/\/bin\/ps\.exe$/d' \
>> + -e '/\/bin\/sed$/d' -e '/\/bin\/sed\.exe$/d' \
>> + -e '/\/bin\/awk$/d' -e '/\/bin\/awk\.exe$/d' \
>> + -e '/\/bin\/sort$/d' -e '/\/bin\/sort\.exe$/d' \
>> + -e '/\/bin\/uniq$/d' -e '/\/bin\/uniq\.exe$/d' \
>> + -e '/\/bin\/dash$/d' -e '/\/bin\/dash\.exe$/d' |\
>> + sort | uniq | grep -E '.'
>
> Same as in peflagsall.in, I don't see why this should be necessary.
As explained above, it /is/ necessary -- we want to bail if we detect
ANY active cygwin/msys processes, but avoid false positives due to the
commands used to DETECT those processes.
the 'grep /proc/[0-9]*/exename' approach avoids these complications
because it only uses ONE command (grep), and "grep" doesn't yet show up
in the /proc/ area when it is invoked. (Is this a race condition we are
exploiting?)
$ grep -E -i -v '/d?ash(.exe)?$' /proc/[0-9]*/exename
/proc/1044/exename:/usr/bin/cygrunsrv
/proc/224/exename:/usr/bin/ssh-agent
/proc/2804/exename:/usr/bin/cygrunsrv
/proc/2840/exename:/usr/sbin/sshd
/proc/2848/exename:/usr/bin/cygrunsrv
/proc/2876/exename:/usr/sbin/syslogd
/proc/484/exename:/usr/sbin/cygserver
/proc/5212/exename:/usr/bin/mintty
/proc/5452/exename:/usr/bin/bash
See? no 'grep'
Therefore, we don't need to 'sed' out grep itself. So the (new-ish)
cygwin method is much simpler than the (old) cygwin/current msys method.
But the ugliness /is/ necessary.
Thanks for the review.
--
Chuck