This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: Hanging at GetModuleFileName in inside_kernel function



On Feb 21, 2006, at 11:31 AM, Dave Korn wrote:


On 21 February 2006 19:06, Peter Rehley wrote:

Hi,

Well, for my particular hang issue cygwin is hanging inside the
inside_kernel function on the GetModuleFileName call.  I tracked this
down by adding debug statements (strace.prntf) until I got to the
point where the debug print before GetModuleFileName would appear and
the ones after it didn't. This is consistent.  Each hang is happening
at this spot.

However, this only shows where the hang happens, not why.

I also observed that the times it hung were the only times
inside_kernel was actually called.

I'm still trying to get more information.
Peter

p.s. using cygwin snapshot 1.5.19-20060205.


http://cygwin.com/acronyms#PPAST.

Seriously. Nobody can debug your code by ESP or remote control. We can't
even be sure that what you report is correct if we can't reproduce it.

Dang it. I forgot to include the reference. This is in reference to the hanging issue I mentioned earlier. http://cygwin.com/ml/cygwin/2006-01/msg00549.html

Basically, when I use a configure script in a loop, at some point one of the subshells launched will hang and never return. Usually it takes several hours for the script to hang, but when I run another configure script in a different bash window I can get the first script to hang within a few minutes.

When the script hangs it can't be stopped by using ctrl-c, and can't be killed using the cygwin kill. It can be killed using task manager, and it can be resumed using the process program from http://www.beyondlogic.org/solutions/processutil/processutil.htm.

And this only happens on our dual-Pentium Windows 2000 SP4 machines. The other Windows machines we use never hang; those run Windows XP Pro SP1, Windows XP Pro SP2, and Windows 2000 SP4.

If you
don't show us your code, we don't even know if you've literally bracketed the
GetModuleFileName call with debug prints or if you've just placed one before
and one after the if...else if .. ladder, in which case maybe it's
strncasematch going wrong.

Here is my modified inside_kernel function. I did have to rearrange the conditional so I could add additional debugging information.

static bool
inside_kernel (CONTEXT *cx)
{
  int res;
  MEMORY_BASIC_INFORMATION m;

  strace.prntf (_STRACE_SYSTEM, NULL, "\tChecking virtual");
  memset (&m, 0, sizeof m);
  if (!VirtualQuery ((LPCVOID) cx->Eip, &m, sizeof m))
    sigproc_printf ("couldn't get memory info, pc %p, %E", cx->Eip);
  strace.prntf (_STRACE_SYSTEM, NULL, "\tDone virtual check");

  char *checkdir = (char *) alloca (windows_system_directory_length + 4);
  /* Zero the whole buffer; sizeof (checkdir) would only be the pointer size. */
  memset (checkdir, 0, windows_system_directory_length + 4);
  strace.prntf (_STRACE_SYSTEM, NULL, "\tDone alloca");

# define h ((HMODULE) m.AllocationBase)
  /* Apparently Windows 95 can sometimes return bogus addresses from
     GetThreadContext.  These resolve to a strange allocation base.
     These should *never* be treated as interruptible. */
  if (!h || m.State != MEM_COMMIT)
    {
      strace.prntf (_STRACE_SYSTEM, NULL, "\tno h or not MEM_COMMIT");
      res = false;
    }
  else
    {
      strace.prntf (_STRACE_SYSTEM, NULL, "\tchecking module");
      if (h == user_data->hmodule)
        {
          strace.prntf (_STRACE_SYSTEM, NULL, "\th == user_data->hmodule");
          res = true;
        }
      else
        {
          strace.prntf (_STRACE_SYSTEM, NULL, "\tchecking getmodulename");
          if (!GetModuleFileName (h, checkdir, windows_system_directory_length + 2))
            {
              /* res is not assigned yet here, so print the Windows error
                 with %E, as the VirtualQuery message above does. */
              strace.prntf (_STRACE_SYSTEM, NULL, "\tGetModuleFileName failed, %E");
              res = true;
            }
          else
            {
              strace.prntf (_STRACE_SYSTEM, NULL, "\tnone of the above");
              res = !strncasematch (windows_system_directory, checkdir,
                                    windows_system_directory_length);
            }
        }
    }


  sigproc_printf ("pc %p, h %p, interruptible %d", cx->Eip, h, res);
  strace.prntf (_STRACE_SYSTEM, NULL, "\tDone inside_kernel");
# undef h
  return res;
}
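
One way to rule the if...else ladder itself in or out is to restate it without any Win32 calls and exercise it in isolation. The following is only a sketch that mirrors the values assigned to res in the function as posted; ladder_result and prefix_casematch are hypothetical names, a NULL module_path stands in for a failed GetModuleFileName, and prefix_casematch is a stand-in written for this example, not Cygwin's strncasematch.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a case-insensitive prefix comparison:
   true if the first n characters of path match dir, ignoring case. */
static bool
prefix_casematch (const char *dir, const char *path, size_t n)
{
  assert (dir != NULL && path != NULL && n <= strlen (dir));
  for (size_t i = 0; i < n; i++)
    {
      /* Stop before reading past the end of a short path. */
      if (path[i] == '\0')
        return false;
      if (tolower ((unsigned char) dir[i]) != tolower ((unsigned char) path[i]))
        return false;
    }
  return true;
}

/* Mirrors the posted ladder: returns the value that res would receive.
   module_path == NULL models a failed module name lookup. */
static bool
ladder_result (const void *alloc_base, bool committed,
               const void *user_module, const char *module_path,
               const char *system_dir)
{
  if (alloc_base == NULL || !committed)
    return false;
  if (alloc_base == user_module)
    return true;
  if (module_path == NULL)
    return true;
  return !prefix_casematch (system_dir, module_path, strlen (system_dir));
}
```

If every branch behaves as expected with hand-picked inputs, the comparison logic is unlikely to be where things go wrong, which leaves the Win32 call and its arguments.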


How do you know it hung in the function, rather
than returning from the function and then going wrong, just as a for-instance?
How do you know it's really hung, rather than taking a long time to time-out
querying a no-longer-present network drive or something like that?

It's sort of hung. It won't return even after a few days, but using the process program to resume it lets the hung program continue. However, when it resumes it doesn't print the next debug line. I don't know what happens after that point, but the script continues with no errors.

How do we
know whether something earlier in your code hasn't trashed the contents of
memory so that GetModuleFileName goes off into lala-land?
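
One narrow way to look for this kind of damage is to put guard bytes on both sides of the buffer handed to the call and check them afterwards. This is a minimal sketch under stated assumptions: GUARD_LEN, GUARD_BYTE, guarded_init and guards_intact are hypothetical names invented here, and the check only detects writes that run off either end of this one buffer, not memory corruption in general.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define GUARD_LEN 8      /* hypothetical guard size, in bytes */
#define GUARD_BYTE 0xA5  /* hypothetical fill pattern */

/* Block layout: [GUARD_LEN guard][payload_len payload][GUARD_LEN guard].
   The caller passes block + GUARD_LEN as the buffer under test. */
static void
guarded_init (unsigned char *block, size_t payload_len)
{
  assert (block != NULL);
  memset (block, GUARD_BYTE, GUARD_LEN);
  memset (block + GUARD_LEN, 0, payload_len);
  memset (block + GUARD_LEN + payload_len, GUARD_BYTE, GUARD_LEN);
}

/* True if neither guard region has been overwritten. */
static bool
guards_intact (const unsigned char *block, size_t payload_len)
{
  assert (block != NULL);
  for (size_t i = 0; i < GUARD_LEN; i++)
    {
      if (block[i] != GUARD_BYTE)
        return false;
      if (block[GUARD_LEN + payload_len + i] != GUARD_BYTE)
        return false;
    }
  return true;
}
```

Calling guards_intact right after the suspect call returns, and printing the result, shows whether the call wrote outside the length it was given.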


This is why posting a testcase is worthwhile, and a report that says "Umm it
don't work" is no use at all. What, were you really expecting someone to pipe
up with "Oh, GetModuleFileName just doesn't work, that's well known"?


I mean, ultimately, either Cygwin is calling the function correctly with
valid parameters, in which case it's a bug in windows, or it isn't, in which
case the bug is in cygwin. You should have used some %-specifiers in those
printfs to dump the values of some of the variables, then you might have some
information to go on. Or run the whole thing under a debugger and /see/ where
it actually goes.

I'll check the parameters with the %-specifiers. I've tried gdb already and didn't make any progress with it. When I attach gdb to the hung program, gdb hangs too, at least until I resume the attached program. I've tried setting breakpoints around the areas I think are hang locations, but either the program exits without hitting the breakpoint or I end up in code that I can't step through. In the latest area gdb exits without hitting the breakpoints, even though I set them on the hang line and on the debug statements after it.
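
For reference, a minimal sketch of what bracketing a call with value-dumping prints could look like in plain C. Everything here is an assumption made for illustration: lookup_name is a hypothetical stand-in with a shape similar to GetModuleFileName (it is not the Win32 function), traced_lookup is a hypothetical wrapper, and stdio is used in place of strace.prntf. Each line is flushed so the last message before a hang is not lost in a buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the call under investigation: copies a fixed
   name into buf, truncating to fit, and returns the number of characters
   written (not counting the terminating NUL). */
static size_t
lookup_name (const void *handle, char *buf, size_t buflen)
{
  (void) handle;
  assert (buf != NULL && buflen > 0);
  const char *name = "C:\\WINNT\\system32\\kernel32.dll";
  size_t n = strlen (name);
  if (n >= buflen)
    n = buflen - 1;
  memcpy (buf, name, n);
  buf[n] = '\0';
  return n;
}

/* Logs the arguments immediately before the call and the results
   immediately after it, flushing each time. */
static size_t
traced_lookup (FILE *log, const void *handle, char *buf, size_t buflen)
{
  fprintf (log, "before lookup: handle=%p buf=%p buflen=%lu\n",
           (void *) handle, (void *) buf, (unsigned long) buflen);
  fflush (log);

  size_t n = lookup_name (handle, buf, buflen);

  fprintf (log, "after lookup: returned=%lu buf=\"%s\"\n",
           (unsigned long) n, buf);
  fflush (log);
  return n;
}
```

With the arguments in the log, a bad handle or an unexpected buffer length shows up in the line printed just before the hang.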


Thanks for your feedback.
Peter

