Hang with 20051018 (3rd version) snapshot while building OOo

Volker Quetschke quetschke@scytek.de
Thu Oct 20 12:44:00 GMT 2005


Volker Quetschke wrote:
> Christopher Faylor wrote:
>>On Wed, Oct 19, 2005 at 03:45:30PM -0400, Volker Quetschke wrote:
>>(snip)
>>Given the number of changes that have been made to cygwin, particularly
>>in /proc handling, it's very difficult for me to believe that you are
>>not seeing *any* differences in behavior and
Well, there are differences in the frequency of occurrence of the hangs.

>> I'm wondering if you're
>>actually seeing what you think you're seeing, i.e., I'm wondering if the
>>process is just timing out and you are attributing it coming "unstuck"
>>to the fact that you're doing "ls /proc/*/fd".  I can't see any reason
>>why inspecting /proc should cause any kind of special behavior in the
>>latest snapshots since /proc handling now occurs in its own thread.
> 
> I can completely understand your worries. My problem is that I cannot
> reproduce the problem myself and all I can do is ask the people who
> have this problem to try to get some debug information.
> 
> I just asked for a confirmation that it really is the "ls /proc/*/fd"
> that "unstucks" the process. I don't believe that "/usr/bin/tcsh -fc pwd"
> needs a long time to finish so that we're getting a coincidence there.
I got some information back:
It is done like this: the build is running/hanging in one shell (1).

When it hangs, start a new tcsh shell (2) and get the ps and cygcheck
information. Then open a new bash (3) and start "strace -p <pidhang>".
Now in (2) start
		while 1
			ls /proc/<pidhang>/fd
		end
until the strace is finished.
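
For anyone who wants to repeat this without typing the loop by hand, here
is a rough sh sketch of step (2). It is only my reading of the procedure,
not what the reporter actually ran: poll_fd and POLL_COUNT are made-up
names, the one-second pause and the optional iteration cap are my own
additions, and the ls output is thrown away instead of being printed. It
keeps listing /proc/<pidhang>/fd for as long as a second process (the
strace from shell 3, say) is still alive.

```shell
# poll_fd PIDHANG PIDWATCH [MAX]
#   PIDHANG  - PID of the hung process whose fd directory is listed
#   PIDWATCH - PID to watch; the loop ends when this process is gone
#   MAX      - optional cap on the number of iterations (0 = no cap)
# The number of completed iterations is left in POLL_COUNT.
poll_fd() {
    pidhang=$1
    pidwatch=$2
    max=${3:-0}
    POLL_COUNT=0
    # kill -0 sends no signal; it only tests whether the PID still exists.
    while kill -0 "$pidwatch" 2>/dev/null; do
        # Listing errors are ignored: the directory may be unreadable.
        ls "/proc/$pidhang/fd" >/dev/null 2>&1 || true
        POLL_COUNT=$((POLL_COUNT + 1))
        if [ "$max" -gt 0 ] && [ "$POLL_COUNT" -ge "$max" ]; then
            break
        fi
        sleep 1   # assumption: the original tcsh loop had no delay
    done
}
```

It would be called as "poll_fd <pidhang> <pid of strace>" in shell (2).
Note that kill -0 only sees the strace if both shells run under the same
user.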

Some details: The build is running on a local NTFS drive. It's a dedicated
machine, not much is running besides the build.

He wrote that the 20051019 snapshot also produced a hang and that I'll get
the next strace. ;)

Clueless

      Volker


> Having said that, I never realized that before, maybe the problem really
> lies in this special command. I mean due to some historic quirks every
> makefile in the OOo tree has a line that sets a macro to the current path
> using that command, but there are still lots of other commands (also executed
> in a tcsh shell) in these makefiles that I have never heard of hanging.
> (I'll also verify that what I just said is really true, it's just an idea.)
> 
> 
>>I could almost convince myself that there was a race in /proc handling
>>before but I could never convince myself that doing something like "ls /proc/*/fd"
>>would have any effect on it.  Nevertheless, I did make some changes to
>>eliminate the potential source of hangs in this code.  So, I can't
>>understand why you wouldn't see something different.
> 
> 
> I have no clue either, especially as I also cannot reproduce it and therefore
> cannot pinpoint the problem. :(
> 
> Anyway, thanks for all your efforts!
> 
>    Volker
> 


-- 
PGP/GPG key  (ID: 0x9F8A785D)  available  from  wwwkeys.de.pgp.net
key-fingerprint 550D F17E B082 A3E9 F913  9E53 3D35 C9BA 9F8A 785D