Program randomly hangs after upgrading to cygwin 1.5.7

Volker Quetschke quetschke@scytek.de
Thu Feb 5 03:47:00 GMT 2004


Hi!

>> #!/bin/sh
>> strace -o /tmp/dmake-trace-$$ --flush-period=10 dmake $*
> 
> Nice idea, I did something similar yesterday evening. I had to only keep
> the last strace output, because each file was in the order of 50MB to
> 500MB and I implemented a counter to get an idea for the number of
> started dmake processes. At the moment we are at 580, it is still
> running.

I caught one hanging process, but the strace is weird. First of all, in
this case not dmake is hanging, but a tcsh process which is started by
the dmake process. See the end of the following unsolicited strace log
where it hangs. But the really weird thing are the ~500MB zeros
in that strace log. See this
   <http://qa.openoffice.org/issues/show_bug.cgi?id=25037>
Openoffice issue for the compressed complete strace until the freeze.

Here comes the end of the strace:

<snip>
    76 4083280 [main] tcsh 1336 wait4: calling proc_subproc, pid -1, 
options 1
   144 4083424 [main] tcsh 1336 proc_subproc: args: 4, 1628375144
    77 4083501 [main] tcsh 1336 proc_subproc: wval->pid -1, wval->options 1
    72 4083573 [main] tcsh 1336 checkstate: nchildren 0, nzombies 0
    78 4083651 [main] tcsh 1336 checkstate: checking alive children
    77 4083728 [main] tcsh 1336 checkstate: returning 0
    71 4083799 [main] tcsh 1336 proc_subproc: waiting thread found no 
children
    94 4083893 [main] tcsh 1336 proc_subproc: finished processing
terminated/stopped child
    72 4083965 [main] tcsh 1336 proc_subproc: returning 1
    69 4084034 [main] tcsh 1336 wait4: intpid -1, status 0x22694C, 
w->status 0,
options 1, res -1
    82 4084116 [main] tcsh 1336 wait4: *** errno = 10
    70 4084186 [main] tcsh 1336 set_signal_mask: oldmask 0x80002, 
newmask 0x2,
mask_bits 0x80000
    75 4084261 [main] tcsh 1336 sig_send: returning 0x0 from sending 
signal -33
    96 4084357 [main] tcsh 1336 handle_sigsuspend: oldmask 0x80002, 
newmask 0x2
42874012 46958369 [win] tcsh 1336 wndproc 30 WM_TIMECHANGE 0 0
900212535 947170904 [win] tcsh 1336 wndproc 30 WM_TIMECHANGE 0 0
900286982 1847457886 [win] tcsh 1336 wndproc 30 WM_TIMECHANGE 0 0
900318304 -1547191106 [win] tcsh 1336 wndproc 30 WM_TIMECHANGE 0 0
900220341 -646970765 [win] tcsh 1336 wndproc 30 WM_TIMECHANGE 0 0
900171768 253201003 [win] tcsh 1336 wndproc 30 WM_TIMECHANGE 0 0
900133829 1153334832 [win] tcsh 1336 wndproc 30 WM_TIMECHANGE 0 0
900124349 2053459181 [win] tcsh 1336 wndproc 30 WM_TIMECHANGE 0 0
900158918 -1341349197 [win] tcsh 1336 wndproc 30 WM_TIMECHANGE 0 0
900151800 -441197397 [win] tcsh 1336 wndproc 30 WM_TIMECHANGE 0 0
358986345 -82211052 [win] tcsh 1336 wndproc 537 unknown  7 0
  7273 -82203779 [win] tcsh 1336 wndproc 537 unknown  7 0
38318101 -43885678 [win] tcsh 1336 wndproc 537 unknown  7 0
502845672 458959994 [win] tcsh 1336 wndproc 30 WM_TIMECHANGE 0 0
126132228 585092222 [win] tcsh 1336 wndproc 537 unknown  7 0
  2034 585094256 [win] tcsh 1336 wndproc 537 unknown  7 0
774006971 1359101227 [win] tcsh 1336 wndproc 30 WM_TIMECHANGE 0 0
847905325 -2087960744 [win] tcsh 1336 wndproc 537 unknown  7 0
52207351 -2035753393 [win] tcsh 1336 wndproc 30 WM_TIMECHANGE 0 0
900501277 -1135252116 [win] tcsh 1336 wndproc 30 WM_TIMECHANGE 0 0
900135003 -235117113 [win] tcsh 1336 wndproc 30 WM_TIMECHANGE 0 0
900157738 665040625 [win] tcsh 1336 wndproc 30 WM_TIMECHANGE 0 0
900145818 1565186443 [win] tcsh 1336 wndproc 30 WM_TIMECHANGE 0 0
900164881 -1829615972 [win] tcsh 1336 wndproc 30 WM_TIMECHANGE 0 0
900171339 -929444633 [win] tcsh 1336 wndproc 30 WM_TIMECHANGE 0 0
900154244 -29290389 [win] tcsh 1336 wndproc 30 WM_TIMECHANGE 0 0
900124088 870833699 [win] tcsh 1336 wndproc 30 WM_TIMECHANGE 0 0
900134268 1770967967 [win] tcsh 1336 wndproc 30 WM_TIMECHANGE 0 0

*here are ~500MB zeros, really! Something is wrong*

900213069 142414857 [win] tcsh 1584 wndproc 30 WM_TIMECHANGE 0 0
900286987 1042701844 [win] tcsh 1584 wndproc 30 WM_TIMECHANGE 0 0
900317803 1943019647 [win] tcsh 1584 wndproc 30 WM_TIMECHANGE 0 0
900220842 -1451726807 [win] tcsh 1584 wndproc 30 WM_TIMECHANGE 0 0
900171780 -551555027 [win] tcsh 1584 wndproc 30 WM_TIMECHANGE 0 0
900133317 348578290 [win] tcsh 1584 wndproc 30 WM_TIMECHANGE 0 0
900124845 1248703135 [win] tcsh 1584 wndproc 30 WM_TIMECHANGE 0 0
900158936 -2146105225 [win] tcsh 1584 wndproc 30 WM_TIMECHANGE 0 0
900151263 -1245953962 [win] tcsh 1584 wndproc 30 WM_TIMECHANGE 0 0
358987187 -886966775 [win] tcsh 1584 wndproc 537 unknown  7 0
  6934 -886959841 [win] tcsh 1584 wndproc 537 unknown  7 0
38318117 -848641724 [win] tcsh 1584 wndproc 537 unknown  7 0
502845669 -345796055 [win] tcsh 1584 wndproc 30 WM_TIMECHANGE 0 0
126132235 -219663820 [win] tcsh 1584 wndproc 537 unknown  7 0
  2016 -219661804 [win] tcsh 1584 wndproc 537 unknown  7 0
774006983 554345179 [win] tcsh 1584 wndproc 30 WM_TIMECHANGE 0 0


No clue how to read this.

>> Also if you compile dmake with debugging (gcc -g) you might then be 
>> able to
>> attach with CYGWIN=notty gdb --pid=<dmake-pid>.

I still cannot attach to any of the processes involved.

Volker

-- 
PGP/GPG key  (ID: 0x9F8A785D)  available  from  wwwkeys.de.pgp.net
key-fingerprint 550D F17E B082 A3E9 F913  9E53 3D35 C9BA 9F8A 785D
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 254 bytes
Desc: not available
URL: <http://cygwin.com/pipermail/cygwin/attachments/20040205/dde01659/attachment.sig>


More information about the Cygwin mailing list