ssh-add -l hangs under cygwin test 3.6.0-0.139.g...

Takashi Yano takashi.yano@nifty.ne.jp
Mon Jul 1 10:04:50 GMT 2024


On Mon, 1 Jul 2024 17:43:53 +0900
Takashi Yano wrote:
> On Sun, 30 Jun 2024 22:55:22 +0900
> Takashi Yano wrote:
> > On Sun, 30 Jun 2024 20:33:19 +0900
> > jojelino wrote:
> > > On 6/29/2024 2:39 PM, Brian Inglis via Cygwin wrote:
> > > > Reran cygport --debug upload and command hanging was ssh-add -l!
> > > > 
> > >    296   72109 [main] ssh-add 63275 win32env_to_cygenv: 0xA000232E0: 
> > > TERM_PROGRAM=mintty
> > >    189   72298 [main] ssh-add 63275 win32env_to_cygenv: 0xA00023300: 
> > > TERM_PROGRAM_VERSION=3.7.1
> > > 
> > > I was able to reproduce this problem by running the command below with 
> > > ffmpeg from https://www.gyan.dev/ffmpeg/builds/ ; this ffmpeg build 
> > > spams putc. Without piping its output to `tee', it would not be 
> > > possible to track down the cause in the lengthy trace output.
> > > 
> > > strace mintty -e /bin/timeout 10 sh -c './ffmpeg -h full|& tee'
> > > 
> > > In summary, when `timeout' expires, it sends SIGTERM (signal 15 in the 
> > > trace below) to its own pgrp. A member process of the pgrp may be 
> > > holding one of cygwin's internal synchronization objects at the moment 
> > > the signal from `timeout' interrupts it. 
> > > In my case it was the output_mutex of the pty.
> > > 
> > >   1565 10693297 [main] timeout 745 kill_pgrp: pid 0, signal 15
> > >   2701 15224348 [main] mintty 744 
> > > fhandler_pty_master::process_slave_output: bytes read 256
> > >   1736 9525442 [sig] sh 746 proc_subproc: args: 4, 1
> > >   3137 6700294 [main] tee 748 fhandler_pty_slave::write: pty5, 
> > > write(0x7FFFFC780, 1024)
> > >   1347 9526789 [sig] sh 746 proc_subproc: clear waiting threads
> > >   2084 15226432 [main] mintty 744 
> > > fhandler_pty_master::process_slave_output: returning 256
> > >   1110 6701404 [main] tee 748 fhandler_pty_slave::write: (1267): pty 
> > > output_mutex (0x4C0): waiti
> > > -1 ms
> > >    732 9527521 [sig] sh 746 checkstate: child_procs count 2
> > >   3648 10696945 [main] timeout 745 open_shared: name cygpid.724, shared 
> > > 0x1A0050000 (wanted 0x1A
> > > 0000), h 0x16C, m 6, created 0
> > >   1055 6702459 [main] tee 748 fhandler_pty_slave::write: (1267): pty 
> > > output_mutex: acquired
> > > 
> > >   2092 15753306 [main] mintty 744 fhandler_pty_master::close: (2095): 
> > > pty output_mutex (0x4AC):
> > > ting -1 ms
> > > 
> > > Below is the location where `tee' hung.
> > > 
> > > #3  0x00007ffd0e408fdf in fhandler_pty_slave::write (this=0x800009a10,
> > >      ptr=0x7ffffc780, len=<optimized out>)
> > >      at ../../.././winsup/cygwin/fhandler/pty.cc:1268
> > > 1268      if (!process_opost_output (get_output_handle (), ptr, towrite, 
> > > false,
> > > (gdb)  li
> > > 1263      termios_printf ("pty%d, write(%p, %lu)", get_minor (), ptr, len);
> > > 1264
> > > 1265      push_process_state process_state (PID_TTYOU);
> > > 1266
> > > 1267      acquire_output_mutex (mutex_timeout);
> > > 1268      if (!process_opost_output (get_output_handle (), ptr, towrite, 
> > > false,
> > > 1269                                 get_ttyp (), is_nonblocking ()))
> > > 
> > > 
> > > I ended up adding two CancelIo calls just above the 
> > > acquire_output_mutex call in fhandler_pty_master::close in pty.cc.
> > > 
> > >  >	  CancelIo(get_ttyp()->to_master());
> > >  >	  CancelIo(get_ttyp()->to_master_nat());
> > > 	  acquire_output_mutex (mutex_timeout);
> > > 
> > > Hope it helps
> > 
> > I cannot reproduce the issue.
> > 
> > 1) Which cygwin version do you use?
> > 2) Is this really the same problem as the one reported in the original post?
> >    (i.e. reproducible with 3.6.0-0.139 and not reproduced with 3.5.3)
> 
> I could reproduce this (the problem reported by jojelino) with:
> mintty -e timeout 1 dash -c 'yes aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa | cat'
> 
> This also occurs with cygwin 3.5.3, so it is a different problem from the one Brian reported.
> 
> This does not occur with:
> mintty -e timeout 1 dash -c 'yes aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'
> mintty -e timeout 1 tcsh -c 'yes aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa | cat'
> script /dev/null -c timeout 1 dash -c 'yes aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa | cat'
> 
> I looked into this problem and found the cause.
> 
> When the child process (timeout) is terminated, mintty seems to stop reading
> the pty master even if yes or cat is still alive. (Is this right? >> Thomas)
> (This behaviour seems to be a side effect of patch_319(?) in mintty.)
> 
> If the pipe used to transfer output from the pty slave to the pty master is full
> because the master is no longer reading, WriteFile() to the pipe blocks. WriteFile()
> cannot be canceled by a cygwin signal, so the pty slave hangs.
> 
> I'll submit and push a patch to fix this.
> 
> Thanks for finding this issue. >> jojelino

Please test cygwin-3.6.0-0.145.gc4fb5da27876

Cygwin 3.5.4 will come with this fix.
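
For anyone who wants to see the underlying condition in isolation, here is a
rough sketch. It is a POSIX-level analogue written in Python, not Cygwin code
and not the fix itself. It fills a pipe that nobody reads, which is the
situation the pty slave is in once the master stops reading. The write end is
made non-blocking purely as an assumption for the demo, so that the full pipe
is reported instead of hanging the way a blocking WriteFile() would. The
function name fill_pipe_without_reader is hypothetical.

```python
import os


def fill_pipe_without_reader(chunk_size=1024):
    """Write to a pipe that nobody reads until its buffer is full.

    Returns (bytes_accepted, would_block). Because the write end is
    non-blocking, a full pipe raises BlockingIOError; with a blocking
    write end the same call would simply never return.
    """
    read_fd, write_fd = os.pipe()
    os.set_blocking(write_fd, False)  # demo-only assumption, see above
    chunk = b"a" * chunk_size
    accepted = 0
    would_block = False
    try:
        while True:
            try:
                # os.write returns the number of bytes the pipe accepted.
                accepted += os.write(write_fd, chunk)
            except BlockingIOError:
                # The pipe buffer is full and there is no reader to drain it.
                would_block = True
                break
    finally:
        os.close(write_fd)
        os.close(read_fd)
    return accepted, would_block


if __name__ == "__main__":
    accepted, would_block = fill_pipe_without_reader()
    print(f"pipe accepted {accepted} bytes; next write would block: {would_block}")
```

The number of bytes accepted depends on the platform's pipe buffer size, so
the sketch reports it rather than assuming a value.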

-- 
Takashi Yano <takashi.yano@nifty.ne.jp>

