[RFC] Fix expect leaky pipes

Dave Korn dave.korn@artimi.com
Sun May 28 23:00:00 GMT 2006


    Hi everyone,


  I've been looking at the problem with expect leaking pipe handles and
running dog-slow as its handle table fills up (or on my old w2k box, causing
a complete kernel lock-up), and I *think* I've got it.  After adding a whole
load of termios_printfs to track everything that was going on, it seems like
there's just an ordering problem in the shutdown of the final master.  In
fhandler_pty_master::close we check whether the master is still alive, and
if so, only call the fhandler_tty_common::close routine, but if the master
is no longer alive (and I /think/ that means the same as "if we're closing
the final handle to it"), we close the from/to master pipes as well.

  Now, in expect what happens is that when the child processes have all
exited and closed their last open handles to the tty, the expect master
tries to close its fd to the pty master.  On entry the master is still
alive (signified by the respective event existing) but when we call the
common::close handler it closes the last open handle to the 'master alive'
event which then goes away.  So we've closed the last fd to it but haven't
closed the from/to master pipes, and that's how they leak.  I've fixed the
problem by re-checking after calling out to the common::close and seeing if
the master is *now* dead, and closing the master pipes at that time.

  Anyway, this isn't quite right, but I need to ask some advice: initially,
we had (modulo #if-0 and whitespace)


int
fhandler_pty_master::close ()
{
  if (get_ttyp ()->master_alive ())
    fhandler_tty_common::close ();
  else
    {
      termios_printf ("freeing tty%d (%d)", get_unit (), get_ttyp ()->ntty);
      if (get_ttyp ()->from_master)
	CloseHandle (get_ttyp ()->from_master);
      if (get_ttyp ()->to_master)
	CloseHandle (get_ttyp ()->to_master);
      fhandler_tty_common::close ();
      if (!hExeced)
	get_ttyp ()->init ();
    }
  return 0;
}

and I've changed that to

int
fhandler_pty_master::close ()
{
  if (get_ttyp ()->master_alive ())
    fhandler_tty_common::close ();
  /* That could have closed the last master, so re-check.  */
  if (!get_ttyp ()->master_alive ())
    {
      termios_printf ("freeing tty%d (%d)", get_unit (), get_ttyp ()->ntty);
      if (get_ttyp ()->from_master)
	CloseHandle (get_ttyp ()->from_master);
      if (get_ttyp ()->to_master)
	CloseHandle (get_ttyp ()->to_master);
      fhandler_tty_common::close ();
      if (!hExeced)
	get_ttyp ()->init ();
    }
  return 0;
}

which preserves the existing behaviour if !master_alive on entry, but closes
the master's pipes if it was alive but becomes dead on calling the
common::close handler.  However it then still goes and executes the second
call to common::close and generates a lot of noise in the strace logs trying
to reclose already closed handles.

  What I'm wondering is if there is any significance in closing the master
pipes before it calls common::close in the else leg of the if-else.  If it
was safe to move that call to before the two CloseHandles, then we could
just common it out and call it unconditionally at the start of the function
and only then check the master alive status.  This would be different
behaviour, but it could still be valid as far as I know.  If that's going to
be OK, I'd rather do it that way round; otherwise I'll just add a flag to
avoid calling common::close twice in one call to the function.

  I haven't been looking at this code long, so I don't know if I grokked it
right yet.  What would it mean if the master was /not/ alive at the time
we're calling close with an open fd to it?  Is that even possible?  If we
have it open wouldn't that necessarily imply we had a handle on the 'master
alive' event in any case?  What really is the meaning of alive or dead here?

  While I was at it, I restored the bit that cgf #if-0'd out a while ago; it
seems not to be causing any noticeable problems so far.  Chris, can you
enlighten me a bit about what was going wrong there?

  Soon as I've got a better idea which is the proper approach to take (use
a flag vs. unify the two common::close calls), I will submit a proper patch.
Meantime, this has been the last thing that was blocking progress on the
new gcc release; I didn't want to release it untested.  That problem looks
like it's solved now, so I'll be able to progress the release RSN!


    cheers,
      DaveK
-- 
Can't think of a witty .sigline today....
-------------- next part --------------
A non-text attachment was scrubbed...
Name: leaky-pipe-patch.diff
Type: application/octet-stream
Size: 1072 bytes
Desc: not available
URL: <http://cygwin.com/pipermail/cygwin-developers/attachments/20060528/2a4f1faf/attachment.obj>


More information about the Cygwin-developers mailing list