This is the mail archive of the cygwin-developers@cygwin.com mailing list for the Cygwin project.


Re: Spurious EWOULDBLOCKs on NT4.0

To: cygwin-developers at cygwin dot com
Subject: Re: Spurious EWOULDBLOCKs on NT4.0
From: Christopher Faylor <cgf at redhat dot com>
Date: Tue, 30 Oct 2001 13:22:29 -0500
References: <20011030003901.A8989@redhat.com> <20011030124951.A891@cygbert.vinschen.de>
Reply-To: cygwin-developers at cygwin dot com

On Tue, Oct 30, 2001 at 12:49:51PM +0100, Corinna Vinschen wrote:
>On Tue, Oct 30, 2001 at 12:39:01AM -0500, Christopher Faylor wrote:
>> I thought that I would try to duplicate the ftpd/rsync problems on
>> my laptop under NT4.0.  However, I'm now noticing that I'm getting
>> occasional EWOULDBLOCK/EAGAIN errors from ssh.  I repeatedly get
>> them when copying a 1K+ file, on close.  Not every time, though.
>> 
>> The actual winsock error is WSAEWOULDBLOCK.
>> 
>> I think that this is due to the relatively "recent" addition of the
>> setsockopt(...LINGER...) in fhandler_socket::close.  This is borne out
>> by the fact that I can't duplicate the problem in 1.3.2.
>> 
>> The code below seems to "fix" this problem but I'm not sure it is
>> correct.  According to SUSv2, close is supposed to be a quick operation
>> unless SO_LINGER is used...
>> 
>> Hmm.  In fact, it must be wrong since this could make close block
>> indefinitely.
>> 
>> Another way of doing this is to ignore WSAEWOULDBLOCK errors in
>> fhandler_socket::close.  Maybe that's more correct.
>> 
>> Anyone know what's happening here?  Corinna?
>
>I think, yes.  WSAEWOULDBLOCK is raised by closesocket if the
>socket is nonblocking and SO_LINGER is set to a non-zero value.
>
>The below code is probably more correct than the previous one.
>Our problem is the combination of closesocket/exit which, when
>too fast, results in data loss on not already finished connections
>on process exit.  This sappy behaviour is what we have to avoid,
>therefore it's correct to wait until close has _really_ finished
>its job even on nonblocking sockets and even if that's "not
>recommended" by MSDN.  AFAIK, the below code can't loop forever
>since closesocket() will return either 0 or another error code
>at one point.  What I'm just missing now is the error handling after
>finishing the closesocket() loop.  I added that to the repository.

I'm not sure that my solution is correct, though (and, in fact, I checked
it in by mistake).

The Linux socket(7) man page says this about SO_LINGER:

  When enabled, a close(2) or shutdown(2) will not return until all
  queued messages for the socket have been successfully sent or the
  linger timeout has been reached.  Otherwise, the call returns
  immediately and the closing is done in the background.  When the
  socket is closed as part of exit(2), it always lingers in the
  background.
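
For reference, here is a minimal sketch of how the option described
above is switched on through the plain POSIX sockets API.  The helper
name set_linger is made up for illustration; this is not the code in
fhandler_socket::close, just the generic setsockopt call.

```c
#include <sys/socket.h>

/* Hypothetical helper (not from the Cygwin sources): enable or disable
   SO_LINGER on a socket.  With on != 0, close() may wait up to
   `seconds` for queued data to be sent.  Returns 0 on success, -1 on
   failure with errno set by setsockopt.  */
int set_linger(int fd, int on, int seconds)
{
    struct linger lg;

    lg.l_onoff = on;        /* non-zero turns lingering on */
    lg.l_linger = seconds;  /* linger timeout in seconds */
    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof lg);
}
```

Calling set_linger(fd, 1, 5) asks for a five second linger, and
set_linger(fd, 0, 0) goes back to the default background close.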

That sounds like we shouldn't be blocking on close, but should just
be ignoring the EAGAIN.
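
Roughly, the idea would look like the sketch below.  It is written
against plain POSIX close(2) rather than closesocket, and both function
names are hypothetical, so take it as an illustration of "ignore the
would-block error" and not as the actual change.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical predicate: is this errno value one that close should
   swallow instead of reporting it to the caller?  */
int is_ignorable_close_error(int err)
{
    return err == EAGAIN || err == EWOULDBLOCK;
}

/* Hypothetical wrapper: close fd once, without looping or blocking.
   A would-block error is reported as success; any other error is
   passed through with errno left as close() set it.  */
int close_ignoring_wouldblock(int fd)
{
    if (close(fd) == 0)
        return 0;
    if (is_ignorable_close_error(errno))
        return 0;   /* treat as a successful, non-blocking close */
    return -1;
}
```

The point of the split is that the caller never sees EAGAIN from close,
while real failures such as EBADF still come through unchanged.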

I'm going to make that change in a few minutes.

cgf

Follow-Ups:
- Re: Spurious EWOULDBLOCKs on NT4.0
  - From: Corinna Vinschen

References:
- Spurious EWOULDBLOCKs on NT4.0
  - From: Christopher Faylor
- Re: Spurious EWOULDBLOCKs on NT4.0
  - From: Corinna Vinschen