This is the mail archive of the cygwin mailing list for the Cygwin project.



Re: Postgres Backend doesn't catch the next command, after SIGUSR2


On Mar 29 04:14, Patrick Samson wrote:
> ! The explanation is spotted in net.cc !
> 
> in wsock_event::wait()
> 
>       case WSA_WAIT_EVENT_0 + 1:
> 	if (!CancelIo ((HANDLE) socket))
> 	  {
> 	    debug_printf ("CancelIo() %E, fallback to blocking io");
> 	    WSAGetOverlappedResult (socket, &ovr, &len, TRUE, flags);
> 	  }
> 	else
> 	  WSASetLastError (WSAEINTR);
> 	break;
> 
> Most of the time, when signal_arrived is raised,
> there is nothing but the EINTR code to set, and
> the backend loops in recv() to receive the next
> command.
> But the race conditions may be different, and
> the command is available at the same time the
> signal is detected.
> So the CancelIo() call discards the command.
> When the backend returns in recv(), the command
> is lost, and the sender waits for an answer
> -> deadlock
> 
> Why this CancelIo() ??
> It seems too intrusive.

When WSAWaitForMultipleEvents returns WSA_WAIT_EVENT_0 + 1, you can be
sure that the I/O event hasn't happened at this point.  Otherwise it
would have returned WSA_WAIT_EVENT_0.  Unfortunately this doesn't mean
that the event couldn't happen a nanosecond later.

If the signal has arrived and the WSARecvFrom call should be interrupted,
you can't just go ahead, since the call to WSARecvFrom got a pointer
to application-allocated memory.  You can't rely on the application
keeping this memory intact after recvfrom has returned with EINTR.  If
you do, Windows might scramble application memory.  To avoid
that, the CancelIo cancels the active call.
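
To make the buffer hazard concrete, here is a small portable model, not
Winsock code.  InFlightRecv, cancel, complete and interrupt_recv are
hypothetical names invented for this illustration; they only mimic the
idea that a pending receive holds a pointer into the caller's buffer
until it is cancelled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical model of an in-flight overlapped receive; not a Winsock type.
struct InFlightRecv {
  char *dest;        // caller-owned buffer the "system" will write into
  std::size_t cap;   // capacity of that buffer

  // Models CancelIo: the system drops its reference to the caller's buffer.
  void cancel () { dest = nullptr; cap = 0; }

  // Models a late completion: stores data only if the operation is still
  // active.  Returns the number of bytes stored.
  std::size_t complete (const char *data, std::size_t n)
  {
    assert (data != nullptr);
    if (!dest)
      return 0;               // cancelled: the caller's memory is untouched
    if (n > cap)
      n = cap;
    std::memcpy (dest, data, n);
    return n;
  }
};

// Models recvfrom() about to return EINTR: cancel before handing the
// buffer back, so the caller may reuse or free it afterwards.
void interrupt_recv (InFlightRecv &op)
{
  op.cancel ();
}
```

In this model a completion that arrives after interrupt_recv() stores
nothing, while the same completion on an operation that was never
cancelled overwrites the buffer.  That overwrite is the scribbling the
CancelIo call is there to prevent.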

Having said that, does the change below at least alleviate the problem?

The implementation would have to be changed a bit more to get this
entirely non-racy, though.
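
The order of checks in the signal branch can be sketched as a portable
model.  This is not the net.cc code: PendingRecv and on_signal are
hypothetical names, and the flags stand in for what the real calls would
report.  Returning the byte count from the blocking fallback is also an
assumption, since that part of the function is not shown in the diff.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical stand-in for a pending overlapped receive; not a Winsock type.
struct PendingRecv {
  bool completed;   // data has already landed in the caller's buffer
  int  bytes;       // byte count the operation reports once complete
  bool cancel_ok;   // whether a cancel request would succeed
};

// Models the WSA_WAIT_EVENT_0 + 1 (signal) branch in the proposed order:
//   1. non-blocking poll for an already completed result,
//   2. otherwise cancel and report EINTR,
//   3. if cancelling fails, fall back to waiting for the result.
// Returns the byte count, or -1 with errno set to EINTR.
int on_signal (PendingRecv &op)
{
  assert (op.bytes >= 0);
  if (op.completed)        // models WSAGetOverlappedResult (..., FALSE, ...)
    return op.bytes;
  if (op.cancel_ok)        // models CancelIo succeeding
    {
      errno = EINTR;
      return -1;
    }
  // Models the blocking fallback, WSAGetOverlappedResult (..., TRUE, ...).
  // Assumption: the result is returned to the caller once it is available.
  op.completed = true;
  return op.bytes;
}
```

The model treats steps 1 and 2 as one atomic decision.  In the real code
the receive can still complete between the poll and the cancel, which is
the remaining window mentioned above.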

> Additional note:
> DWORD len;
> is present in case WSA_WAIT_EVENT_0
> but is missing in case WSA_WAIT_EVENT_0 + 1

Thanks for catching this.  I've applied a patch.

Corinna

Index: net.cc
===================================================================
RCS file: /cvs/src/src/winsup/cygwin/net.cc,v
retrieving revision 1.162
diff -u -p -r1.162 net.cc
--- net.cc	29 Mar 2004 14:08:44 -0000	1.162
+++ net.cc	29 Mar 2004 14:09:17 -0000
@@ -83,7 +83,9 @@ wsock_event::wait (int socket, LPDWORD f
 	  ret = (int) len;
 	break;
       case WSA_WAIT_EVENT_0 + 1:
-	if (!CancelIo ((HANDLE) socket))
+	if (WSAGetOverlappedResult (socket, &ovr, &len, FALSE, flags))
+	  ret = (int) len;
+	else if (!CancelIo ((HANDLE) socket))
 	  {
 	    debug_printf ("CancelIo() %E, fallback to blocking io");
 	    WSAGetOverlappedResult (socket, &ovr, &len, TRUE, flags);


-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Developer                                mailto:cygwin@cygwin.com
Red Hat, Inc.
