This is the mail archive of the cygwin-developers@cygwin.com mailing list for the Cygwin project.



Re: PLEASE TEST: New implementation of blocking socket I/O


At 05:36 PM 3/31/2004 +0200, Corinna Vinschen wrote:
>On Mar 31 09:55, Pierre A. Humblet wrote:
>> I am hesitant to start sinking time in this. If I understand correctly you
>> are trying to fix a race. They tend to go away while running under gdb.
>
>The race is gone with this code.  It's not a problem to debug it
>with gdb.
>
>> In 310663 = write (3, 0x7DE810, 1), it's likely that 310663 is larger
>> than the buffer size. Where can that come from? Can you add a trap?
>
>There's a chance that the first call to WSAwhatever already returns
>WSAEWOULDBLOCK and ret is uninitialized.  I've applied a fix which
>always sets ret to 0 in calls to WSAwhatever.  Does that help?
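
To illustrate the suspected failure mode quoted above, here is a minimal sketch. It assumes a send-style call that reports the byte count through an out-parameter and leaves it untouched on failure. `fake_send` and `write_once` are hypothetical names, not Cygwin or Winsock functions; the point is only that initializing `ret` keeps a failed first call from leaking a garbage count such as 310663.

```cpp
#include <cassert>

// Hypothetical stand-in for a WSASend-style call (not a real Winsock function):
// on failure it returns -1 and does NOT write to *sent.
static int fake_send(bool would_block, unsigned long *sent) {
    if (would_block)
        return -1;          // models an immediate "would block" failure
    *sent = 1024;           // models a successful 1024-byte write
    return 0;
}

// Hypothetical wrapper: reports how many bytes have been written so far.
static long write_once(bool would_block_first) {
    unsigned long ret = 0;  // initialized, so a failed call cannot leave garbage here
    int rc = fake_send(would_block_first, &ret);
    if (rc != 0) {
        // A real implementation would wait for the socket and retry here;
        // the sketch simply falls through with ret still 0.
    }
    return static_cast<long>(ret);
}
```

With `ret` left uninitialized, the failing path would return whatever happened to be on the stack.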

It makes a difference, but I am not sure why; it shouldn't be necessary
AFAICS. At least I more or less consistently saw:

  264 5385988 [main] cvs 91960353 __set_errno: void __set_winsock_errno(const char*, int):366 val 1
  154 5386142 [main] cvs 91960353 __set_winsock_errno: sendmsg:1100 - winsock error 6 -> errno 1

When stepping through the code I noticed that wsock_event::release
was calling WSACloseEvent. Later the wsock_event destructor was calling
WSACloseEvent again on the already closed event, which fails and sets the
wsock errno (presumably the error 6 above). So I fixed that, crudely, by
moving both wsock_event::release and the destructor out of the way so that
they no longer affect the wsock errno.
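
A less crude way to avoid the double close would be to make the release idempotent. The following is only a sketch under stated assumptions: `fake_create_event`, `fake_close_event`, `event_guard` and the globals are hypothetical stand-ins so that the example runs anywhere. They are not the Cygwin `wsock_event` class or the real `WSACloseEvent`.

```cpp
#include <cassert>

// Hypothetical stand-ins so the sketch runs anywhere; these are not Winsock calls.
static bool g_open[4] = {false, false, false, false};
static int g_last_error = 0;   // plays the role of the wsock errno
static int g_close_calls = 0;  // counts every close attempt

static int fake_create_event() {
    for (int i = 0; i < 4; ++i)
        if (!g_open[i]) { g_open[i] = true; return i; }
    return -1;                 // no free slot
}

static bool fake_close_event(int ev) {
    ++g_close_calls;
    if (ev < 0 || ev >= 4 || !g_open[ev]) {
        g_last_error = 6;      // closing a dead handle clobbers the error code
        return false;
    }
    g_open[ev] = false;
    return true;
}

class event_guard {
    int ev_;
public:
    event_guard() : ev_(fake_create_event()) {}
    event_guard(const event_guard &) = delete;
    event_guard &operator=(const event_guard &) = delete;

    // Close at most once, then forget the handle.
    void release() {
        if (ev_ >= 0) {
            fake_close_event(ev_);
            ev_ = -1;
        }
    }

    ~event_guard() { release(); }  // no-op if release() already ran
};

// Explicit release followed by destruction; returns the number of close calls.
static int closes_after_release_and_destroy() {
    g_close_calls = 0;
    g_last_error = 0;
    {
        event_guard guard;
        guard.release();
    }   // destructor runs here
    return g_close_calls;
}
```

Because the handle is invalidated after the first close, the destructor never issues a second close and the error code stays untouched.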

Then I started seeing:

  279 3295264 [main] cvs 295057 writev: 1024 = write (3, 0x7DE7E0, 1), errno 2
  344 3295608 [main] cvs 295057 sig_dispatch_pending: exit_state 0, cur thread id 0xFFF0A123, sigtid 0xFFEF130B, sigq.start.next 0x0
  188 3295796 [main] cvs 295057 writev: writev (3, 0x7DE810, 1)
  433 3296229 [main] cvs 295057 __set_errno: void __set_winsock_errno(const char*, int):367 val 0
  209 3296438 [main] cvs 295057 __set_winsock_errno: sendmsg:1102 - winsock error 0 -> errno 0

i.e. the call fails but the wsock errno is 0. I made a couple more changes in
wsock_event::wait, where the code could wipe out the wsock errno (see patches).

No luck: the wsock errno is still 0, yet sendmsg clearly fails.
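
For reference, the kind of protection attempted in the wait code can be sketched as a save-and-restore guard around cleanup. This is an illustration only: it uses plain `errno` instead of the Winsock last-error value (where `WSAGetLastError` and `WSASetLastError` would be the counterparts), and `noisy_cleanup`, `errno_saver` and `failing_send_then_cleanup` are hypothetical names, not code from the attached diff.

```cpp
#include <cassert>
#include <cerrno>

// Hypothetical cleanup step that clobbers errno as a side effect.
static void noisy_cleanup() { errno = 0; }

// Saves errno on construction and restores it on destruction.
class errno_saver {
    int saved_;
public:
    errno_saver() : saved_(errno) {}
    errno_saver(const errno_saver &) = delete;
    errno_saver &operator=(const errno_saver &) = delete;
    ~errno_saver() { errno = saved_; }
};

// Hypothetical failing send followed by cleanup; returns the errno the caller sees.
static int failing_send_then_cleanup(bool preserve) {
    errno = EPIPE;              // models the send itself failing
    if (preserve) {
        errno_saver keep;       // restores EPIPE when this scope ends
        noisy_cleanup();
    } else {
        noisy_cleanup();        // the real error is lost here
    }
    return errno;
}
```

Without the guard the caller sees error 0 even though the operation failed, which is the symptom in the trace above.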

When I put a breakpoint in the wait loop, nothing bad happens.
Debugging non-blocking and wait stuff with gdb is a long shot.

And yes, from time to time it behaves as in my first report, hanging at:
  158 2552618 [main] cvs 540017 readv: readv (4, 0x7DEBC0, 1) blocking, sigcatchers 6
  154 2552772 [main] cvs 540017 readv: no need to call ready_for_read

I have to stop for now; I need to sleep on this anyway.
 
Perhaps you would have more luck triggering this bug if
you tried a low speed (modem) connection.

Pierre

P.S.: The diff is FYI, not meant to be permanent.  

Attachment: sock.diff
Description: Text document

