Re: ECONNABORTED and ECONNRESET on TCP socket using recv()

sten.kristian.ivarsson@gmail.com sten.kristian.ivarsson@gmail.com
Fri May 22 13:26:48 GMT 2020


> > > Hi all
> > >
> > > Has anyone experienced getting ECONNABORTED and ECONNRESET on a local
> > > TCP socket when using recv()?
> > >
> > >
> > > We have a fairly complex application that, among other things,
> > > spawns child processes (using posix_spawnp).
> > >
> > > This is a simplified scenario
> > >
> > > - parent performs socket() + bind() + listen() to localhost
> > > - parent spawns a client-child process
> > >   - client-child is doing socket() + connect() to localhost
> > >   - client-child is doing send()
> > >   - client-child is doing recv() and getting ECONNRESET
> > >
> > > - parent performs accept()
> > > - parent spawns a server-child process
> > >   - server-child is doing recv() and getting ECONNABORTED
> > >
> > >
> > > According to strace, both of these errors originate from
> > > fhandler_socket_inet::recv_internal() (in my version it says line
> > > 1221)
> >
> > The errors are generated by the called Windows function WSARecvFrom.
> > We'd need a reproducible testcase for this to allow debugging.
> 
> The application is quite complex, so I guess it won't count as a test case,
> and we have so far failed to reproduce this in a simple manner.
> 
> 
> Looking at strace along with a winsock trace revealed a few mysteries,
> though.
> According to the strace there's a fork for every posix_spawnp, i.e. it
> seems like two processes are created (the forked one later exits) but they
> are somehow tied to the same Cygwin pid. The weird thing is that one of the
> forked "ghost processes" gets a winsock abort event, so my take on this is
> that the dup(lication) of socket descriptors somehow transfers the
> ownership to the wrong process, or perhaps there's some premature release
> or such. The "ghost process" getting the winsock abort event is of a type
> that should "inherit" the accept socket and is called "client-child" in
> the description above.

[snip]

We discovered this to be a defect in our own code: some parts assumed that
struct linger always has two ints, but in Cygwin it has two shorts.
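To illustrate the kind of mismatch described above, here is a minimal sketch.
The struct names two_ints and two_shorts and the helpers misread_linger() and
set_linger() are hypothetical and not taken from our application or from
Cygwin; two_shorts only mirrors the two-short layout mentioned above. The
first helper shows what happens when a buffer laid out as two ints is read as
two shorts; the second fills in struct linger by member name and passes
sizeof (struct linger), which does not depend on the member types.

```c
#include <string.h>
#include <sys/socket.h>

/* Hypothetical stand-in for code that assumes struct linger is two ints. */
struct two_ints { int onoff; int seconds; };

/* Hypothetical stand-in for the two-short layout described for Cygwin. */
struct two_shorts { unsigned short l_onoff; unsigned short l_linger; };

/* Read the first bytes of a two-int buffer as if they were two shorts.
   The result depends on byte order, but the requested timeout does not
   end up in l_linger either way. */
struct two_shorts misread_linger(int onoff, int seconds)
{
    struct two_ints src = { onoff, seconds };
    struct two_shorts dst;

    memcpy(&dst, &src, sizeof dst);
    return dst;
}

/* Set SO_LINGER without assuming anything about the member types.
   Returns whatever setsockopt() returns (0 on success, -1 on error). */
int set_linger(int fd, int seconds)
{
    struct linger lg;

    memset(&lg, 0, sizeof lg);
    lg.l_onoff = 1;          /* enable lingering on close */
    lg.l_linger = seconds;   /* converted to the platform's member type */
    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof lg);
}
```

On a little-endian machine with 16-bit short and 32-bit int,
misread_linger(1, 5) yields l_onoff = 1 and l_linger = 0. Lingering enabled
with a zero timeout normally makes close() abort the connection instead of
shutting it down gracefully, which would be consistent with the resets we
saw, but take that as a guess and not as a verified account of our defect.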

The defect was discovered thanks to a strace debug output:

  if (optlen == (socklen_t) sizeof (int))
    debug_printf ("setsockopt optval=%x", *(int *) optval);

in fhandler_socket_inet::setsockopt, which in itself is kind of weird, i.e.
it seems to assume an int is passed just because optlen is the same size as
an int (and since struct linger happens to be exactly that size, it kind of
helped us :-)
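A small sketch of why that debug line was useful for us. The struct
linger_shorts and the helper linger_seen_as_int() are hypothetical; the
struct only mirrors the two-short layout, and the helper mimics an int-sized
read of the option buffer (using memcpy instead of the pointer cast).

```c
#include <string.h>

/* Hypothetical stand-in for a struct linger made of two shorts. */
struct linger_shorts { unsigned short l_onoff; unsigned short l_linger; };

/* Return the value that an int-sized read of the option buffer would see.
   At most sizeof (unsigned int) bytes are copied. */
unsigned int linger_seen_as_int(unsigned short onoff, unsigned short seconds)
{
    struct linger_shorts lg = { onoff, seconds };
    unsigned int view = 0;
    size_t n = sizeof lg < sizeof view ? sizeof lg : sizeof view;

    memcpy(&view, &lg, n);
    return view;
}
```

Assuming a little-endian machine with 16-bit short and 32-bit int,
linger_seen_as_int(1, 5) is 0x00050001, so a "%x" debug line would show both
members packed into one number (50001). Values that do not match what the
application meant to set stand out quickly in such a trace.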

Keep up the good work
Kristian

[snip] 


