Re: connect() hangs on a listen()ing AF_UNIX socket

Corinna Vinschen wrote:
On Aug 21 21:14, Christian Franke wrote:
Complex but may work: A fhandler_socket::listen() on an AF_UNIX/SOCK_STREAM
socket starts a thread which accept()s connections, performs the handshake
and puts the new socket descs in a queue. fhandler_socket::accept4() then no
longer calls accept() but waits for the next entry in the queue.
Yeah, that might be very tricky, especially if the executable forks and
execs after calling listen.

That would require passing an accept()ed handle from parent to (grand)child. Let's forget this option for now.

The problem is that the packet exchange at the start of an
accept/connect is required to be able to exchange credentials.  This in
turn is required for getpeereid and the SO_PEERCRED socket option which
is utilized at least by sshd.
Easier and may work for Postfix: Add a Cygwin specific socket option like
SO_DONT_NEED_PEERCRED which is set immediately after Postfix calls
socket(AF_UNIX, SOCK_STREAM). If set, no handshake occurs on
connect()/accept(). getpeereid()/SO_PEERCRED should fail then.
Well, it's not *only* SO_PEERCRED.  Another, the older part of the
handshake, is about recognizing the peer.  Since AF_UNIX sockets don't
exist on Windows, Cygwin is using AF_INET sockets under the hood, and
so *any* Windows process could accidentally connect to a Cygwin AF_UNIX
socket.  The handshake also aims to avoid this scenario.  Only if the
handshake worked, the peers can be sure to talk to another Cygwin
process assuming an AF_UNIX socket.

A Cygwin-specific socket option which switches off the handshake would
disallow this peer recognition.  How bad is that?  I'm not sure.

Good question.

Another potential solution might be to defer the AF_UNIX handshake to
the first send/recv:

Whatever the peers do, there is a certain protocol used.  That means,
there's an implicit understanding who's going to do the first send and
who's doing the first recv.  So, after connect/accept, both sides of the
sockets go into "connected_but_handshake_missing" mode.  On the first
send/recv, the handshake gets started and if it fails, send/recv

Is an actual handshake really required? It might be sufficient for each peer to send its secret+credential and then expect a correct secret+credential from the other peer before sending anything else.

After actual connect()/accept():

  send our secret+cred (should not block due to TCP queuing).
  if (! nonblocking recv peer secret+cred)
    state = connected_but_secret_missing;

Before actual send()/recv()/getpeereid():

  if (state == connected_but_secret_missing) {
    if (! recv peer secret+cred)
      fail the call;
    state = connected;
  }

AFAICS this should provide the behavior required for Postfix: client connect() succeeds before server accept(). It adds the following unusual behavior: client send() and getpeereid() wait for server accept().
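
To make the idea more concrete, here is a rough C sketch of the deferred exchange. It is not Cygwin code: the names hs_sock, hs_msg, hs_after_connect() and hs_before_io(), the 16-byte secret and the message layout are all made up for illustration, and it assumes both peers already know the same secret and that MSG_DONTWAIT, MSG_PEEK and MSG_WAITALL are available.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define SECRET_LEN 16   /* hypothetical secret size */

enum hs_state { HS_SECRET_MISSING, HS_CONNECTED, HS_FAILED };

/* Hypothetical wire format: shared secret plus the sender's credentials. */
struct hs_msg {
    unsigned char secret[SECRET_LEN];
    pid_t pid;
    uid_t uid;
    gid_t gid;
};

struct hs_sock {
    int fd;
    enum hs_state state;
    unsigned char secret[SECRET_LEN];  /* secret the peer must present */
    struct hs_msg peer;                /* valid once state == HS_CONNECTED */
};

/* Read the peer's message. With block == 0, consume it only if it is
   already queued completely. Returns 1 if a valid message was received,
   0 if it is not available yet, -1 on error, EOF or wrong secret. */
static int hs_recv_peer(struct hs_sock *s, int block)
{
    struct hs_msg m;
    ssize_t n;

    if (!block) {
        n = recv(s->fd, &m, sizeof m, MSG_PEEK | MSG_DONTWAIT);
        if (n < 0)
            return (errno == EAGAIN || errno == EWOULDBLOCK) ? 0 : -1;
        if (n == 0)
            return -1;              /* peer closed the connection */
        if ((size_t)n < sizeof m)
            return 0;               /* only partially queued so far */
    }
    n = recv(s->fd, &m, sizeof m, MSG_WAITALL);
    if (n != (ssize_t)sizeof m)
        return -1;
    if (memcmp(m.secret, s->secret, SECRET_LEN) != 0)
        return -1;                  /* not a peer that knows the secret */
    s->peer = m;
    return 1;
}

/* Call right after the real connect()/accept() has returned. */
int hs_after_connect(struct hs_sock *s, int fd, const unsigned char *secret)
{
    struct hs_msg m;

    memset(&m, 0, sizeof m);        /* do not leak padding bytes */
    memcpy(m.secret, secret, SECRET_LEN);
    m.pid = getpid();
    m.uid = geteuid();
    m.gid = getegid();

    s->fd = fd;
    memcpy(s->secret, secret, SECRET_LEN);

    /* Small message, expected to be queued without blocking. */
    if (send(fd, &m, sizeof m, 0) != (ssize_t)sizeof m) {
        s->state = HS_FAILED;
        return -1;
    }
    switch (hs_recv_peer(s, 0)) {
    case 1:  s->state = HS_CONNECTED;      return 0;
    case 0:  s->state = HS_SECRET_MISSING; return 0;
    default: s->state = HS_FAILED;         return -1;
    }
}

/* Call before the real send()/recv()/getpeereid(). Blocks until the
   peer's secret+cred has arrived if it is still missing. */
int hs_before_io(struct hs_sock *s)
{
    if (s->state == HS_SECRET_MISSING)
        s->state = hs_recv_peer(s, 1) == 1 ? HS_CONNECTED : HS_FAILED;
    return s->state == HS_CONNECTED ? 0 : -1;
}
```

With this, the side that connects first ends up in HS_SECRET_MISSING without blocking, and only its first hs_before_io() waits for the other side. The sketch ignores nonblocking sockets, select()/poll() and fork/exec, which a real implementation would have to deal with.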


