UNIX domain socket patch

Fri Jul 26 17:25:00 GMT 2002

This is a rather long email, so I've made out like I was back in
ISO 9001 land and gone overboard with the "change request".  Enjoy
:-)

** Summary:

This is a patch for a race in cygwin's UNIX domain socket
authentication protocol.  It also fixes a couple of other minor
problems with cygwin's UNIX domain socket implementation.

This patch has been tested on both win2k and win98SE (but not on
winsock 1).  It has been tested with blocking and non-blocking
socket calls (and in the latter case both with polling for
connections and with select(2) calls).  It has also been tested
with a threaded client.

** Other fixes:

This patch also fixes:

 * getpeername(2) for UNIX domain sockets (currently this just
returns the information for the underlying INET socket).

 * with a non-blocking socket, a server couldn't poll a UNIX
domain socket by calling accept(2) repeatedly.

 * a socket could be created with one address family and then
connected/bound to a different one.

 * there was no checking of the UNIX domain socket file, so you
could simply create a standard file with the relevant data in it
and the code would not notice.
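
As an illustration of the second item above, this is the usual
POSIX idiom for polling a non-blocking listening socket by calling
accept(2) repeatedly.  It is only a sketch of the calling pattern
the patch is meant to allow, not code from the patch; the helper
names make_listener and poll_accept are made up for the example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Create a non-blocking AF_UNIX listening socket bound to `path`.
   Returns the descriptor, or -1 on error.  (Hypothetical helper.) */
int make_listener(const char *path)
{
    struct sockaddr_un addr;
    int fd, flags;

    assert(path != NULL);
    if (strlen(path) >= sizeof(addr.sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }
    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);
    unlink(path);               /* remove a stale socket file, if any */

    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0
        || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0
        || bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0
        || listen(fd, 5) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* One polling step: returns a connected descriptor, -1 if no client
   is pending yet (try again later), or -2 on a real error. */
int poll_accept(int listener)
{
    int fd = accept(listener, NULL, NULL);

    if (fd >= 0)
        return fd;
    if (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR)
        return -1;
    return -2;
}
```

A server would call poll_accept() from its main loop, treating -1
as "nothing to do yet" and carrying on with other work.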

** Known issues in this patch:

One issue with the new protocol is that the client cannot close
its connection until the server has performed its half of the
protocol.  This can only be an issue in the following situation:

 * the client just writes to the connection (as a read would block
anyhow)

 * the server is *really* slow at accepting connections (or is
hung)

If close(2) is called explicitly before the client exits, the wait
can be interrupted (by a signal).  Otherwise, the solution is to
kill the server (or kill the client from the Windows Task Manager).

A second issue is that there is no code to handle multi-threading,
which could be a problem if a socket is shared between threads.
This was a problem in the previous version of the code as well.
I'll submit another patch to fix this.

** Problem with previous protocol:

The original authentication protocol relies on both client and
server creating a win32 event object with a "secret" name.  The
secret part of the name is a random key that is stored in the UNIX
domain socket file and is thus only accessible to whoever can read
that file.  Any process (client or server) trying to use the
underlying socket without knowledge of the secret key is prevented
from receiving connections.

The handshake is that on accept or connect, a process sets its own
secret event object and waits on the peer process's secret event
object.  The race is then as follows.  If two clients attempt to
connect to the same server, their two requests will be placed on
the server's connection queue and their connect requests will
succeed.  They will then both signal their own secret events and
wait on the server's secret event.  When the accept request
succeeds in the server with one or other of these two requests,
the server will signal its own secret event and wait on the
relevant peer process's secret event.  This wait in the server
will succeed immediately (since the clients signal their own
events as soon as the connect succeeds for them).  The problem is
that both clients are waiting on the one server secret event
object, and it is possible for the "wrong" client to wake up.  At
this point, the client whose connection has been accepted is still
blocked on the server's secret event, while the client whose
connection is still pending carries on as if it had been
authenticated.  If the server now tries to read some data from the
accepted client, it will block, as that client is itself blocked.
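
The race above can be modelled in a few lines of portable C.  This
is a toy model rather than the cygwin code: it assumes the server's
event releases exactly one waiter per signal (auto-reset
behaviour), and every name in it is invented for the sketch.  The
`woken` argument stands in for the scheduler's choice.

```c
#include <assert.h>
#include <stdbool.h>

enum { MAX_CLIENTS = 4 };

/* Model of the server's secret event, assumed auto-reset: one signal
   releases exactly one of the clients waiting on it. */
struct server_event {
    bool waiting[MAX_CLIENTS];  /* waiting[i]: client i is blocked */
};

/* Client side of the old handshake: connect(2) has succeeded (the
   request is merely queued), so the client signals its own event and
   waits on the server's event. */
void old_client_connect(struct server_event *ev, int client)
{
    assert(client >= 0 && client < MAX_CLIENTS);
    ev->waiting[client] = true;
}

/* Server side: accept(2) returned the connection of `accepted`; the
   server signals its event and the system wakes `woken`, which may be
   any waiter.  Returns true if the woken client is the accepted one. */
bool old_server_accept(struct server_event *ev, int accepted, int woken)
{
    assert(ev->waiting[accepted] && ev->waiting[woken]);
    ev->waiting[woken] = false;         /* that client carries on */
    return woken == accepted;
}
```

With clients 0 and 1 both connected, old_server_accept(&ev, 0, 1)
returns false and leaves waiting[0] set: the accepted client is
still blocked while the pending one believes it is authenticated.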

In testing cygserver with UNIX domain sockets, I was getting such
blocks frequently.  At each block, there is a ten second delay (as
the protocol times out) and that client request fails.

There is also a problem with the protocol in that for non-blocking
connections, where the client calls connect(2) and then waits in
select(2) until the connection is signalled, it never performs its
half of the handshake and so could connect to unauthorized
servers.

** New protocol:

The new protocol is very similar to the original one, using secret
objects with the same names as before, except that now the objects
are semaphores rather than event objects and the processes do
authentication by checking for the existence of the semaphore
rather than by waiting on it.

In detail, both client and server create their semaphore before any
connection is attempted.  In the client's case this implies that it
needs to explicitly bind(2) to a system-provided INET address as
the port number is part of the secret event name (this binding
would otherwise be done implicitly by the call to connect(2) so
this doesn't change the behaviour of the application).  In the
client, when the connect(2) call succeeds it merely needs to check
that the server's secret event exists: if so, it succeeds;
otherwise it resets the connection and fails.
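
The explicit bind step can be sketched with plain BSD socket
calls: bind to port 0 so the system picks a port, then read the
port back with getsockname(2) so it can go into the object name.
This is an illustration only.  The use of the loopback address and
the "demo.secret" name format are assumptions made for the sketch,
not the address or naming scheme cygwin actually uses.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Bind `fd` (an AF_INET socket) to the loopback address with port 0 so
   that the system picks a free port, then report the port chosen.
   Returns the port in host byte order, or -1 on error. */
int bind_to_system_port(int fd)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);

    assert(fd >= 0);
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);  /* assumption */
    addr.sin_port = htons(0);           /* 0 = let the system choose */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        return -1;
    if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0)
        return -1;
    return ntohs(addr.sin_port);
}

/* Build an object name from the port and the key read from the socket
   file.  The name format is hypothetical.  Returns 0 on success, or
   -1 if the name does not fit in `buf`. */
int make_secret_name(char *buf, size_t size, int port, const char *key)
{
    int n = snprintf(buf, size, "demo.secret.%d.%s", port, key);

    return (n > 0 && (size_t)n < size) ? 0 : -1;
}
```

Since the port is known before connect(2) is called, the named
object can already exist by the time the server looks for it.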

The logic is fundamentally the same in the server (when the
accept(2) call succeeds, just check for the existence of the
client secret event) except for one annoying twist.  It is
possible for a client that doesn't read from the socket to connect
to a server, write some data and close the socket *before* the
server's accept(2) call returns that connection (i.e. the whole
thing happens with the client connection sitting on the server's
pending connection queue).  In this case, the server would reject
the connection since the client could have closed its secret event
too soon.  Thus for this one problem, the server does release the
client's semaphore (as an "I've seen your secret event" signal) and
the client waits on this signal in its close(2) code.
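
A toy model of the server-side check and the close-before-accept
twist, for a client holding a single handle.  It is a sketch under
that assumption; the struct and function names are invented, and
the real code works on win32 semaphore handles rather than on a
plain struct.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Model of one client's secret semaphore as seen by the system. */
struct secret_sem {
    bool exists;        /* some handle to the named object is open */
    int  released;      /* number of releases so far */
};

/* Client: the semaphore is created before connect(2) is attempted. */
void client_create(struct secret_sem *s)
{
    assert(s != NULL);
    s->exists = true;
    s->released = 0;
}

/* Server: after accept(2) returns, authenticate by existence only,
   then release the semaphore as the acknowledgement.  Returns true if
   the connection is accepted as authentic. */
bool server_accept_check(struct secret_sem *s)
{
    if (!s->exists)
        return false;   /* no secret object: reset the connection */
    s->released++;
    return true;
}

/* Client close(2): the secret object may only go away once the server
   has acknowledged it.  Returns false if the close has to wait. */
bool client_try_close(struct secret_sem *s)
{
    if (s->released == 0)
        return false;   /* server has not looked yet: keep waiting */
    s->exists = false;
    return true;
}
```

Calling client_try_close() before server_accept_check() returns
false and leaves the object in place, which is what keeps a quick
write-and-close client from being rejected by the server.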

For simplicity in the code, when a secret event is duplicated (by
dup(2) or fork(2) for example) the secret event is also signalled
(in both the server and the client).  The server also signals its
secret event as soon as it creates it.  This means that in a
server, the release count on the semaphore equals the number of
handles there are to that socket (and thus to that event), while
in the client the release count is one lower than the handle count,
so the client blocks in its last close until the server has
signalled the semaphore as well.
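
The counting can be checked with a small model.  It assumes that
every close waits on the semaphore once, consuming one release;
that is an inference from the description above rather than
something read off the patch, and all names are invented.

```c
#include <assert.h>
#include <stdbool.h>

struct sock_sem {
    bool is_server;
    bool server_acked;  /* client only: server released our semaphore */
    int  handles;       /* open handles to the socket */
    int  releases;      /* releases currently held by the semaphore */
};

void model_init(struct sock_sem *s, bool is_server)
{
    s->is_server = is_server;
    s->server_acked = false;
    s->handles = 1;
    s->releases = is_server ? 1 : 0;    /* server signals on creation */
}

/* dup(2) or fork(2): one more handle, and one more release. */
void model_dup(struct sock_sem *s)
{
    s->handles++;
    s->releases++;
}

/* The server's acknowledgement arriving at a client's semaphore. */
void model_server_ack(struct sock_sem *s)
{
    assert(!s->is_server);
    if (!s->server_acked) {
        s->server_acked = true;
        s->releases++;
    }
}

/* Assumed behaviour: each close waits on the semaphore once.
   Returns false if that wait would block. */
bool model_try_close(struct sock_sem *s)
{
    assert(s->handles > 0);
    if (s->releases == 0)
        return false;
    s->releases--;
    s->handles--;
    return true;
}
```

In this model a server with two handles can close both without
blocking, while a client with two handles closes the first and
then blocks on the second until model_server_ack() has been called.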

** Grovelling conclusion:

I think this patch is really groovy and I've just spent the last
week and a half sweating over it, so please apply it :-)

// Conrad

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: ChangeLog.txt
URL: <http://cygwin.com/pipermail/cygwin-patches/attachments/20020726/30627a5e/attachment.txt>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: af_local.patch.bz2
Type: application/octet-stream
Size: 7335 bytes
Desc: not available
URL: <http://cygwin.com/pipermail/cygwin-patches/attachments/20020726/30627a5e/attachment.obj>