This is the mail archive of the cygwin mailing list for the Cygwin project.



Possible race in SYSV IPC (semaphores)


Hi,

For now, I can only report the observed misbehavior of the SYSV semop() call,
which shows up on the client side as the following message:

transport_layer_pipes::connect: lost connection to cygserver, error = 2

(The application code then adjusts the semaphore by hand with semctl(SETVAL).)
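
A minimal sketch of what such a by-hand adjustment could look like. This is
not the actual application code: the helper name sem_adjust(), the use of
IPC_NOWAIT, the errno values treated as "not broken", and the fallback value
are all assumptions made for illustration.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Argument type for semctl().  Some platforms define "union semun" in
 * <sys/sem.h> and some do not, so this sketch uses its own name with the
 * same layout. */
union semun_arg {
    int              val;
    struct semid_ds *buf;
    unsigned short  *array;
};

/* Hypothetical helper: apply `delta` to one semaphore with semop().
 * If semop() fails for a reason other than EINTR/EAGAIN, force the
 * semaphore to `fallback_value` with semctl(SETVAL).
 * Returns 0 if semop() succeeded, 1 if the fallback was applied,
 * and -1 if nothing could be done. */
static int sem_adjust(int semid, unsigned short semnum, short delta,
                      int fallback_value)
{
    struct sembuf op;
    union semun_arg arg;

    op.sem_num = semnum;
    op.sem_op  = delta;
    op.sem_flg = IPC_NOWAIT;        /* assumption: never block here */

    if (semop(semid, &op, 1) == 0)
        return 0;
    if (errno == EINTR || errno == EAGAIN)
        return -1;                  /* ordinary failure, no repair attempted */

    arg.val = fallback_value;       /* by-hand adjustment */
    return semctl(semid, semnum, SETVAL, arg) == 0 ? 1 : -1;
}
```

Note that SETVAL overwrites the semaphore unconditionally, so a repair like
this is only safe if no other process is relying on the old value.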

Note that there is a dedicated cygserver process running for my single-threaded
application.

Looking at the debugging output of cygserver, this is what I see in the log
(around the only time semctl() is logged there):

cygserver: /home/corinna/src/cygwin/cygwin-1.7.15/cygwin-1.7.15-1/src/cygwin-1.7.15/winsup/cygserver/transport_pipes.cc, line 132: Try to create named pipe: \\.\pipe\cygwin-13a7ed34cc1953a9-lpc
cygserver: /home/corinna/src/cygwin/cygwin-1.7.15/cygwin-1.7.15-1/src/cygwin-1.7.15/winsup/cygserver/transport_pipes.cc, line 132: Try to create named pipe: \\.\pipe\cygwin-13a7ed34cc1953a9-lpc

Note the two pipe-creation attempts, but only a single "leaving" log line such as:

cygserver: /home/corinna/src/cygwin/cygwin-1.7.15/cygwin-1.7.15-1/src/cygwin-1.7.15/winsup/cygserver/sem.cc, line 81: leaving (3416)

Cygserver does not stop.  The program also keeps running, since it sets SIGSYS
to be ignored, although not always quite successfully once the semop() failure
has occurred.
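
For reference, a minimal sketch of ignoring SIGSYS and then checking the
result of semop() explicitly.  The helper names ignore_sigsys() and
checked_semop() are hypothetical, and the premise that Cygwin raises SIGSYS
when a SYSV IPC call cannot be serviced by cygserver is an assumption here,
not something shown in the logs above.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* Ignore SIGSYS so that a failing SYSV IPC call returns an error
 * instead of terminating the process.  Returns 0 on success. */
static int ignore_sigsys(void)
{
    return signal(SIGSYS, SIG_IGN) == SIG_ERR ? -1 : 0;
}

/* Hypothetical wrapper: run semop() and log errno on failure, so that
 * a failed call is noticed rather than silently ignored. */
static int checked_semop(int semid, struct sembuf *ops, size_t nops)
{
    int rc = semop(semid, ops, nops);
    if (rc != 0) {
        int saved = errno;          /* keep errno intact across fprintf() */
        fprintf(stderr, "semop(%d) failed: %s\n", semid, strerror(saved));
        errno = saved;
    }
    return rc;
}
```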

The semaphore operations are very intensive, and at times involve arrays of
5 semaphores;  there are also quite large chunks of shared memory updated every
now and then.
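
To illustrate the kind of call involved, here is a minimal sketch of a single
semop() that operates on an array of 5 semaphores at once.  The helper name
sem_all(), the NSEMS constant, and the choice of applying the same delta to
every semaphore with no flags are assumptions; the real locking logic is more
involved.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

#define NSEMS 5   /* size of the semaphore array used in this sketch */

/* Hypothetical helper: apply `delta` to all NSEMS semaphores of the set
 * in one semop() call.  The kernel (or cygserver, on Cygwin) applies
 * either all of the operations or none of them.
 * Returns 0 on success, -1 on error with errno set. */
static int sem_all(int semid, short delta)
{
    struct sembuf ops[NSEMS];
    unsigned short i;

    for (i = 0; i < NSEMS; i++) {
        ops[i].sem_num = i;       /* index within the set */
        ops[i].sem_op  = delta;   /* +n releases, -n acquires */
        ops[i].sem_flg = 0;       /* may block until all ops can proceed */
    }
    return semop(semid, ops, NSEMS);
}
```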

I studied the cygserver source and noticed that pipe_instance (transport_pipes.cc)
is not declared "volatile".  This seems strange, because without it the compiler
is free to reorder or optimize away accesses to this variable.  And that seems
rather critical.
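
As a point of comparison, "volatile" by itself only constrains how the
compiler treats accesses to the variable; it does not make a read-modify-write
atomic between threads.  A minimal sketch of a shared instance counter using
C11 atomics follows.  This is not cygserver's code: the names instance_count,
instance_acquire() and instance_release() are hypothetical, and whether
pipe_instance is already protected by other means has not been checked here.

```c
#include <stdatomic.h>

/* Hypothetical shared counter of live pipe instances. */
static atomic_long instance_count = 0;

/* Atomically register one more instance; returns the new count. */
static long instance_acquire(void)
{
    return atomic_fetch_add(&instance_count, 1) + 1;
}

/* Atomically drop one instance; returns the new count. */
static long instance_release(void)
{
    return atomic_fetch_sub(&instance_count, 1) - 1;
}
```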

Right now what I observe is that SYSV IPC is unreliable, and I have yet to figure
out why;  the very same code (the locking logic) has worked on Linux/Solaris/Mac
for years and on thousands (yes, that many) of hosts.  With Cygwin the instability
can appear after anywhere from a few minutes to many hours of run time, seemingly
at random.

Any input would be greatly appreciated.

Anton Lavrentiev
Contractor NIH/NLM/NCBI


--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple

