This is the mail archive of the cygwin@sources.redhat.com mailing list for the Cygwin project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: socket read()/write() problem in 1.1.6


Hi,

Corinna Vinschen writes:
 > Markus Hoenicka wrote:
 > > 
 > > Hi,
 > > 
 > > I stumbled over a problem with socket reading/writing after I upgraded
 > > from cygwin1.dll version 1.1.4 to 1.1.6
 > > [...]
 > 
 > Did you strace the server and/or client on Cygwin? I wonder what's
 > going on there. If this would be a 100% reproducable problem in
 > Cygwin, all my ssh connections all day would fail, so...
 > 
I'm aware that if read() or write() were plain buggy the list would be
full of complaints. However I see the problem on two Cygwin
installations and I *always* see it, so its not some random thing. I
have had some dangling pointers and similar problems, but all these
seem to be fixed now. They would crash the program immediately on both
Linux and Cygwin. The current problem never crashes the app, it just
makes it do the wrong thing consistently with a newer Cygwin DLL. This
certainly does not mean that something with Cygwin read() or write()
must be wrong. It could as well be some small memory corruption that
happens to be without effect on Linux and with older Cygwin versions.

I tested several Cygwin DLLs. 1.1.3 and 1.1.4 *never* show the
problem. 1.1.5 and 1.1.6 *always* show the problem.

I can use my Cygwin client running with DLL 1.1.6 and connect to a
server running on Linux without problem. The other way round (client
on Linux and server on Cygwin DLL 1.1.6) does have the problem, so it
appears the problem is on the write() side of the socket.


 > A little bit of debugging would be gladly appreciated.
 > 
I'll do my best as I can't expect that someone else is debugging this
for me. I was just asking because this might ring some bell. I looked
at the strace output but I'm afraid that my knowledge is not
sufficient to make much sense of it (my degree is in Biology after
all, not CS). There are certainly some differences but I'd need
someone to look at that who knows a bit more about socket
communication in older and recent Cygwin DLLs. The relevant parts of
the server straces are appended below (yes I know nobody is begging to
read that, but anyway...). On the client side I basically
see what my debug printf()s also reported: After the first successful
read() there is a successful select() but the next read() fails with
__set_winsock_errno: read:119 - 10054 (WSAECONNRESET) -> 104

 > However, could you please provide the sources of two testapps (server,
 > client) which reproducable show the problem on your box?
 > 
Oh well. I'm looking at 700 kb of obfuscate sources which were not yet
cleaned up for release. I could just make a tarball of this. The
cleaner way would certainly be to cut it down to a minimal testcase,
but this would take quite some time. I'll try to come up with
something unless someone can tell from the strace what might be wrong.

Thanks for your help
Markus

Cygwin dll 1.1.4
---------------------------------------------------------------------
  420 26972362 [main] refdbs 1018 select_stuff::wait: n 1, ms 15000
13042 26985404 [select_socket] refdbs 1018 thread_socket: stuff_start 0xA05DE94
  346 26985750 [select_socket] refdbs 1018 thread_socket: Win32 select returned 1
  210 26985960 [select_socket] refdbs 1018 thread_socket: s 0xA053FE0, testing fd 5 (/dev/tcp)
  198 26986158 [select_socket] refdbs 1018 thread_socket: write_ready
  618 26986776 [main] refdbs 1018 select_stuff::wait: woke up.  wait_ret 1.  verifying
  213 26986989 [main] refdbs 1018 select_stuff::wait: gotone 1
  880 26987869 [main] refdbs 1018 peek_socket: considering handle 0x3C
  194 26988063 [main] refdbs 1018 peek_socket: adding write fd_set /dev/tcp, fd 5
  194 26988257 [main] refdbs 1018 peek_socket: adding except fd_set /dev/tcp, fd 5
  236 26988493 [main] refdbs 1018 peek_socket: WINSOCK_SELECT returned 1
  201 26988694 [main] refdbs 1018 set_bits: me 0xA053FE0, testing fd 5 (/dev/tcp)
  198 26988892 [main] refdbs 1018 set_bits: ready 1
  191 26989083 [main] refdbs 1018 select_stuff::poll: returning 1
  192 26989275 [main] refdbs 1018 select_stuff::~select_stuff: calling cleanup routines
  319 26989594 [main] refdbs 1018 socket_cleanup: si 0xA05AE70 si->thread 0xC0
  196 26989790 [main] refdbs 1018 socket_cleanup: connection to si->exitsock 0x48
 1617 26991407 [main] refdbs 1018 socket_cleanup: returning
  200 26991607 [main] refdbs 1018 select_stuff::~select_stuff: deleting select records
 1014 26992621 [main] refdbs 1018 _write: write (5, 0x245E980, 34)
44258 27036879 [main] refdbs 1018 _write: 34 = write (5, 0x245E980, 34)
  269 27037148 [main] refdbs 1018 cygwin_select: 6, 0x0, 0x245F938, 0x245F930, 0x245F928
 1027 27038175 [main] refdbs 1018 hinfo::select_write: /dev/tcp fd 5
  838 27039013 [main] refdbs 1018 hinfo::select_except: /dev/tcp fd 5
  191 27039204 [main] refdbs 1018 cygwin_select: to->tv_sec 15, to->tv_usec 0, ms 15000
  317 27039521 [main] refdbs 1018 cygwin_select: sel.always_ready 0
  374 27039895 [main] refdbs 1018 start_thread_socket: Handle 0x3C
  191 27040086 [main] refdbs 1018 start_thread_socket: Added to writefds
  192 27040278 [main] refdbs 1018 start_thread_socket: Added to exceptfds
  863 27041141 [main] refdbs 1018 start_thread_socket: exitsock 0x88
  214 27041355 [main] refdbs 1018 start_thread_socket: stuff_start 0x245F820
  424 27041779 [main] refdbs 1018 select_stuff::wait: n 1, ms 15000
41400 27083179 [select_socket] refdbs 1018 thread_socket: stuff_start 0xA05DE94
  339 27083518 [select_socket] refdbs 1018 thread_socket: Win32 select returned 1
  214 27083732 [select_socket] refdbs 1018 thread_socket: s 0xA051E60, testing fd 5 (/dev/tcp)
  200 27083932 [select_socket] refdbs 1018 thread_socket: write_ready
  554 27084486 [main] refdbs 1018 select_stuff::wait: woke up.  wait_ret 1.  verifying
  210 27084696 [main] refdbs 1018 select_stuff::wait: gotone 1
  849 27085545 [main] refdbs 1018 peek_socket: considering handle 0x3C
  193 27085738 [main] refdbs 1018 peek_socket: adding write fd_set /dev/tcp, fd 5
  193 27085931 [main] refdbs 1018 peek_socket: adding except fd_set /dev/tcp, fd 5
  234 27086165 [main] refdbs 1018 peek_socket: WINSOCK_SELECT returned 1
  202 27086367 [main] refdbs 1018 set_bits: me 0xA051E60, testing fd 5 (/dev/tcp)
  195 27086562 [main] refdbs 1018 set_bits: ready 1
  191 27086753 [main] refdbs 1018 select_stuff::poll: returning 1
  192 27086945 [main] refdbs 1018 select_stuff::~select_stuff: calling cleanup routines
  196 27087141 [main] refdbs 1018 socket_cleanup: si 0xA05AE70 si->thread 0xAC
  193 27087334 [main] refdbs 1018 socket_cleanup: connection to si->exitsock 0x88
 1647 27088981 [main] refdbs 1018 socket_cleanup: returning
  198 27089179 [main] refdbs 1018 select_stuff::~select_stuff: deleting select records
 1742 27090921 [main] refdbs 1018 _write: write (5, 0xA0516C0, 1)
  228 27091149 [main] refdbs 1018 _write: 1 = write (5, 0xA0516C0, 1)
  848 27091997 [main] refdbs 1018 _close: close (5)
78812 27170809 [main] refdbs 1018 _close: 0 = close (5)
  264 27171073 [main] refdbs 1018 do_exit: do_exit (0)
  211 27171284 [main] refdbs 1018 void: 0x1 = signal (20, 0x1)
  197 27171481 [main] refdbs 1018 void: 0x0 = signal (1, 0x1)
  192 27171673 [main] refdbs 1018 void: 0x0 = signal (2, 0x1)
  194 27171867 [main] refdbs 1018 void: 0x0 = signal (3, 0x1)
  836 27172703 [main] refdbs 1018 _close: close (0)
 1210 27173913 [main] refdbs 1018 fhandler_base::set_io_handle: set handle to 0xFFFFFFFF
 1648 27175561 [main] refdbs 1018 _close: 0 = close (0)
  833 27176394 [main] refdbs 1018 _close: close (1)
 1163 27177557 [main] refdbs 1018 fhandler_base::close: handle 0x2C
 1653 27179210 [main] refdbs 1018 _close: 0 = close (1)
31997 27211207 [main] refdbs 1018 _close: close (2)
 1219 27212426 [main] refdbs 1018 fhandler_base::set_io_handle: set handle to 0xFFFFFFFF
 1649 27214075 [main] refdbs 1018 _close: 0 = close (2)
  833 27214908 [main] refdbs 1018 _close: close (3)
 1149 27216057 [main] refdbs 1018 fhandler_base::close: handle 0x70
 1676 27217733 [main] refdbs 1018 _close: 0 = close (3)
  832 27218565 [main] refdbs 1018 _close: close (4)
 3050 27221615 [main] refdbs 1018 _close: 0 = close (4)
17524 27239139 [main] refdbs 1018 proc_terminate: nchildren 0, nzombies 0
  197 27239336 [main] refdbs 1018 proc_terminate: leaving
  192 27239528 [main] refdbs 1018 sigproc_terminate: entering
  381 27239909 [main] refdbs 1018 sigproc_terminate: done
  240 27240149 [main] refdbs 1018 __to_clock_t: dwHighDateTime 0, dwLowDateTime 701008
  199 27240348 [main] refdbs 1018 __to_clock_t: total 00000000 00000046
  198 27240546 [main] refdbs 1018 __to_clock_t: dwHighDateTime 0, dwLowDateTime 500720
  195 27240741 [main] refdbs 1018 __to_clock_t: total 00000000 00000032
  224 27240965 [main] refdbs 1018 my_parent_is_alive: parent still alive
  205 27241170 [main] refdbs 1018 do_exit: calling ExitProcess 0
305881 27250913 [proc] refdbs 1014 proc_subproc: args: 3, 0
  299 27251212 [proc] refdbs 1014 proc_subproc: pid 1018[0] terminated, handle 0x104, nchildren 1, nzombies 0
  206 27251418 [proc] refdbs 1014 remove_child: removing [0], pid 1018, handle 0x104, nchildren 1
  202 27251620 [proc] refdbs 1014 sig_send: pid 1014, signal 20, its_me 1
  202 27251822 [proc] refdbs 1014 sig_send: ReleaseSemaphore succeeded
  195 27252017 [proc] refdbs 1014 sig_send: Not waiting for sigcomplete.  its_me 1 sig 20
  197 27252214 [proc] refdbs 1014 sig_send: returning 0 from sending signal 20
  196 27252410 [proc] refdbs 1014 proc_subproc: returning 0
  193 27252603 [proc] refdbs 1014 wait_subproc: looping
  216 27252819 [sig] refdbs 1014 wait_sig: awake
  208 27253027 [sig] refdbs 1014 wait_sig: processing signal 20
  195 27253222 [sig] refdbs 1014 wait_sig: Got signal 20
  192 27253414 [sig] refdbs 1014 sig_handle: signal 20
  192 27253606 [sig] refdbs 1014 sig_handle: signal 20 ignored
  192 27253798 [sig] refdbs 1014 sig_handle: returning 0
  195 27253993 [sig] refdbs 1014 proc_subproc: args: 4, 0
  193 27254186 [sig] refdbs 1014 proc_subproc: looking for processes to reap
  195 27254381 [sig] refdbs 1014 proc_subproc: finished processing terminated/stopped child
  197 27254578 [sig] refdbs 1014 proc_subproc: returning 1
  193 27254771 [sig] refdbs 1014 wait_sig: looping
---------------------------------------------------------------------

Cygwin dll 1.1.6
---------------------------------------------------------------------
  432  114845 [main] refdbs 199 select_stuff::wait: m 2, ms 15000
26255  141100 [select_socket] refdbs 199 thread_socket: stuff_start 0xA01CBC4
 2019  143119 [select_socket] refdbs 199 thread_socket: Win32 select returned 1
 1994  145113 [select_socket] refdbs 199 thread_socket: s 0xA013D18, testing fd 5 (/dev/tcp)
  658  145771 [select_socket] refdbs 199 thread_socket: write_ready
  987  146758 [main] refdbs 199 select_stuff::wait: woke up.  wait_ret 1.  verifying
 1817  148575 [main] refdbs 199 select_stuff::wait: gotone 1
  215  148790 [main] refdbs 199 select_stuff::wait: returning 0
  235  149025 [main] refdbs 199 peek_socket: considering handle 0x10
  205  149230 [main] refdbs 199 peek_socket: adding write fd_set /dev/tcp, fd 5
  198  149428 [main] refdbs 199 peek_socket: adding except fd_set /dev/tcp, fd 5
  240  149668 [main] refdbs 199 peek_socket: WINSOCK_SELECT returned 1
  284  149952 [main] refdbs 199 set_bits: me 0xA013D18, testing fd 5 (/dev/tcp)
  201  150153 [main] refdbs 199 set_bits: ready 1
  192  150345 [main] refdbs 199 select_stuff::poll: returning 1
  195  150540 [main] refdbs 199 select_stuff::~select_stuff: calling cleanup routines
  195  150735 [main] refdbs 199 socket_cleanup: si 0xA019BA0 si->thread 0x70
  196  150931 [main] refdbs 199 socket_cleanup: connection to si->exitsock 0x60
 1711  152642 [main] refdbs 199 socket_cleanup: returning
  196  152838 [main] refdbs 199 select_stuff::~select_stuff: deleting select records
  367  153205 [main] refdbs 199 _write: write (5, 0x245E984, 32)
50066  203271 [main] refdbs 199 _write: 32 = write (5, 0x245E984, 32)
  254  203525 [main] refdbs 199 cygwin_select: 6, 0x0, 0x245F93C, 0x245F934, 0x245F92C
  384  203909 [main] refdbs 199 dtable::select_write: /dev/tcp fd 5
  195  204104 [main] refdbs 199 dtable::select_except: /dev/tcp fd 5
  196  204300 [main] refdbs 199 cygwin_select: to->tv_sec 15, to->tv_usec 0, ms 15000
  197  204497 [main] refdbs 199 cygwin_select: sel.always_ready 0
  371  204868 [main] refdbs 199 start_thread_socket: Handle 0x10
  193  205061 [main] refdbs 199 start_thread_socket: Added to writefds
  194  205255 [main] refdbs 199 start_thread_socket: Added to exceptfds
  874  206129 [main] refdbs 199 start_thread_socket: exitsock 0x110
  218  206347 [main] refdbs 199 start_thread_socket: stuff_start 0x245F82C
  417  206764 [main] refdbs 199 select_stuff::wait: m 2, ms 15000
32616  239380 [select_socket] refdbs 199 thread_socket: stuff_start 0xA01CBC4
 1821  241201 [select_socket] refdbs 199 thread_socket: Win32 select returned 1
 2132  243333 [select_socket] refdbs 199 thread_socket: s 0xA011B98, testing fd 5 (/dev/tcp)
  661  243994 [select_socket] refdbs 199 thread_socket: write_ready
 1912  245906 [main] refdbs 199 select_stuff::wait: woke up.  wait_ret 1.  verifying
 1953  247859 [main] refdbs 199 select_stuff::wait: gotone 1
  217  248076 [main] refdbs 199 select_stuff::wait: returning 0
  204  248280 [main] refdbs 199 peek_socket: considering handle 0x10
  197  248477 [main] refdbs 199 peek_socket: adding write fd_set /dev/tcp, fd 5
  197  248674 [main] refdbs 199 peek_socket: adding except fd_set /dev/tcp, fd 5
  239  248913 [main] refdbs 199 peek_socket: WINSOCK_SELECT returned 1
  203  249116 [main] refdbs 199 set_bits: me 0xA011B98, testing fd 5 (/dev/tcp)
  198  249314 [main] refdbs 199 set_bits: ready 1
  192  249506 [main] refdbs 199 select_stuff::poll: returning 1
  197  249703 [main] refdbs 199 select_stuff::~select_stuff: calling cleanup routines
  272  249975 [main] refdbs 199 socket_cleanup: si 0xA019BA0 si->thread 0x38
  202  250177 [main] refdbs 199 socket_cleanup: connection to si->exitsock 0x110
 1660  251837 [main] refdbs 199 socket_cleanup: returning
  197  252034 [main] refdbs 199 select_stuff::~select_stuff: deleting select records
  368  252402 [main] refdbs 199 _write: write (5, 0xA017A18, 1)
  230  252632 [main] refdbs 199 _write: 1 = write (5, 0xA017A18, 1)
  884  253516 [main] refdbs 199 _close: close (5)
  280  253796 [main] refdbs 199 _close: 0 = close (5)
  237  254033 [main] refdbs 199 do_exit: do_exit (0)
  200  254233 [main] refdbs 199 void: 0x1 = signal (20, 0x1)
  196  254429 [main] refdbs 199 void: 0x0 = signal (1, 0x1)
  193  254622 [main] refdbs 199 void: 0x0 = signal (2, 0x1)
  195  254817 [main] refdbs 199 void: 0x0 = signal (3, 0x1)
  276  255093 [main] refdbs 199 fhandler_base::close: handle 0x14
  273  255366 [main] refdbs 199 fhandler_base::close: handle 0xB0
  297  255663 [main] refdbs 199 proc_terminate: nchildren 0, nzombies 0
  205  255868 [main] refdbs 199 proc_terminate: leaving
  192  256060 [main] refdbs 199 sigproc_terminate: entering
  199  256259 [main] refdbs 199 sigproc_terminate: done
  414  256673 [main] refdbs 199 __to_clock_t: dwHighDateTime 0, dwLowDateTime 400576
  199  256872 [main] refdbs 199 __to_clock_t: total 00000000 00000028
  200  257072 [main] refdbs 199 __to_clock_t: dwHighDateTime 0, dwLowDateTime 100144
  197  257269 [main] refdbs 199 __to_clock_t: total 00000000 0000000A
  197  257466 [main] refdbs 199 _pinfo::exit: Calling ExitProcess 0
240241 25334193 [proc] refdbs 288 proc_subproc: args: 3, 0
  269 25334462 [proc] refdbs 288 proc_subproc: pid 199[0] terminated, handle 0x130, nchildren 1, nzombies 0
  267 25334729 [proc] refdbs 288 proc_subproc: removing [0], pid 199, handle 0x130, nchildren 1
  213 25334942 [proc] refdbs 288 proc_subproc: returning 1
  199 25335141 [proc] refdbs 288 sig_send: pid 288, signal 20, its_me 1
  202 25335343 [proc] refdbs 288 sig_send: Not waiting for sigcomplete.  its_me 1 sig 20
  198 25335541 [proc] refdbs 288 sig_send: returning 0 from sending signal 20
  195 25335736 [proc] refdbs 288 wait_subproc: looping
507858 25843594 [sig] refdbs 288 wait_sig: awake
  611 25844205 [sig] refdbs 288 wait_sig: processing signal 20
  556 25844761 [sig] refdbs 288 wait_sig: Got signal 20
  884 25845645 [sig] refdbs 288 sig_handle: signal 20
  551 25846196 [sig] refdbs 288 sig_handle: signal 20 ignored
  557 25846753 [sig] refdbs 288 sig_handle: returning 0
  233 25846986 [sig] refdbs 288 proc_subproc: args: 4, 0
  196 25847182 [sig] refdbs 288 proc_subproc: looking for processes to reap
  197 25847379 [sig] refdbs 288 proc_subproc: finished processing terminated/stopped child
  199 25847578 [sig] refdbs 288 proc_subproc: returning 1
  193 25847771 [sig] refdbs 288 wait_sig: looping
3722442 29570213 [unknown (0x10F)] refdbs 288 _kill: kill (-288, 2)
  782 29570995 [unknown (0x10F)] refdbs 288 kill_pgrp: pid 288, sig 2
 2183 29573178 [unknown (0x10F)] refdbs 288 pinfo::init: execed process windows pid 105, cygwin pid 299
  855 29574033 [unknown (0x10F)] refdbs 288 kill_pgrp: killing pid 288, pgrp 288, p->ctty 1073741824, myself->ctty 1073741824
  729 29574762 [unknown (0x10F)] refdbs 288 sig_send: pid 288, signal 2, its_me 1
  697 29575459 [unknown (0x10F)] refdbs 288 sig_send: Waiting for thiscomplete 0x84
---------------------------------------------------------------------


-- 
Markus Hoenicka, PhD
UT Houston Medical School
Dept. of Integrative Biology and Pharmacology
6431 Fannin MSB4.114
Houston, TX 77030
(713) 500-6313, -7477
(713) 500-7444 (fax)
Markus.Hoenicka@uth.tmc.edu
http://ourworld.compuserve.com/homepages/hoenicka_markus/


--
Want to unsubscribe from this list?
Send a message to cygwin-unsubscribe@sourceware.cygnus.com


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]