Problems with native Unix domain sockets on Win 10/2019
Ken Brown
kbrown@cornell.edu
Sat Jan 30 16:00:13 GMT 2021
On 9/28/2020 7:03 AM, Michael McMahon wrote:
>
>
> On 26/09/2020 08:30, Michael McMahon via Cygwin wrote:
>>
>>
>> On 25/09/2020 21:30, Ken Brown wrote:
>>> On 9/25/2020 2:50 PM, Ken Brown via Cygwin wrote:
>>>> On 9/25/2020 10:29 AM, Michael McMahon wrote:
>>>>>
>>>>>
>>>>> On 25/09/2020 14:19, Ken Brown wrote:
>>>>>> On 9/24/2020 8:01 AM, Michael McMahon wrote:
>>>>>>>
>>>>>>>
>>>>>>> On 24/09/2020 12:26, Ken Brown wrote:
>>>>>>>> On 9/23/2020 7:25 AM, Michael McMahon via Cygwin wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> I searched for related issues but haven't found anything.
>>>>>>>>>
>>>>>>>>> I am having some trouble with Windows native Unix domain
>>>>>>>>> sockets (a recent feature in Windows 10 and 2019 server) and
>>>>>>>>> Cygwin. I think I possibly know the cause since I had to
>>>>>>>>> investigate a similar looking issue on another platform built
>>>>>>>>> on Windows.
>>>>>>>>>
>>>>>>>>> The problem is that cygwin commands don't seem to recognise
>>>>>>>>> native Unix domain sockets correctly. For example, the socket
>>>>>>>>> "foo.sock" should have the same ownership and similar
>>>>>>>>> permissions to other files in the example below:
>>>>>>>>>
>>>>>>>>> $ ls -lrt
>>>>>>>>> total 2181303
>>>>>>>>> -rw-r--r-- 1 mimcmah      None            1259 Sep 23 10:22 test.c
>>>>>>>>> -rwxr-xr-x 1 mimcmah      None            3680 Sep 23 10:22 test.obj
>>>>>>>>> -rwxr-xr-x 1 mimcmah      None          121344 Sep 23 10:22 test.exe
>>>>>>>>> -rw-r----- 1 Unknown+User Unknown+Group      0 Sep 23 10:23 foo.sock
>>>>>>>>> -rw-r--r-- 1 mimcmah      None          144356 Sep 23 10:27 check.ot
>>>>>>>>>
>>>>>>>>> A bigger problem is that foo.sock can't be deleted with the
>>>>>>>>> cygwin "rm" command.
>>>>>>>>>
>>>>>>>>> $ rm -f foo.sock
>>>>>>>>> rm: cannot remove 'foo.sock': Permission denied
>>>>>>>>>
>>>>>>>>> $ chmod 777 foo.sock
>>>>>>>>> chmod: changing permissions of 'foo.sock': Permission denied
>>>>>>>>>
>>>>>>>>> $ cmd /c del foo.sock
>>>>>>>>>
>>>>>>>>> But, native Windows commands are okay, as the third example
>>>>>>>>> shows.
>>>>>>>>>
>>>>>>>>> I think the problem may relate to the way native Unix domain
>>>>>>>>> sockets are implemented in Windows and the resulting special
>>>>>>>>> handling required. They are implemented as NTFS reparse
>>>>>>>>> points and when opening them with CreateFile, you need to
>>>>>>>>> specify the FILE_FLAG_OPEN_REPARSE_POINT flag. Otherwise, you
>>>>>>>>> get an ERROR_CANT_ACCESS_FILE. There are other complications
>>>>>>>>> unfortunately, which I'd be happy to discuss further.
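The reparse-point check described above can be sketched in C as follows. This is a minimal illustration, not code from Cygwin or from the attached test program: the function names `is_af_unix_tag` and `path_is_native_af_unix` are hypothetical, and the sketch assumes that the reparse tag of a native socket is `IO_REPARSE_TAG_AF_UNIX` (0x80000023 in the Windows SDK headers). On non-Windows builds the path check is only a stub.

```c
#include <assert.h>
#include <string.h>

/* Value of IO_REPARSE_TAG_AF_UNIX in the Windows SDK headers. */
#define TAG_AF_UNIX 0x80000023UL

/* Hypothetical helper: pure tag comparison, usable on any platform. */
int is_af_unix_tag(unsigned long tag)
{
    return tag == TAG_AF_UNIX;
}

#ifdef _WIN32
#include <windows.h>
#include <winioctl.h>

/* Hypothetical helper: returns 1 if `path` carries the AF_UNIX reparse
   tag, 0 if it does not, and -1 if the file cannot be examined. */
int path_is_native_af_unix(const char *path)
{
    unsigned char buf[MAXIMUM_REPARSE_DATA_BUFFER_SIZE];
    DWORD share = FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE;
    DWORD got = 0;
    DWORD tag = 0;
    HANDLE h;
    BOOL ok;

    assert(path != NULL);

    /* Open the reparse point itself instead of letting the system
       try to resolve it. */
    h = CreateFileA(path, FILE_READ_ATTRIBUTES, share, NULL,
                    OPEN_EXISTING, FILE_FLAG_OPEN_REPARSE_POINT, NULL);
    if (h == INVALID_HANDLE_VALUE)
        return -1;

    ok = DeviceIoControl(h, FSCTL_GET_REPARSE_POINT, NULL, 0,
                         buf, sizeof(buf), &got, NULL);
    if (!ok) {
        DWORD err = GetLastError();
        CloseHandle(h);
        /* An ordinary file has no reparse data at all. */
        return err == ERROR_NOT_A_REPARSE_POINT ? 0 : -1;
    }
    CloseHandle(h);

    /* The reparse data buffer starts with the 32-bit reparse tag. */
    if (got < sizeof(tag))
        return -1;
    memcpy(&tag, buf, sizeof(tag));
    return is_af_unix_tag(tag);
}
#else
/* Stub for non-Windows builds: reparse points do not exist here. */
int path_is_native_af_unix(const char *path)
{
    assert(path != NULL);
    return -1;
}
#endif
```

Microsoft's CreateFile documentation states that FILE_FLAG_OPEN_REPARSE_POINT is ignored when the file is not a reparse point, so the sketch passes it unconditionally. Note that the flag also changes how other reparse points, such as symbolic links, are opened.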
>>>>>>>>>
>>>>>>>>> But, to reproduce it, you can compile the attached code
>>>>>>>>> snippet which creates foo.sock in the current directory.
>>>>>>>>> Obviously, this only works on recent versions of Windows 10
>>>>>>>>> and 2019 server.
>>>>>>>>
>>>>>>>> Cygwin doesn't currently support native Windows AF_UNIX
>>>>>>>> sockets, as you've discovered. See
>>>>>>>>
>>>>>>>> https://cygwin.com/pipermail/cygwin/2020-June/245088.html
>>>>>>>>
>>>>>>>>
>>>>>>>> for the current state of AF_UNIX sockets on Cygwin, including
>>>>>>>> the possibility of using native Windows AF_UNIX sockets on
>>>>>>>> systems that support them.
>>>>>>>>
>>>>>>>> If all you want is for Cygwin to recognize such sockets and
>>>>>>>> allow you to apply rm, chmod, etc., I don't think it would be
>>>>>>>> hard to add that capability. But I doubt if that's all you
>>>>>>>> want.
>>>>>>>>
>>>>>>>> Further discussion of this will have to wait until Corinna is
>>>>>>>> available.
>>>>>>>>
>>>>>>>
>>>>>>> Thanks for the info. It's mainly about recognition of sockets
>>>>>>> for regular commands. Since these objects can exist on Windows
>>>>>>> filesystems now, potentially created by any kind of Windows
>>>>>>> application, it would be great if Cygwin could handle them,
>>>>>>> irrespective of whether the Cygwin development environment does.
>>>>>>> Though that sounds like a good idea too.
>>>>>>
>>>>>> I think this has a simple fix (attached), but I can't easily test
>>>>>> it because your test program doesn't compile for me. First, I got
>>>>>>
>>>>>> $ gcc -o native_unix_socket native_unix_socket.c
>>>>>> native_unix_socket.c:5:10: fatal error: WS2tcpip.h: No such file or directory
>>>>>>     5 | #include <WS2tcpip.h>
>>>>>>       |          ^~~~~~~~~~~~
>>>>>> compilation terminated.
>>>>>>
>>>>>> I fixed this by making the include file name lower case. (My
>>>>>> system is case sensitive, so it matters.)
>>>>>>
>>>>>> Next:
>>>>>>
>>>>>> $ gcc -o native_unix_socket native_unix_socket.c
>>>>>> native_unix_socket.c:8:10: fatal error: afunix.h: No such file or directory
>>>>>>     8 | #include <afunix.h>
>>>>>>       |          ^~~~~~~~~~
>>>>>> compilation terminated.
>>>>>>
>>>>>> There's no file afunix.h in the Cygwin distribution, but I located
>>>>>> it online and pasted in the contents. The program now compiles but
>>>>>> fails to link:
>>>>>>
>>>>>> $ gcc -o native_unix_socket native_unix_socket.c
>>>>>> /usr/lib/gcc/x86_64-pc-cygwin/10/../../../../x86_64-pc-cygwin/bin/ld:
>>>>>> /tmp/cc74urPr.o:native_unix_socket.c:(.text+0x3b): undefined
>>>>>> reference to `__imp_WSAStartup'
>>>>>> /tmp/cc74urPr.o:native_unix_socket.c:(.text+0x3b): relocation
>>>>>> truncated to fit: R_X86_64_PC32 against undefined symbol
>>>>>> `__imp_WSAStartup'
>>>>>> /usr/lib/gcc/x86_64-pc-cygwin/10/../../../../x86_64-pc-cygwin/bin/ld:
>>>>>> /tmp/cc74urPr.o:native_unix_socket.c:(.text+0xf2): undefined
>>>>>> reference to `__imp_WSAGetLastError'
>>>>>> /tmp/cc74urPr.o:native_unix_socket.c:(.text+0xf2): relocation
>>>>>> truncated to fit: R_X86_64_PC32 against undefined symbol
>>>>>> `__imp_WSAGetLastError'
>>>>>> /usr/lib/gcc/x86_64-pc-cygwin/10/../../../../x86_64-pc-cygwin/bin/ld:
>>>>>> /tmp/cc74urPr.o:native_unix_socket.c:(.text+0x13d): undefined
>>>>>> reference to `__imp_WSAGetLastError'
>>>>>> /tmp/cc74urPr.o:native_unix_socket.c:(.text+0x13d): relocation
>>>>>> truncated to fit: R_X86_64_PC32 against undefined symbol
>>>>>> `__imp_WSAGetLastError'
>>>>>> collect2: error: ld returned 1 exit status
>>>>>>
>>>>>> This is probably easy to fix too, but I don't feel like tracking it
>>>>>> down. Please send compilation instructions (that use Cygwin
>>>>>> tools).
>>>>>>
>>>>>> Ken
>>>>>
>>>>> Hi
>>>>>
>>>>> Sorry, I had compiled it in a native Visual C environment.
>>>>>
>>>>> Assuming you have afunix.h in the current directory.
>>>>>
>>>>> gcc -o native_unix_socket -I. native_unix_socket.c -lws2_32
>>>>>
>>>>> should do it.
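The attached test program is not preserved in this archive. As a rough stand-in, here is a minimal sketch of a routine that binds an AF_UNIX stream socket to a path and reads the name back with getsockname, which is what the output quoted later in the thread suggests the program does. It is a reconstruction under assumptions, not Michael's code: the function name `make_unix_socket` and the portability macros are hypothetical, and the Winsock branch assumes that afunix.h is on the include path.

```c
#include <assert.h>
#include <string.h>

#ifdef _WIN32
#include <winsock2.h>
#include <ws2tcpip.h>
#include <afunix.h>      /* may have to be supplied by hand, see above */
typedef SOCKET sock_t;
typedef int socklen_type;
#define BAD_SOCK INVALID_SOCKET
#define close_sock closesocket
#else
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>
typedef int sock_t;
typedef socklen_t socklen_type;
#define BAD_SOCK (-1)
#define close_sock close
#endif

/* Hypothetical helper: bind an AF_UNIX stream socket to `path`, then
   read the bound name back into `bound` and its length into
   `bound_len`. Returns 0 on success, -1 on failure. The socket file
   is left on disk so that ls, chmod and rm can be tried on it. */
int make_unix_socket(const char *path, struct sockaddr_un *bound,
                     int *bound_len)
{
    struct sockaddr_un addr;
    socklen_type len = sizeof(*bound);
    sock_t s;

    assert(path != NULL && bound != NULL && bound_len != NULL);
    if (strlen(path) >= sizeof(addr.sun_path))
        return -1;

#ifdef _WIN32
    /* Winsock must be initialised first; WSACleanup is omitted for
       brevity in this sketch. */
    WSADATA wsa;
    if (WSAStartup(MAKEWORD(2, 2), &wsa) != 0)
        return -1;
#endif

    s = socket(AF_UNIX, SOCK_STREAM, 0);
    if (s == BAD_SOCK)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);   /* length was checked above */

    if (bind(s, (struct sockaddr *)&addr, sizeof(addr)) != 0) {
        close_sock(s);
        return -1;
    }

    memset(bound, 0, sizeof(*bound));
    if (getsockname(s, (struct sockaddr *)bound, &len) != 0) {
        close_sock(s);
        return -1;
    }
    *bound_len = (int)len;

    close_sock(s);
    return 0;
}
```

A caller would pass "foo.sock" as the path and print `bound->sun_family`, the returned length and `bound->sun_path`. Cygwin's gcc does not normally predefine `_WIN32`, so built as-is under Cygwin the sketch would take the POSIX branch and create a Cygwin-emulated socket file rather than a native one; the Winsock branch is the one relevant to this thread and needs -lws2_32 at link time.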
>>>>
>>>> Thanks, that works. But now I can't reproduce your problem. Here's
>>>> what I see, using Cygwin 3.1.7 without applying my patch:
>>>>
>>>> $ ./native_unix_socket.exe
>>>> getsockname works fam = 1, len = 11 offsetof clen = 9 strlen = 8
>>>> name = foo.sock
>>>>
>>>> $ ls -l foo.sock
>>>> -rwxr-xr-x 1 kbrown None 0 2020-09-25 14:39 foo.sock*
>>>>
>>>> $ chmod 644 foo.sock
>>>>
>>>> $ ls -l foo.sock
>>>> -rw-r--r-- 1 kbrown None 0 2020-09-25 14:39 foo.sock
>>>>
>>>> $ rm foo.sock
>>>>
>>>> $ ls -l foo.sock
>>>> ls: cannot access 'foo.sock': No such file or directory
>>>>
>>>> I'm running 64-bit Cygwin on Windows 10 1909.
>>>
>>> I just ran the 'rm' command under gdb to see what's going on, and it
>>> seems that foo.sock is not being recognized as a reparse point. So maybe
>>> your test program, when compiled and run under Cygwin, doesn't actually
>>> produce a native Windows AF_UNIX socket. And when I try to run it in a
>>> Windows Command Prompt, I get
>>>
>>> bind failed 10050
>>> getsockname failed 10022
>>>
>>> Can you make your version of the test executable available for me to try?
>>> Or tell me some other way to create a native Windows AF_UNIX socket?
>>>
>>> Ken
>>
>> That is all very strange. I have checked both the gcc compiled and MS
>> compiled executables on my system (2019 server) and they are both
>> definitely producing native AF_UNIX sockets.
>>
>> I can email you the two exe files. They are both quite small. But, first I
>> want to check the patch status of my test system.
>>
>
> So, it turns out that this issue only happens on some of our test systems. It
> does not happen on a personal copy of Windows 10 on my laptop either.
>
> I also noticed that some native Windows commands don't work properly on any
> affected system (eg 'attrib' or 'fsutil'). Though 'fsutil' can be used to
> verify that the reparse point is created correctly.
>
> Possibly, this was a Windows bug that has been fixed. It never made sense
> that you had to open socket files using the FILE_FLAG_OPEN_REPARSE_POINT
> flag, because you would have to know in advance that the file is a socket to
> be able to do this (or else be prepared to have to open the file twice). But,
> I don't fully understand yet, why some systems are affected and others not.
> All seem to be patched up to date.
>
> In any case, I think it's clear this isn't a Cygwin issue.

It turns out that this is a Cygwin issue after all. In a private message
Michael has said:

> From what I can see, the only versions that are *not* affected by the problem
> are 1903 and 1909 (which you tested with). Versions I have tested with since
> then (2004, and 20H2) all show the problem.

I can't immediately test it myself because I'm still on 1909. But I'll send a
patch to cygwin-patches that I think should fix it, along with Michael's test
program.

Ken