Problems with native Unix domain sockets on Win 10/2019

Ken Brown kbrown@cornell.edu
Sat Jan 30 16:00:13 GMT 2021


On 9/28/2020 7:03 AM, Michael McMahon wrote:
> 
> 
> On 26/09/2020 08:30, Michael McMahon via Cygwin wrote:
>> 
>> 
>> On 25/09/2020 21:30, Ken Brown wrote:
>>> On 9/25/2020 2:50 PM, Ken Brown via Cygwin wrote:
>>>> On 9/25/2020 10:29 AM, Michael McMahon wrote:
>>>>> 
>>>>> 
>>>>> On 25/09/2020 14:19, Ken Brown wrote:
>>>>>> On 9/24/2020 8:01 AM, Michael McMahon wrote:
>>>>>>> 
>>>>>>> 
>>>>>>> On 24/09/2020 12:26, Ken Brown wrote:
>>>>>>>> On 9/23/2020 7:25 AM, Michael McMahon via Cygwin wrote:
>>>>>>>>> Hi,
>>>>>>>>> 
>>>>>>>>> I searched for related issues but haven't found anything.
>>>>>>>>> 
>>>>>>>>> I am having some trouble with Windows native Unix domain
>>>>>>>>> sockets (a recent feature in Windows 10 and 2019 server) and
>>>>>>>>> Cygwin. I think I possibly know the cause since I had to
>>>>>>>>> investigate a similar looking issue on another platform built
>>>>>>>>> on Windows.
>>>>>>>>> 
>>>>>>>>> The problem is that cygwin commands don't seem to recognise
>>>>>>>>> native Unix domain sockets correctly. For example, the socket
>>>>>>>>> "foo.sock" should have the same ownership and similar
>>>>>>>>> permissions to other files in the example below:
>>>>>>>>> 
>>>>>>>>> $ ls -lrt
>>>>>>>>> total 2181303
>>>>>>>>> -rw-r--r--  1 mimcmah      None            1259 Sep 23 10:22 test.c
>>>>>>>>> -rwxr-xr-x  1 mimcmah      None            3680 Sep 23 10:22 test.obj
>>>>>>>>> -rwxr-xr-x  1 mimcmah      None          121344 Sep 23 10:22 test.exe
>>>>>>>>> -rw-r-----  1 Unknown+User Unknown+Group      0 Sep 23 10:23 foo.sock
>>>>>>>>> -rw-r--r--  1 mimcmah      None          144356 Sep 23 10:27 check.ot
>>>>>>>>> 
>>>>>>>>> A bigger problem is that foo.sock can't be deleted with the
>>>>>>>>> cygwin "rm" command.
>>>>>>>>> 
>>>>>>>>> $ rm -f foo.sock
>>>>>>>>> rm: cannot remove 'foo.sock': Permission denied
>>>>>>>>> 
>>>>>>>>> $ chmod 777 foo.sock
>>>>>>>>> chmod: changing permissions of 'foo.sock': Permission denied
>>>>>>>>> 
>>>>>>>>> $ cmd /c del foo.sock
>>>>>>>>> 
>>>>>>>>> But, native Windows commands are okay, as the third example
>>>>>>>>> shows.
>>>>>>>>> 
>>>>>>>>> I think the problem may relate to the way native Unix domain
>>>>>>>>> sockets are implemented in Windows and the resulting special
>>>>>>>>> handling required. They are implemented as NTFS reparse
>>>>>>>>> points and when opening them with CreateFile, you need to
>>>>>>>>> specify the FILE_FLAG_OPEN_REPARSE_POINT flag. Otherwise, you
>>>>>>>>> get an ERROR_CANT_ACCESS_FILE. There are other complications
>>>>>>>>> unfortunately, which I'd be happy to discuss further.
>>>>>>>>> 
>>>>>>>>> But, to reproduce it, you can compile the attached code
>>>>>>>>> snippet which creates foo.sock in the current directory.
>>>>>>>>> Obviously, this only works on recent versions of Windows 10
>>>>>>>>> and 2019 server.
>>>>>>>> 
>>>>>>>> Cygwin doesn't currently support native Windows AF_UNIX
>>>>>>>> sockets, as you've discovered.  See
>>>>>>>> 
>>>>>>>> https://urldefense.com/v3/__https://cygwin.com/pipermail/cygwin/2020-June/245088.html__;!!GqivPVa7Brio!P7lIFI4rYAtWh8_DtCbRCxT-M_E4vwQ0qwzQ0p656T73BpJ0jbUkLI_bXdA6mmSL9lJcSQ$
>>>>>>>> 
>>>>>>>> 
>>>>>>>> for the current state of AF_UNIX sockets on Cygwin, including
>>>>>>>> the possibility of using native Windows AF_UNIX sockets on
>>>>>>>> systems that support them.
>>>>>>>> 
>>>>>>>> If all you want is for Cygwin to recognize such sockets and
>>>>>>>> allow you to apply rm, chmod, etc., I don't think it would be
>>>>>>>> hard to add that capability.  But I doubt if that's all you
>>>>>>>> want.
>>>>>>>> 
>>>>>>>> Further discussion of this will have to wait until Corinna is
>>>>>>>> available.
>>>>>>>> 
>>>>>>> 
>>>>>>> Thanks for the info. It's mainly about recognition of sockets
>>>>>>> for regular commands. Since these objects can exist on Windows
>>>>>>> filesystems now, potentially created by any kind of Windows
>>>>>>> application, it would be great if Cygwin could handle them,
>>>>>>> irrespective of whether the Cygwin development environment does.
>>>>>>> Though that sounds like a good idea too.
>>>>>> 
>>>>>> I think this has a simple fix (attached), but I can't easily test
>>>>>> it because your test program doesn't compile for me.  First, I got
>>>>>> 
>>>>>> $ gcc -o native_unix_socket native_unix_socket.c
>>>>>> native_unix_socket.c:5:10: fatal error: WS2tcpip.h: No such file or directory
>>>>>>     5 | #include <WS2tcpip.h>
>>>>>>       |          ^~~~~~~~~~~~
>>>>>> compilation terminated.
>>>>>> 
>>>>>> I fixed this by making the include file name lower case.  (My
>>>>>> system is case sensitive, so it matters.)
>>>>>> 
>>>>>> Next:
>>>>>> 
>>>>>> $ gcc -o native_unix_socket native_unix_socket.c
>>>>>> native_unix_socket.c:8:10: fatal error: afunix.h: No such file or directory
>>>>>>     8 | #include <afunix.h>
>>>>>>       |          ^~~~~~~~~~
>>>>>> compilation terminated.
>>>>>> 
>>>>>> There's no file afunix.h in the Cygwin distribution, but I located
>>>>>> it online and pasted in the contents.  The program now compiles but
>>>>>> fails to link:
>>>>>> 
>>>>>> $ gcc -o native_unix_socket native_unix_socket.c
>>>>>> /usr/lib/gcc/x86_64-pc-cygwin/10/../../../../x86_64-pc-cygwin/bin/ld: /tmp/cc74urPr.o:native_unix_socket.c:(.text+0x3b): undefined reference to `__imp_WSAStartup'
>>>>>> /tmp/cc74urPr.o:native_unix_socket.c:(.text+0x3b): relocation truncated to fit: R_X86_64_PC32 against undefined symbol `__imp_WSAStartup'
>>>>>> /usr/lib/gcc/x86_64-pc-cygwin/10/../../../../x86_64-pc-cygwin/bin/ld: /tmp/cc74urPr.o:native_unix_socket.c:(.text+0xf2): undefined reference to `__imp_WSAGetLastError'
>>>>>> /tmp/cc74urPr.o:native_unix_socket.c:(.text+0xf2): relocation truncated to fit: R_X86_64_PC32 against undefined symbol `__imp_WSAGetLastError'
>>>>>> /usr/lib/gcc/x86_64-pc-cygwin/10/../../../../x86_64-pc-cygwin/bin/ld: /tmp/cc74urPr.o:native_unix_socket.c:(.text+0x13d): undefined reference to `__imp_WSAGetLastError'
>>>>>> /tmp/cc74urPr.o:native_unix_socket.c:(.text+0x13d): relocation truncated to fit: R_X86_64_PC32 against undefined symbol `__imp_WSAGetLastError'
>>>>>> collect2: error: ld returned 1 exit status
>>>>>> 
>>>>>> This is probably easy to fix too, but I don't feel like tracking it
>>>>>> down. Please send compilation instructions (that use Cygwin
>>>>>> tools).
>>>>>> 
>>>>>> Ken
>>>>> 
>>>>> Hi
>>>>> 
>>>>> Sorry, I had compiled it in a native Visual C environment.
>>>>> 
>>>>> Assuming you have afunix.h in the current directory.
>>>>> 
>>>>> gcc -o native_unix_socket -I. native_unix_socket.c -lws2_32
>>>>> 
>>>>> should do it.
>>>> 
>>>> Thanks, that works.  But now I can't reproduce your problem.  Here's
>>>> what I see, using Cygwin 3.1.7 without applying my patch:
>>>> 
>>>> $ ./native_unix_socket.exe
>>>> getsockname works fam = 1, len = 11 offsetof clen = 9 strlen = 8 name = foo.sock
>>>> 
>>>> $ ls -l foo.sock
>>>> -rwxr-xr-x 1 kbrown None 0 2020-09-25 14:39 foo.sock*
>>>> 
>>>> $ chmod 644 foo.sock
>>>> 
>>>> $ ls -l foo.sock
>>>> -rw-r--r-- 1 kbrown None 0 2020-09-25 14:39 foo.sock
>>>> 
>>>> $ rm foo.sock
>>>> 
>>>> $ ls -l foo.sock
>>>> ls: cannot access 'foo.sock': No such file or directory
>>>> 
>>>> I'm running 64-bit Cygwin on Windows 10 1909.
>>> 
>>> I just ran the 'rm' command under gdb to see what's going on, and it
>>> seems that foo.sock is not being recognized as a reparse point.  So maybe
>>> your test program, when compiled and run under Cygwin, doesn't actually
>>> produce a native Windows AF_UNIX socket.  And when I try to run it in a
>>> Windows Command Prompt, I get
>>> 
>>> bind failed 10050
>>> getsockname failed 10022
>>> 
>>> Can you make your version of the test executable available for me to try?
>>> Or tell me some other way to create a native Windows AF_UNIX socket?
>>> 
>>> Ken
>> 
>> That is all very strange. I have checked both the gcc compiled and MS 
>> compiled executables on my system (2019 server) and they are both 
>> definitely producing native AF_UNIX sockets.
>> 
>> I can email you the two exe files. They are both quite small. But, first I
>> want to check the patch status of my test system.
>> 
> 
> So, it turns out that this issue only happens on some of our test systems. It
> does not happen on a personal copy of Windows 10 on my laptop either.
> 
> I also noticed that some native Windows commands (e.g. 'attrib' or 'fsutil')
> don't work properly on any affected system, though 'fsutil' can still be used
> to verify that the reparse point is created correctly.
> 
> Possibly, this was a Windows bug that has been fixed. It never made sense
> that you had to open socket files using the FILE_FLAG_OPEN_REPARSE_POINT 
> flag, because you would have to know in advance that the file is a socket to
> be able to do this (or else be prepared to have to open the file twice). But,
> I don't fully understand yet, why some systems are affected and others not. 
> All seem to be patched up to date.
> 
> In any case, I think it's clear this isn't a Cygwin issue.

It turns out that this is a Cygwin issue after all.  In a private message
Michael has said:

> From what I can see, the only versions that are *not* affected by the problem
> are 1903 and 1909 (which you tested with).  Versions I have tested with since
> then (2004, and 20H2) all show the problem.

I can't immediately test it myself because I'm still on 1909.  But I'll send a 
patch to cygwin-patches that I think should fix it, along with Michael's test 
program.

Ken
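
For reference, here is a minimal sketch of how reparse data can be identified
as belonging to a native AF_UNIX socket. It is not the patch mentioned above.
It assumes the bytes were obtained on Windows via
DeviceIoControl(FSCTL_GET_REPARSE_POINT) on a handle opened with
FILE_FLAG_OPEN_REPARSE_POINT, and that they start with the usual
REPARSE_DATA_BUFFER header (32-bit tag, 16-bit data length, 16-bit reserved).
The helper names reparse_tag and is_native_af_unix are hypothetical; the tag
value is IO_REPARSE_TAG_AF_UNIX from the Windows SDK headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value of IO_REPARSE_TAG_AF_UNIX in the Windows SDK headers. */
#define REPARSE_TAG_AF_UNIX 0x80000023u

/* Hypothetical helper: read the 32-bit little-endian ReparseTag at the
   start of a REPARSE_DATA_BUFFER.  Returns 0 if the buffer is too short
   to hold the 8-byte header (tag, data length, reserved). */
uint32_t reparse_tag(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    if (len < 8)
        return 0;
    return (uint32_t)buf[0]
         | ((uint32_t)buf[1] << 8)
         | ((uint32_t)buf[2] << 16)
         | ((uint32_t)buf[3] << 24);
}

/* Hypothetical helper: nonzero if the reparse data describes a native
   Windows AF_UNIX socket. */
int is_native_af_unix(const unsigned char *buf, size_t len)
{
    return reparse_tag(buf, len) == REPARSE_TAG_AF_UNIX;
}
```

A caller would pass the buffer filled in by the FSCTL call together with the
number of bytes returned. Anything with a different tag (symlinks, mount
points, and so on) is left to the existing reparse-point handling.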

