Sv: Sv: Sv: Sv: Sv: Sv: Sv: Named pipes and multiple writers

Ken Brown kbrown@cornell.edu
Thu Apr 2 02:19:29 GMT 2020


On 4/1/2020 2:34 PM, Ken Brown via Cygwin wrote:
> On 4/1/2020 1:14 PM, sten.kristian.ivarsson@gmail.com wrote:
>>> On 4/1/2020 4:52 AM, sten.kristian.ivarsson@gmail.com wrote:
>>>>> On 3/31/2020 5:10 PM, sten.kristian.ivarsson@gmail.com wrote:
>>>>>>> On 3/28/2020 10:19 PM, Ken Brown via Cygwin wrote:
>>>>>>>> On 3/28/2020 11:43 AM, Ken Brown via Cygwin wrote:
>>>>>>>>> On 3/28/2020 8:10 AM, sten.kristian.ivarsson@gmail.com wrote:
>>>>>>>>>>> On 3/27/2020 10:53 AM, sten.kristian.ivarsson@gmail.com wrote:
>>>>>>>>>>>>> On 3/26/2020 7:19 PM, Ken Brown via Cygwin wrote:
>>>>>>>>>>>>>> On 3/26/2020 6:39 PM, Ken Brown via Cygwin wrote:
>>>>>>>>>>>>>>> On 3/26/2020 6:01 PM, sten.kristian.ivarsson@gmail.com wrote:
>>>>>>>>>>>>>>>> The ENXIO occurs when parallel child processes
>>>>>>>>>>>>>>>> simultaneously open the descriptor using O_NONBLOCK.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This is consistent with my guess that the error is
>>>>>>>>>>>>>>> generated by fhandler_fifo::wait.  I have a feeling that
>>>>>>>>>>>>>>> read_ready should have been created as a manual-reset
>>>>>>>>>>>>>>> event, and that more care is needed to make sure it's set
>>>>>>>>>>>>>>> when it should be.
>>
>> [snip]
>>
>>>>>>>> Never mind.  I was able to reproduce the problem and find the cause.
>>>>>>>> What happens is that when the first subprocess exits,
>>>>>>>> fhandler_fifo::close resets read_ready.  That causes the second
>>>>>>>> and subsequent subprocesses to think that there's no reader open,
>>>>>>>> so their attempts to open a writer with O_NONBLOCK fail with ENXIO.
>>
>> [snip]
>>
>>>> I wrote in a previous mail in this thread that it seemed to work fine
>>>> for me as well, but when I bump up the number of writers and/or the
>>>> number of messages (e.g. 25/25) it starts to fail again.
>>
>> [snip]
>>
>>> Yes, it is a resource issue.  There is a limit on the number of writers
>>> that can be open at one time, currently 64.  I chose that number
>>> arbitrarily, with no idea what might actually be needed in practice, and
>>> it can easily be changed.
>>
>> Does there have to be a limit at all?  We would rather let the application
>> decide how many resources it wants to use.  In our particular case there
>> will be a process manager with an incoming pipe that possibly several
>> thousand processes will write to.
> 
> I agree.
> 
>> Just for fiddling around (to figure out whether this limit is what makes
>> other things behave a bit oddly), where is this limit of 64 defined now?
> 
> It's MAX_CLIENTS, defined in fhandler.h.  But there seem to be other resource 
> issues also; simply increasing MAX_CLIENTS doesn't solve the problem.  I think 
> there are also problems with the number of threads, for example.  Each time your 
> program forks, the subprocess inherits the rfd file descriptor and its 
> "fifo_reader_thread" starts up.  This is unnecessary for your application, so I 
> tried disabling it (in fhandler_fifo::fixup_after_fork), just as an experiment.
> 
> But then I ran into some deadlocks, suggesting that one of the locks I'm using 
> isn't robust enough.  So I've got a lot of things to work on.
> 
>>> In addition, a writer isn't recognized as closed until a reader tries to
>>> read and gets an error.  In your example with 25/25, the list of writers
>>> quickly gets to 64 before the parent ever tries to read.
>>
>> That explains the behaviour, but shouldn't some error be returned from
>> open/write (maybe it is and I'm missing it)?
> 
> The error is discovered in add_client_handler, called from thread_func.  I think 
> you'll only see it if you run the program under strace.  I'll see if I can find 
> a way to report it.  Currently, there's a retry loop in fhandler_fifo::open when 
> a writer tries to open, and I think I need to limit the number of retries and 
> then error out.
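
For illustration only, here is a minimal sketch of the "limit the retries and
then error out" idea as seen from the application side.  It is not the code in
fhandler_fifo::open; the helper name open_writer_with_retry and its parameters
are hypothetical.  It relies only on the POSIX rule that opening a FIFO with
O_WRONLY | O_NONBLOCK fails with ENXIO when no reader has it open.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical application-side helper, not Cygwin's internal code.
   Tries to open a FIFO for writing without blocking.  ENXIO means no
   reader has the FIFO open, so sleep briefly and retry, but only a
   bounded number of times.  Returns a descriptor, or -1 with errno
   set by the last failed open(). */
int open_writer_with_retry(const char *path, int max_retries, long delay_ms)
{
    for (int attempt = 0; ; attempt++) {
        int fd = open(path, O_WRONLY | O_NONBLOCK);
        if (fd >= 0)
            return fd;

        /* Give up on any other error, or once the retry budget is spent. */
        if (errno != ENXIO || attempt >= max_retries)
            return -1;

        struct timespec delay = { delay_ms / 1000,
                                  (delay_ms % 1000) * 1000000L };
        nanosleep(&delay, NULL);
    }
}
```

A caller might use open_writer_with_retry(path, 10, 50) and treat a final
ENXIO as "no reader showed up in time" rather than looping forever.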

I pushed a few improvements and bug fixes, and your 25/25 example now runs 
without a problem.  I increased MAX_CLIENTS to 1024 just for the sake of this 
example, but I'll work on letting the number of writers increase dynamically as 
needed.
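
For anyone who wants to try something similar, a test of this shape can be
sketched roughly as follows.  This is not the original test program from the
thread; the function name run_fifo_writers and the 8-byte record size are
assumptions made for the sketch.  It also assumes the FIFO already exists and
that all the data fits in the pipe buffer, since the parent reads only after
every writer has exited.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum { RECORD_SIZE = 8 };   /* assumed record size, well below PIPE_BUF */

/* Hypothetical harness: the parent opens the FIFO for reading, forks
   `writers` children that each open it with O_WRONLY | O_NONBLOCK and
   write `messages` records, then counts the records that arrived.
   Returns the record count, or -1 if anything failed. */
long run_fifo_writers(const char *path, int writers, int messages)
{
    int rfd = open(path, O_RDONLY | O_NONBLOCK);
    if (rfd < 0)
        return -1;

    int failed = 0, started = 0;
    for (int w = 0; w < writers; w++) {
        pid_t pid = fork();
        if (pid < 0) {
            failed = 1;
            break;
        }
        if (pid == 0) {
            /* Child: a reader exists, so this open should not give ENXIO. */
            int wfd = open(path, O_WRONLY | O_NONBLOCK);
            if (wfd < 0)
                _exit(1);
            char record[RECORD_SIZE];
            memset(record, 'A' + w % 26, sizeof record);
            for (int m = 0; m < messages; m++)
                if (write(wfd, record, sizeof record) != (ssize_t) sizeof record)
                    _exit(2);
            close(wfd);
            _exit(0);
        }
        started++;
    }

    /* Reap every child that was started and note any failure. */
    for (int w = 0; w < started; w++) {
        int status;
        if (wait(&status) < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failed = 1;
    }

    /* All writers are closed now, so read() returns 0 once drained. */
    long bytes = 0;
    char buf[4096];
    ssize_t n;
    while ((n = read(rfd, buf, sizeof buf)) > 0)
        bytes += n;
    close(rfd);

    return (failed || n < 0) ? -1 : bytes / RECORD_SIZE;
}
```

After creating the FIFO with mkfifo, calling run_fifo_writers(path, 25, 25)
corresponds to the 25/25 case; a return value of -1 means some writer failed
to open or write.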

Ken

