Deadlock of the process tree when running make
Takashi Yano
takashi.yano@nifty.ne.jp
Wed Apr 13 17:22:06 GMT 2022
On Wed, 13 Apr 2022 19:48:04 +0300
Alexey Izbyshev wrote:
> On 2022-04-11 13:10, Alexey Izbyshev wrote:
> > On 2022-04-11 11:35, Takashi Yano wrote:
> >> On Sun, 10 Apr 2022 23:49:29 +0300
> >> A countermeasure version is available at the following location:
> >> https://tyan0.yr32.net/cygwin/x86/test/cygwin1-20220411.dll.xz
> >> https://tyan0.yr32.net/cygwin/x86_64/test/cygwin1-20220411.dll.xz
> >>
> >> Could you please test? To keep the hanging tree, please install
> >> cygwin in another directory, and replace cygwin1.dll with the
> >> countermeasure version.
> >>
> > Thank you for providing the binaries! I've started testing in a
> > separate cygwin installation on the same machine, as you suggested.
> > The hang previously took many hours to reproduce, so I'll keep tests
> > running for a while and then report back.
> >
> The good news is that the tests have been running for two days so far
> without any cygwin-related issues, so the patched version doesn't seem
> to introduce new issues.
>
> The bad news is my theory about the suspicious "Unnamed file:
> \FileSystem\Npfs" in the hanging bash.exe being a leak seems to be
> wrong. I've closed that handle, but conhost.exe hasn't unblocked. All of
> its threads are doing the same things as before:
>
> 1. Tries to enter a critical section. (Task Manager claims it waits for
> thread 4, so probably the latter owns it).
> 2. ReadFile("pty1-from-master-nat" named pipe)
> 3. Waits for an anonymous event.
> 4. Waits on a handle for "\Device\ConDrv" (in DeviceIoControl()).
> 5. Blocked in GetMessageW().
>
> I've created a model situation with bash.exe stopped at a breakpoint in
> ClosePseudoConsole() on another machine again, and it seems that the
> last time I missed that bash.exe contains *two* handles for (different)
> "Unnamed file: \FileSystem\Npfs" here too, so it seems to be normal.
>
> What's probably not normal is the behavior of the hanging conhost.exe.
> I've compared the points where conhost.exe is blocked, and all but one
> of the threads in the model case are doing the same things as in the hanging
> case, but the remaining thread is blocked in
> ReadFile("\Device\NamedPipe\") (i.e. the read end of "hWritePipe" of
> pcon) instead of trying to enter a critical section like thread 1 above.
> So now I'm starting to doubt that it's a cygwin bug and not some
> conhost.exe bug.
>
> I'll try to poke around the hanging conhost.exe some more, and may
> also try to create a faster reproducer.
Thanks for testing.
The question is:
Has the issue been reproduced with the new cygwin1.dll? Or is it still
running without the issue so far?
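
For reference, the comparison of blocked conhost.exe threads described in
the quoted message can be written down as a small sketch. The wait states
are copied from the report; the dictionary layout and the name
diff_thread_states are hypothetical, and mapping the differing model-case
thread to number 1 is an assumption made only for illustration.

```python
# Wait states of the conhost.exe threads as listed in the report.
# Keys follow the numbering of the list in the report, not real thread IDs.
HANGING = {
    1: "enter critical section",
    2: 'ReadFile("pty1-from-master-nat" named pipe)',
    3: "wait for anonymous event",
    4: r'wait on "\Device\ConDrv" handle (DeviceIoControl)',
    5: "GetMessageW()",
}

# Model case (bash.exe stopped in ClosePseudoConsole()): per the report it
# matches the hanging case except for one thread. Placing that thread at
# key 1 is an assumption for illustration.
MODEL = dict(HANGING)
MODEL[1] = r'ReadFile("\Device\NamedPipe\")'


def diff_thread_states(hanging, model):
    """Return {thread: (hanging_state, model_state)} for threads that differ."""
    return {
        t: (hanging[t], model.get(t))
        for t in sorted(hanging)
        if hanging[t] != model.get(t)
    }


if __name__ == "__main__":
    for thread, (h, m) in diff_thread_states(HANGING, MODEL).items():
        print(f"thread {thread}: hanging={h} | model={m}")
```

Only the thread waiting on the critical section shows up in the diff, which
is the one worth inspecting in the hanging conhost.exe.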
--
Takashi Yano <takashi.yano@nifty.ne.jp>