This is the mail archive of the cygwin-developers@cygwin.com mailing list for the Cygwin project.



Re: hang in sig_wait waiting for debug lock


On Fri, Sep 06, 2002 at 08:17:13PM +0400, egor duda wrote:
>Hi!
>
>Friday, 06 September, 2002 Christopher Faylor cgf@redhat.com wrote:
>
>>>The ChangeLog states, however, that the setclexec stuff isn't needed.
>>>Yet I can't see why we shouldn't process the protected handle list as
>>>long as we are recreating handles during the set-close-on-exec
>>>operation. Can you comment?
>
>CF> I assume that you mean this entry:
>
>CF> 2002-07-14  Christopher Faylor  <cgf@redhat.com>
>
>CF>         * dcrt0.cc (dll_crt0_1): Move debug_init call back to here.  Avoid a
>CF>         compiler warning.
>CF>         * shared.cc (memory_init): Remove debug_init call.
>CF>         * debug.h (handle_list): Change "clexec" to "inherited".
>CF>         * debug.cc: Remove a spurious declaration.
>CF>         (setclexec): Conditionalize away since it is currently unused.
>CF>         (add_handle): Use inherited field rather than clexec.
>CF>         (debug_fixup_after_fork_exec): Ditto.  Move debugging output to
>CF>         delete_handle.
>CF>         (delete_handle): Add debugging output.
>CF>         * fhandler.cc (fhandler_base::set_inheritance): Don't bother setting
>CF>         inheritance in debugging table since the handle was never protected
>CF>         anyway.
>CF>         (fhandler_base::fork_fixup): Ditto.
>
>CF> I'm at a loss to understand why adding additional things into the
>CF> protected handle table would solve a race.
>
>I thought about it again, and here's a hypothesis about what may be
>happening.
>
>I suspect that it's not exactly a race. I.e., it's caused not by
>randomness in the order in which different threads of control are
>executed, but by randomness in which handles are allocated by the OS.
>If the value of some handle allocated in one process is equal to the
>value of a handle we were dealing with in another, we may get warnings
>from add_handle.

I don't see this either.  If the table is populated with handles from
another process that don't exist in this one then that's a bug.

>system_printf is pumping data to STD_ERROR_HANDLE. That is possibly a
>pipe to the tty master. Handling data in the tty master thread is quite
>complicated, and may end up in the same add_handle(), but with the
>muto already locked. Normally that's not a big problem, since
>system_printf() will return asynchronously to the tty master and unlock
>the mutex. But here we have the second nasty random thing that may
>happen: the pipe may fill up. In this case WriteFile in
>system_printf blocks until the master drains the data from the pipe. And
>the master may be blocked because it wants to protect a handle but the
>debug muto is locked.
>
>I've noticed a special here.unlock() before debug_printf() in
>add_handle(). Could it be that it was added there for similar reasons?
>If not, then it's not clear why we should unlock the muto explicitly
>when it will be unlocked on the next line anyway, when the 'return'
>statement is executed.

Possibly.  Are you seeing system_printf output in your failing case?
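
For reference, here is a minimal sketch of the "unlock before logging" pattern discussed above. It is not the actual debug.cc code: HandleTable, Logger, and the use of std::mutex are assumptions made for illustration, and a logger that re-enters add() on the same thread stands in for the tty master thread needing the debug lock while the output pipe is being drained.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <set>
#include <string>
#include <utility>

// Hypothetical stand-in for the protected-handle table (not Cygwin's real type).
class HandleTable {
public:
    // Stands in for system_printf/debug_printf: it may block, or it may cause
    // other code to call add() before it returns.
    using Logger = std::function<void(const std::string&)>;

    explicit HandleTable(Logger log) : log_(std::move(log)) {
        assert(log_ && "a logger is required");
    }

    // Returns true if the handle was newly protected, false if it was
    // already in the table (the case that produces a warning).
    bool add(int handle) {
        std::unique_lock<std::mutex> here(lock_);
        if (!handles_.insert(handle).second) {
            // Drop the lock *before* logging. If the logger blocks until
            // someone else protects a handle, that caller can still get in.
            here.unlock();
            log_("handle already protected: " + std::to_string(handle));
            return false;
        }
        return true;  // the lock is released when 'here' goes out of scope
    }

    std::size_t size() {
        std::lock_guard<std::mutex> here(lock_);
        return handles_.size();
    }

private:
    std::mutex lock_;
    std::set<int> handles_;
    Logger log_;
};
```

If the here.unlock() call were removed, a logger that needs the table (directly, as in the test for this sketch, or indirectly via another thread that must run before the write completes) would never get the lock, which is the shape of the hang hypothesized above.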

cgf

