This is the mail archive of the cygwin-developers@cygwin.com mailing list for the Cygwin project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: hang in sig_wait waiting for debug lock


Hi!

Saturday, 07 September, 2002 Christopher Faylor cgf@redhat.com wrote:

CF> On Fri, Sep 06, 2002 at 08:17:13PM +0400, egor duda wrote:
>>Hi!
>>
>>Friday, 06 September, 2002 Christopher Faylor cgf@redhat.com wrote:
>>
>>>>Changelog states, however, that setclexec stuff isn't needed. Yet i
>>>>can't see why we shouldn't process protected handle list as long as we
>>>>recreating handles during set-close-on-exec operation. Can you give a
>>>>comment?
>>
>>CF> I assume that you mean this entry:
>>
>>CF> 2002-07-14  Christopher Faylor  <cgf@redhat.com>
>>
>>CF>         * dcrt0.cc (dll_crt0_1): Move debug_init call back to here.  Avoid a
>>CF>         compiler warning.
>>CF>         * shared.cc (memory_init): Remove debug_init call.
>>CF>         * debug.h (handle_list): Change "clexec" to "inherited".
>>CF>         * debug.cc: Remove a spurious declaration.
>>CF>         (setclexec): Conditionalize away since it is currently unused.
>>CF>         (add_handle): Use inherited field rather than clexec.
>>CF>         (debug_fixup_after_fork_exec): Ditto.  Move debugging output to
>>CF>         delete_handle.
>>CF>         (delete_handle): Add debugging output.
>>CF>         * fhandler.cc (fhandler_base::set_inheritance): Don't bother setting
>>CF>         inheritance in debugging table since the handle was never protected
>>CF>         anyway.
>>CF>         (fhandler_base::fork_fixup): Ditto.
>>
>>CF> I'm at a loss to understand why adding additional things into the
>>CF> protected handle table would solve a race.
>>
>>I thought about it again and here's a hypothesis of what may be
>>happening.
>>
>>I suspect that it's not exactly a race. I.e., it's caused not by
>>randomness in order in which different threads of control are
>>executed, but by randomness in which handles are allocated by OS.
>>If value of some handle allocated in one process is equal to value
>>of handle we were dealing with in other, we may got warnings from
>>add_handle.

CF> I don't see this either.  If the table is populated with handles from
CF> another process that don't exist in this one then that's a bug.

It's clearly so. And enabling setclexec() removes it. I can guess,
that it's not the way things were supposed to work, right?

>>system_printf is pumping data to STD_ERROR_HANDLE. It's possibly a
>>pipe to tty master. Handling data in tty master thread is quite
>>complicated, and may possibly get to the same add_handle() but with
>>muto already locked. Normally it's not a big problem since
>>system_printf() will return asynchronously to tty master and unlock
>>the mutex. But here we have the second nasty random thing that may
>>happen: The pipe may be filled up. In this case WriteFile in
>>system_printf blocked until master drain the data from pipe. And
>>master may be blocked because it wants to protect a handle but debug
>>muto is locked.
>>
>>I've noticed special here.unlock() before debug_printf() in
>>add_handle(). Could it be that it was added there for similar reasons?
>>If not, then it's not clear why we should unlock mute explicitly when
>>it will be unlocked in the next line when 'return' statement is
>>executed?

CF> Possibly.  Are you seeing system_printf output in your failing case?

Yes, but it looks that it not full. Actually, pipe hasn't to be filled
up. After WriteFile() we call FlushFileBuffers (). Documentation on
FlushFileBuffers states that if handle is a pipe then the operation
blocks until pipe peer reads _all_ data from pipe. Commenting
FlushFileBuffers did helped in my case -- system_printf()s are not
blocking process now.

Yet i suppose the right way is to always unlock debug muto as fast as
possible, i.e., before calling debug_printf() and stuff. As far as i
understand, muto is protecting handle list, so we must hold it while
and only while we're messing with it. That should be safe.

Egor.            mailto:deo@logos-m.ru ICQ 5165414 FidoNet 2:5020/496.19


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]