(call-process ...) hangs in emacs

Ken Brown kbrown@cornell.edu
Mon Aug 25 19:00:00 GMT 2014


On 8/18/2014 8:28 AM, Ken Brown wrote:
> On 8/8/2014 9:26 AM, Ken Brown wrote:
>> On 8/7/2014 5:42 PM, Eric Blake wrote:
>>> On 08/07/2014 12:53 PM, Ken Brown wrote:
>>>> On 8/7/2014 11:30 AM, Eric Blake wrote:
>>>>> On 08/07/2014 05:51 AM, Ken Brown wrote:
>>>>>>
>>>>>> I think I found the problem with NORMAL mutexes.  emacs calls
>>>>>> pthread_atfork after initializing the mutexes, and the resulting
>>>>>> 'prepare' handler locks the mutexes.  (The parent and child handlers
>>>>>> unlock them.)  So when emacs calls fork, the mutexes are locked, and
>>>>>> shortly thereafter the Cygwin DLL calls calloc, leading to a
>>>>>> deadlock.
>>>>>> Here's a gdb backtrace showing the sequence of calls:
>>>>>
>>>>> Arguably, that's an upstream bug in emacs.  POSIX has declared
>>>>> pthread_atfork to be fundamentally useless; it is broken by design,
>>>>> because you cannot use it for anything that is not async-signal-safe
>>>>> without risking deadlock.  And (except for sem_post()), NONE of the
>>>>> standardized locking functions are async-signal-safe.
>>>>>
>>>>> http://austingroupbugs.net/view.php?id=858
>>>>>
>>>>> That said, it would still be nice to support this, since even though
>>>>> the theory says it is broken, there are still lots of (broken)
>>>>> programs/libraries still trying to use it.
>>>>
>>>> So what do you think emacs should do instead of using pthread_atfork?
>>>> Or is it better to just remove it?  I don't know how likely it is that
>>>> this would cause a problem.
>>>
>>> The POSIX recommendation is that multithreaded apps limit themselves
>>> solely to async-signal-safe functions in the window between fork and
>>> exec (or to use posix_spawn instead of fork/exec).  I don't know what
>>> emacs is trying to do in that window, but at this point, it's certainly
>>> worth reporting it upstream.  If you need a pointer to the full list of
>>> async-signal-safe functions:
>>>
>>> http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_04
>>>
>>> and search for "The following table defines a set of functions that
>>> shall be async-signal-safe."
>>>
>>> The most common deadlocks when violating async-signal-safety rules look
>>> like this in single-threaded programs:
>>>
>>> function calls malloc()
>>>    malloc() grabs a non-recursive mutex
>>>      async signal arrives
>>>        signal handler called
>>>          signal handler calls malloc()
>>>            malloc() can't grab the mutex - deadlock
>>>
>>> and this counterpart in multithreaded programs:
>>>
>>> thread1 calls malloc()
>>>    malloc() grabs a non-recursive mutex
>>> thread 2 gains control and calls fork()
>>>    because of the fork, thread1 no longer exists to release the lock
>>>    child process calls malloc()
>>>      malloc() tries to grab mutex, but it is locked with no thread to
>>>      release it
>>>
>>> Switching malloc() to a recursive lock may or may not "solve" the
>>> single-threaded deadlock (in that malloc can now obtain the mutex), but
>>> it is probably NOT what you want to happen (unless malloc is fully
>>> re-entrant, the inner instance will see incomplete data and either be
>>> totally clobbered itself, or else totally clobber the outer instance
>>> when it returns).  So it's GOOD that malloc does NOT use a recursive
>>> mutex by default.
>>>
>>> In the multithreaded case, you are flat out hosed. Switching to a
>>> recursive lock does not change the picture - you are still deadlocked
>>> waiting on thread1 to release the lock, but thread1 doesn't exist.
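
For anyone following along, the posix_spawn alternative mentioned above
can be sketched roughly as follows.  This is only an illustration, not
anything taken from emacs: the helper name run_child is made up, and it
assumes the program to run can be found on PATH.  The point is that no
application code runs in the child before the exec, so nothing there can
block on a lock that another thread held at the time of the fork.

```c
#include <spawn.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper: run argv[0] (looked up on PATH), wait for it,
   and return its exit status, or -1 on any failure. */
int run_child(char *const argv[])
{
    pid_t pid;
    int status;

    /* posix_spawnp returns an error number directly instead of setting
       errno.  NULL file actions and attributes mean "use the defaults". */
    int err = posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ);
    if (err != 0) {
        fprintf(stderr, "posix_spawnp: %s\n", strerror(err));
        return -1;
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;

    /* Report the exit code only for a normal exit. */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A caller would build a NULL-terminated argv such as { "true", NULL } and
check the return value of run_child(argv).
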
>>
>> Thanks for the explanations, Eric.  I've filed an emacs bug report:
>>
>>    http://debbugs.gnu.org/cgi/bugreport.cgi?bug=18222
>
> I've just made a new emacs test release that includes a workaround for
> this bug.  I think I see a way to make emacs use Cygwin's malloc; if
> this works, it will provide a better fix for the bug.
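
In case it helps readers picture the original problem, here is a minimal
sketch of the pthread_atfork pattern described at the top of the quoted
thread.  It is not the actual emacs code: alloc_lock and all function
names are hypothetical stand-ins, and the "window" function merely runs
the prepare handler by hand to show the state that exists inside fork.
It probes with pthread_mutex_trylock so that it reports the held lock
rather than hanging the way a real calloc call would.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for one of the allocator's NORMAL mutexes. */
static pthread_mutex_t alloc_lock = PTHREAD_MUTEX_INITIALIZER;

/* 'prepare' handler: runs in the parent just before the fork happens. */
static void before_fork(void) { pthread_mutex_lock(&alloc_lock); }

/* 'parent' and 'child' handlers: run after the fork, on each side. */
static void after_fork(void) { pthread_mutex_unlock(&alloc_lock); }

/* Register the handlers; returns 0 on success. */
int install_fork_handlers(void)
{
    return pthread_atfork(before_fork, after_fork, after_fork);
}

/* Return 1 if alloc_lock is currently held, 0 if it is free.
   Uses trylock, so it never blocks. */
int alloc_lock_is_busy(void)
{
    int rc = pthread_mutex_trylock(&alloc_lock);
    if (rc == 0) {
        pthread_mutex_unlock(&alloc_lock);
        return 0;
    }
    return rc == EBUSY;
}

/* Simulate the window between the prepare handler and the parent/child
   handlers.  Anything in that window that does a blocking lock on
   alloc_lock would wait forever. */
int busy_inside_fork_window(void)
{
    before_fork();
    int busy = alloc_lock_is_busy();
    after_fork();
    return busy;
}
```

With a non-recursive mutex, busy_inside_fork_window() should report the
lock as held, which is the state the allocator is in when something
inside fork tries to allocate.
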

It looks like my idea is going to work, but it needs testing to make 
sure I've implemented it correctly.  If anyone is willing to test it, 
you can download emacs-24.3.93-2 from my personal Cygwin repository:

   http://sanibeltranquility.com/cygwin/

Instructions can be found at that URL.

Ken

--
Problem reports:       http://cygwin.com/problems.html
FAQ:                   http://cygwin.com/faq/
Documentation:         http://cygwin.com/docs.html
Unsubscribe info:      http://cygwin.com/ml/#unsubscribe-simple


