(call-process ...) hangs in emacs
Thu Aug 7 18:54:00 GMT 2014
On 8/7/2014 8:51 AM, Corinna Vinschen wrote:
> Hi Ken,
> On Aug 7 07:51, Ken Brown wrote:
>> Hi Corinna,
>> On 8/5/2014 2:40 PM, Corinna Vinschen wrote:
>>> I'm glad to read that, but I'm still a little bit concerned. If your
>>> code works with ERRORCHECK mutexes but hangs with NORMAL mutexes, you
>>> *might* miss an error case.
>>> I'd suggest to tweak the pthread_mutex_lock/unlock calls and log the
>>> threads calling it. It looks like the same thread calls malloc from
>>> malloc for some reason and it might be interesting to learn how that
>>> happens and if it's really ok in this scenario, because it seems to
>>> be unexpected by the code.
>> I think I found the problem with NORMAL mutexes. emacs calls pthread_atfork
>> after initializing the mutexes, and the resulting 'prepare' handler locks
>> the mutexes. (The parent and child handlers unlock them.) So when emacs
>> calls fork, the mutexes are locked, and shortly thereafter the Cygwin DLL
>> calls calloc, leading to a deadlock. Here's a gdb backtrace showing the
>> sequence of calls:
> First question: Why does emacs use its own malloc on Cygwin rather
> than the system-provided one? Is that really necessary?
Cygwin's malloc lacks a few features that emacs requires because of the
unusual way emacs is built. The most important such features (or maybe
even the only ones) are malloc_set_state and malloc_get_state.
>> #0 malloc (size=size@entry=40) at gmalloc.c:919
>> #1 0x0053fc28 in calloc (nmemb=1, size=40) at gmalloc.c:1510
>> #2 0x61082074 in calloc (nmemb=1, size=40)
>> at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/malloc_wrapper.cc:100
>> #3 0x61003177 in operator new (s=s@entry=40)
>> at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/cxx.cc:23
>> #4 0x610fc9d3 in pthread_mutex::init (mutex=0x61187d34 <reent_data+852>,
>> attr=0x0, initializer=0x12)
>> at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/thread.cc:3118
>> #5 0x610fcc13 in pthread_mutex_lock (mutex=0x61187d34 <reent_data+852>)
>> at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/thread.cc:3170
>> #6 0x611319d8 in __fp_lock (ptr=0x61187cd0 <reent_data+752>)
>> at /usr/src/debug/cygwin-1.7.31-3/newlib/libc/stdio/findfp.c:287
> Right, __fp_lock needs a pthread lock and since this lock hasn't been
> used yet, it has to create it. The pthread_mutex creation calls the
> new operator which in turn calls calloc.
>> #7 0x61154f75 in _fwalk (ptr=0x28d544,
>> function=function@entry=0x611319c0 <__fp_lock>)
>> at /usr/src/debug/cygwin-1.7.31-3/newlib/libc/stdio/fwalk.c:50
>> #8 0x61131dea in __fp_lock_all ()
>> at /usr/src/debug/cygwin-1.7.31-3/newlib/libc/stdio/findfp.c:307
>> #9 0x610fa45e in pthread::atforkprepare ()
>> at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/thread.cc:2031
>> #10 0x61076292 in lock_pthread (this=<synthetic pointer>)
>> at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/sigproc.h:137
>> #11 hold_everything (x=<synthetic pointer>, this=<synthetic pointer>)
>> at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/sigproc.h:169
>> #12 fork () at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/fork.cc:582
>> Is there a better way to deal with this issue than using ERRORCHECK mutexes?
> Did you check if you get an error from pthread_mutex_lock on the
> second invocation of malloc? Is it EDEADLK? If so, you can
> ignore the error, but if you want to go ahead without adding lots
> of error checking you might be better off using a RECURSIVE mutex.
I didn't check the error, but it seemed clear from the code that this
is what was happening. Yes, using a RECURSIVE mutex sounds like a good
idea. Or maybe it would be just as good to remove the call to
pthread_atfork. See my reply to Eric later in the thread.
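For reference, here is a minimal sketch of how the mutex types discussed above differ when one thread locks the same mutex twice, as happens when the 'prepare' handler holds the lock and malloc is then entered again. The helper name relock_result is hypothetical and the code is not taken from gmalloc.c or the Cygwin DLL; it only assumes the standard POSIX mutex types. A NORMAL mutex is deliberately left out, since relocking it from the same thread would simply hang.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: lock a mutex of the given type twice from the
   same thread and return the result of the second lock attempt.  */
static int
relock_result (int type)
{
  pthread_mutexattr_t attr;
  pthread_mutex_t m;
  int first, second;

  pthread_mutexattr_init (&attr);
  pthread_mutexattr_settype (&attr, type);
  pthread_mutex_init (&m, &attr);
  pthread_mutexattr_destroy (&attr);

  /* First lock: stands in for the atfork 'prepare' handler.  */
  first = pthread_mutex_lock (&m);
  assert (first == 0);

  /* Second lock: stands in for the nested malloc/calloc call.  */
  second = pthread_mutex_lock (&m);

  /* A RECURSIVE mutex must be unlocked once per successful lock.  */
  if (second == 0)
    pthread_mutex_unlock (&m);
  pthread_mutex_unlock (&m);
  pthread_mutex_destroy (&m);
  return second;
}
```

Calling relock_result (PTHREAD_MUTEX_ERRORCHECK) should return EDEADLK, which the caller would have to check for and ignore, while relock_result (PTHREAD_MUTEX_RECURSIVE) should return 0 with no extra error handling.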