This is the mail archive of the cygwin mailing list for the Cygwin project.


Re: (call-process ...) hangs in emacs

On 8/7/2014 8:51 AM, Corinna Vinschen wrote:
Hi Ken,

On Aug  7 07:51, Ken Brown wrote:
Hi Corinna,

On 8/5/2014 2:40 PM, Corinna Vinschen wrote:
I'm glad to read that, but I'm still a little bit concerned.  If your
code works with ERRORCHECK mutexes but hangs with NORMAL mutexes, you
*might* miss an error case.

I'd suggest tweaking the pthread_mutex_lock/unlock calls to log the
threads calling them.  It looks like the same thread calls malloc from
within malloc for some reason, and it might be interesting to learn how
that happens and whether it's really ok in this scenario, because the
code does not seem to expect it.

I think I found the problem with NORMAL mutexes.  emacs calls pthread_atfork
after initializing the mutexes, and the resulting 'prepare' handler locks
the mutexes.  (The parent and child handlers unlock them.)  So when emacs
calls fork, the mutexes are locked, and shortly thereafter the Cygwin DLL
calls calloc, leading to a deadlock. Here's a gdb backtrace showing the
sequence of calls:

First question:  Why does emacs use its own malloc on Cygwin rather
than the system-provided one?  Is that really necessary?

Cygwin's malloc lacks a few features that emacs requires because of the unusual way emacs is built. The most important such features (or maybe even the only ones) are malloc_set_state and malloc_get_state.

#0  malloc (size=size@entry=40) at gmalloc.c:919
#1  0x0053fc28 in calloc (nmemb=1, size=40) at gmalloc.c:1510
#2  0x61082074 in calloc (nmemb=1, size=40)
     at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/
#3  0x61003177 in operator new (s=s@entry=40)
     at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/
#4  0x610fc9d3 in pthread_mutex::init (mutex=0x61187d34 <reent_data+852>,
     attr=0x0, initializer=0x12)
     at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/
#5  0x610fcc13 in pthread_mutex_lock (mutex=0x61187d34 <reent_data+852>)
     at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/
#6  0x611319d8 in __fp_lock (ptr=0x61187cd0 <reent_data+752>)

Right, __fp_lock needs a pthread lock and since this lock hasn't been
used yet, it has to create it.  The pthread_mutex creation calls the
new operator which in turn calls calloc.

     at /usr/src/debug/cygwin-1.7.31-3/newlib/libc/stdio/findfp.c:287
#7  0x61154f75 in _fwalk (ptr=0x28d544,
     function=function@entry=0x611319c0 <__fp_lock>)
     at /usr/src/debug/cygwin-1.7.31-3/newlib/libc/stdio/fwalk.c:50
#8  0x61131dea in __fp_lock_all ()
     at /usr/src/debug/cygwin-1.7.31-3/newlib/libc/stdio/findfp.c:307
#9  0x610fa45e in pthread::atforkprepare ()
     at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/
#10 0x61076292 in lock_pthread (this=<synthetic pointer>)
     at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/sigproc.h:137
#11 hold_everything (x=<synthetic pointer>, this=<synthetic pointer>)
     at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/sigproc.h:169
#12 fork () at /usr/src/debug/cygwin-1.7.31-3/winsup/cygwin/
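
The sequence in the backtrace can be reproduced in miniature. The following is only a sketch, not the emacs or Cygwin code: alloc_mutex, lock_before_fork, unlock_after_fork, allocate_during_fork and fork_with_locked_allocator are hypothetical names, and a second atfork handler stands in for the calloc call that the Cygwin DLL makes during fork. An ERRORCHECK mutex is used so that the re-entrant lock reports an error instead of hanging, as it would with a NORMAL mutex.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose the mutex type constants under strict -std modes */
#endif
#include <errno.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pthread_mutex_t alloc_mutex;   /* hypothetical allocator lock */
static int relock_rc = -1;            /* result of the re-entrant lock attempt */

/* Handlers in the style described in the thread: 'prepare' locks,
   'parent' and 'child' unlock.  Return values are ignored in this sketch. */
static void lock_before_fork(void)  { pthread_mutex_lock(&alloc_mutex); }
static void unlock_after_fork(void) { pthread_mutex_unlock(&alloc_mutex); }

/* Stand-in for the allocation made while fork is in progress
   (the calloc reached from __fp_lock_all in the backtrace). */
static void allocate_during_fork(void)
{
    relock_rc = pthread_mutex_lock(&alloc_mutex);
    if (relock_rc == 0)
        pthread_mutex_unlock(&alloc_mutex);
}

/* Call once.  Returns the result of the re-entrant lock attempt,
   or -1 if the setup or the fork itself failed. */
int fork_with_locked_allocator(void)
{
    pthread_mutexattr_t attr;
    pid_t pid;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    pthread_mutex_init(&alloc_mutex, &attr);
    pthread_mutexattr_destroy(&attr);

    /* 'prepare' handlers run in reverse order of registration, so the
       stand-in is registered first and runs while the lock is held. */
    if (pthread_atfork(allocate_during_fork, NULL, NULL) != 0)
        return -1;
    if (pthread_atfork(lock_before_fork, unlock_after_fork,
                       unlock_after_fork) != 0)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(0);             /* child has nothing else to do */
    waitpid(pid, NULL, 0);
    return relock_rc;
}
```

On a POSIX system the function should return EDEADLK, which corresponds to the point at which a NORMAL mutex would hang.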

Is there a better way to deal with this issue than using ERRORCHECK mutexes?

Did you check if you get an error from pthread_mutex_lock on the
second invocation of malloc?  Is it EDEADLK?  If so, you can
ignore the error, but if you want to go ahead without adding lots
of error checking you might be better off using a RECURSIVE mutex.

I didn't check the error, but it seemed clear from the code that that was what was happening. Yes, using a RECURSIVE mutex sounds like a good idea. Or maybe it would be just as good to remove the call to pthread_atfork. See my reply to Eric later in the thread.
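
For reference, the difference between the two mutex types discussed here can be checked with a few lines of C. This is a minimal sketch using only standard pthread calls; relock_result is a hypothetical helper name and not part of emacs or Cygwin.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* expose the mutex type constants under strict -std modes */
#endif
#include <errno.h>
#include <pthread.h>

/* Lock a mutex of the given type twice from the same thread and return
   the result of the second pthread_mutex_lock call.
   Do not call this with PTHREAD_MUTEX_NORMAL: the second lock would
   deadlock, which is the hang discussed in this thread. */
int relock_result(int type)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t m;
    int rc, unlocks;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, type);
    pthread_mutex_init(&m, &attr);
    pthread_mutexattr_destroy(&attr);

    pthread_mutex_lock(&m);
    rc = pthread_mutex_lock(&m);      /* second lock, same thread */

    /* Unlock once per successful lock before destroying the mutex. */
    unlocks = (rc == 0) ? 2 : 1;
    while (unlocks-- > 0)
        pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return rc;
}
```

With PTHREAD_MUTEX_ERRORCHECK the second lock fails with EDEADLK, so every caller has to be prepared to handle that error; with PTHREAD_MUTEX_RECURSIVE it succeeds and the mutex has to be unlocked twice.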

