The new Arthur "two queues" Jackson* implementation of signals

Christopher Faylor
Fri Aug 22 19:28:00 GMT 2003

On Fri, Aug 22, 2003 at 02:31:24PM -0400, Pierre A. Humblet wrote:
>Christopher Faylor wrote:
>> It has bugged me that Pierre couldn't figure out why I needed two signal
>> queues ( in
>> the signal processing stuff so I've been mulling over my new
>> implementation for a while.  Yesterday, I knew precisely why I needed
>> two queues.  Today, I'm not so sure.
>The signal code is very tricky, I am amazed by how many inventive ideas
>went in there.
>Let's leave the two queues for now, but there are still some features
>in my tuneup patch that have not been integrated/discussed (the
>test in set_process_mask + avoiding the InterlockedDecrement/Increment
>flip flopping). There is also the mask race you discovered.
>I also saw a crash last night and just analyzed it:
>The stack picture at label 3 when starting the handler,
>or just after the call to _set_process_mask@4 when
>terminating the handler, is as follows:
>ret addr
>old ebp        <= ebp
>6 registers
>errno          <= esp
>signal arg
>signal handler
>So in the recursion elimination code the return address must be 
>overwritten at ebp + 4 (my solution) or esp + 36 (your solution)
>However your code
>        movl    %%esp,%%ebp
>        movl    %%eax,36(%%ebp) # Restore return address
>produces an ebp different from the one in the sketch above, 
>so when you return from the eliminated recursion the stack format 
>is wrong and a stack walk crashes. This only happens when 
>there are two consecutive recursions.
>Just leave ebp alone and change the two lines above to
>        movl    %%eax,36(%%esp)  or %%eax,4(%%ebp)

I don't think it is a good idea rely on the contents of ebp even though
it probably has been correctly restored.  In your patch, you could use
ebp before because it was set prior to calling _set_process_mask and
then 32 was subtracted from it.  I eliminated that code because it
wasn't doing anything.  That was, in fact, a previous abortive attempt
to eliminate recursion, I think.

I'm not sure why clobbering ebp should have any effect on a subsequent
program since AFAICT, it should eventually be restored correctly when
the (nested) signal handler returns.  However, it does seem to cause

I've modified my test case to deal with this scenario.  It seems
to work fine with the 36 offset.  Another, more expensive, solution
is to use the value saved in sigsave.retaddr_on_stack.  That is
somewhat attractive since you don't have to calculate stack offsets.
So, it will continue to work even if we play around with stuff in
sigdelayed.  However, in the interests of speed, I'll check in the
36 offset version.

I used to have little test cases for this signal handling stuff.  I
deleted them early last year when I was cleaning out my home directory.
It was one of those "What's this 'test' directory thing?" "rm -r test"
(whistle a happy tune) "Hmm.  It's taking a long...  ARGH!" situations.

For the curious, I've included my test case below.


#include <stdio.h>
#include <sys/signal.h>

static int counter = 0;
static int killed = 0;

int killit ()
  return kill (getpid (), SIGHUP);

int ouch (int sig)
  int real_counter = counter & ~0x400000;
  int recursed = real_counter & counter;
  counter = real_counter;
  if ((++counter & 1) || (counter < 10))
    killit ();
  printf ("sig %d, counter %d%s\n", sig, counter, recursed ? ", recursed" : "");
  sig = 27;
  counter |= 0x400000;

int bye (int sig)
  printf ("sig %d, kill counter %d, counter %d\n", sig, killed, counter);
  exit (0);

main (int argc, char **argv)
  signal (SIGHUP, ouch);
  signal (SIGINT, bye);
  while (killit () == 0)
      counter &= ~0x400000;
      puts ("sent");

More information about the Cygwin-developers mailing list