cygwin 3.6.0: No signals received after swapcontext() is used
Takashi Yano
takashi.yano@nifty.ne.jp
Thu Mar 13 11:42:52 GMT 2025
Hi Corinna,
On Thu, 13 Mar 2025 10:40:48 +0100
Christian Franke wrote:
> Corinna Vinschen via Cygwin wrote:
> > On Mar 12 17:06, Corinna Vinschen via Cygwin wrote:
> >> On Mar 12 16:30, Corinna Vinschen via Cygwin wrote:
> >>> On Mar 11 12:32, Christian Franke via Cygwin wrote:
> >>>> The attached testcase should test the following use cases of setcontext:
> >>>> - call from regular user space
> >>>> - call from a signal handler interrupting user space
> >>>> - call from a signal handler interrupting a system call
> >>>>
> >>>> It works as expected ... until the signal count reaches 256. Then signals
> >>>> are again only delivered from inside of a system call.
> >>>> [...]
> >>>> Interesting... Hmm... is there some 8-bit counter which overflows and then
> >>>> gets stuck at 0xff or 0x00?
> >>> It's a kind of stack overflow. Kind of, because it's not the normal
> >>> thread stack, but a special signal stack in the _cygtls area.
> >>>
> >>> When interrupting a running thread to call a signal handler, the context
> >>> of the thread is changed to restart execution in an assembler function
> >>> called sigdelayed(). The original IP of the thread is pushed on the
> >>> aforementioned signal stack. Sigdelayed() calls the signal handler. On
> >>> return it pops the original IP from the signal stack and continues the
> >>> thread.
> >>>
> >>> Now guess what happens if the signal handler bails out with longjmp or
> >>> setcontext/swapcontext.
> >>>
> >>> The signal handler never returns to the sigdelayed() function, the
> >>> original address is never popped from the signal stack, and the signal
> >>> stack has a max. size of 256 address entries...
> >>>
> >>> Theoretically, a small update to sigdelayed() would fix the issue: rather
> >>> than popping the original IP from the signal stack after calling the
> >>> handler, it should pop the IP prior to calling the handler. That would
> >>> avoid filling up the signal stack when long-jumping out of the signal
> >>> handler. It should store the IP in one of the callee-saved registers.
> >>> %r13 is unused in sigdelayed so far.
> >>>
> >>> However, even if we do this, there's still the problem that sigdelayed()
> >>> itself takes space on the stack. If you longjmp/setcontext out of the
> >>> handler, the thread's normal stack will fill up with dead storage of the
> >>> sigdelayed() function, and there's no way out of this trap. We can't
> >>> restore the stack before the handler returns.
> >>>
> >>> So either way, at one point you get a stack overflow one way or the
> >>> other.
> >>>
> >>> The signal stack overflow is actually rather harmless in comparison
> >>> to a real stack overflow.
> >>>
> >>> If you have any idea how to avoid the real stack overflow, I'd be
> >>> all ears.
> >> Looks like this isn't really a problem with setcontext. It always
> >> corrects the stack pointer as well. Apparently I haven't thought
> >> long enough about this.
> >>
> >> I have a patch for sigdelayed() in the loop, stay tuned.
> > Just pushed. Try cygwin-3.6.0-0.430.ga942476236b5 in a bit.
>
> The problem no longer occurs. Also tested with 'kill -INT PID && sleep
> 0.01' in a loop.
After the commit:

commit a942476236b5e39bf30c533d08df7392e326a4c6 (origin/master, origin/main, origin/HEAD)
Author: Corinna Vinschen <corinna@vinschen.de>
Date:   Wed Mar 12 17:17:31 2025 +0100

    Cygwin: sigdelayed: pop return address from signal stack earlier

Christian's test case, timersig.c, no longer works, even with my v3 patches.
I suspect this is because pop() and retaddr() no longer work as intended in
call_signal_handler() with this commit.
Could you please have a look?
--
Takashi Yano <takashi.yano@nifty.ne.jp>