SIGKILL may no longer work after many SIGCONT/SIGSTOP signals
Takashi Yano
takashi.yano@nifty.ne.jp
Wed Nov 20 13:43:08 GMT 2024
On Tue, 19 Nov 2024 18:21:52 +0900
Takashi Yano wrote:
> On Tue, 12 Nov 2024 10:53:58 +0100
> Christian Franke wrote:
> > Found with 'stress-ng --cpu-sched' from current stress-ng upstream HEAD:
> >
> > Testcase (attached):
> >
> > $ gcc -O2 -o manysignals manysignals.c
> >
> > $ ./manysignals
> > fork() = 1833
> > ...
> > fork() = 1848
> > ...
> > kill(1833, 17)
> > ...
> > kill(1848, 17)
> > kill(1833, 9)
> > ...
> > kill(1848, 9)
> > waitpid(1833, ., 0)
> >
> >
> > Run this in second terminal:
> >
> > $ watch "ps | sed -n '1p;/manysignals/{/sed/d;p}'"
> >
> > If 'S' appears in the first column, the child processes have likely
> > reached the final SIGSTOP state. This takes some time. The parent
> > process may still hang in the first waitpid() but should not.
> >
> > If the parent process is aborted with ^C, child processes may be stopped
> > or left behind. Occasionally a child process that cannot be stopped by
> > Cygwin (kill -9) is left behind.
> >
> > Tested with ancient (i7-2600K) and more recent (i7-14700K) CPU :-)
> >
> >
> > Unrelated to the above, but related to 'stress-ng --cpu-sched' which
> > uses sched_get/setscheduler():
> >
> > - sched_getscheduler() always returns SCHED_FIFO. As far as I understand
> > Linux sched(7), this is a non-preemptive real-time policy. The
> > preemptive SCHED_RR would possibly be a more reasonable value.
> > Unfortunately SCHED_OTHER cannot be used because it would require
> > ignoring the priority.
> >
> > - sched_setscheduler() always fails with ENOSYS. IMO it should allow
> > setting 'param->sched_priority' if 'policy' is equal to the value
> > returned by sched_getscheduler().
>
> Thanks for the report and the test case. I'm now looking into
> the issue. Please wait a while.
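
The attached manysignals.c is not reproduced in this message. For readers
without the attachment, here is a minimal sketch of a comparable test. It is
an assumption based only on the output quoted above, not the original file:
the function name run_manysignals(), the MAX_CHILDREN limit, the round count,
and the order of SIGCONT/SIGSTOP within a round are all hypothetical. The
child count of 16 is inferred from the PIDs 1833..1848 in the output.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CHILDREN 64   /* hypothetical upper bound for this sketch */

/* Fork nchildren idle children, send each of them `rounds` SIGCONT/SIGSTOP
   pairs, leave them stopped, then SIGKILL and reap them.
   Returns the number of children reaped with "killed by SIGKILL" status,
   or -1 on a bad argument or fork() failure. */
int run_manysignals(int nchildren, int rounds)
{
    pid_t pids[MAX_CHILDREN];
    int killed = 0;

    if (nchildren < 1 || nchildren > MAX_CHILDREN)
        return -1;

    for (int i = 0; i < nchildren; i++) {
        fflush(stdout);            /* do not duplicate buffered output */
        pids[i] = fork();
        if (pids[i] < 0) {
            /* Clean up the children created so far. */
            for (int j = 0; j < i; j++) {
                kill(pids[j], SIGKILL);
                waitpid(pids[j], NULL, 0);
            }
            return -1;
        }
        if (pids[i] == 0)
            for (;;)
                pause();           /* child: idle until killed */
        printf("fork() = %d\n", (int)pids[i]);
    }

    /* Many SIGCONT/SIGSTOP signals; each round ends with SIGSTOP,
       so the children are left in the stopped state. */
    for (int r = 0; r < rounds; r++) {
        for (int i = 0; i < nchildren; i++) {
            kill(pids[i], SIGCONT);
            kill(pids[i], SIGSTOP);
        }
    }

    for (int i = 0; i < nchildren; i++) {
        printf("kill(%d, %d)\n", (int)pids[i], SIGKILL);
        kill(pids[i], SIGKILL);
    }

    for (int i = 0; i < nchildren; i++) {
        int status = 0;
        printf("waitpid(%d, ., 0)\n", (int)pids[i]);
        fflush(stdout);
        /* On an affected system the first waitpid() may hang here. */
        if (waitpid(pids[i], &status, 0) == pids[i]
            && WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL)
            killed++;
    }
    return killed;
}

#ifdef MANYSIGNALS_STANDALONE
int main(void)
{
    int n = run_manysignals(16, 10000);   /* counts are assumptions */
    printf("%d children killed\n", n);
    return n == 16 ? 0 : 1;
}
#endif
```

To build it as a standalone program, compile with
-DMANYSIGNALS_STANDALONE. On a system without the problem every waitpid()
returns promptly and the function returns nchildren.
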

Hopefully, I have found the cause.

A deadlock occurs between the main thread and the wait_sig thread.
The main thread is waiting for the wait_sig thread to trigger the
wakeup event, while the wait_sig thread is waiting for the previous
signal to be processed by the main thread.

Let me consider how to fix that.
--
Takashi Yano <takashi.yano@nifty.ne.jp>