ssh-agent doesn't die
Ken Brown
kbrown@cornell.edu
Fri Oct 4 14:27:00 GMT 2019
On 9/29/2019 4:05 PM, Ken Brown wrote:
> On 9/27/2019 10:12 AM, Ken Brown wrote:
>> On 9/27/2019 9:37 AM, Norton Allen wrote:
>>> On 9/26/2019 10:50 PM, Ken Brown wrote:
>>>>
>>>>> As a simple test example, consider:
>>>>>
>>>>> /bin/ssh-agent /bin/sleep 10
>>>>>
>>>>> While the sleep is still running, ps shows:
>>>>>
>>>>>   PID  PPID  PGID  WINPID  TTY    UID    STIME    COMMAND
>>>>>  1694  1693  1694    1576  ?      22534  00:01:10 /usr/bin/ssh-agent
>>>>>  1653     1  1653   11740  cons1  22534  00:00:37 /usr/bin/bash
>>>>>  1693  1653  1693    1552  cons1  22534  00:01:10 /usr/bin/sleep
>>>>>
>>>>> One oddity is that ssh-agent is listed as a subprocess of sleep
>>>> ...but this isn't a bug. ssh-agent forks, and then the parent execs the command.
>>>
>>> With the salient difference presumably being that the exec is done in the parent
>>> instead of the child as usual?
>>
>> Yes. The idea is that 'ssh-agent command' should be more-or-less equivalent to
>> running 'command', with ssh-agent running as a subprocess.
>>
>> The ssh-agent subprocess periodically checks to see if its parent is still
>> alive, and it exits when the parent has died. Someone should figure out why
>> this is not working on Cygwin.
>
> As an aid to someone who might want to debug this (probably Corinna when she
> returns), I've created a test program agent.c (attached) that simulates the
> relevant part of ssh-agent:
>
> 1. It forks a subprocess that periodically checks to see if its parent has died,
> and then exits.
>
> 2. The parent execs "/usr/bin/sleep 1".
>
> As with ssh-agent, the subprocess never detects that the parent has died, and so
> it never exits.
>
> Running this program under strace shows the following error in the pinfo
> constructor:
>
> pinfo::pinfo: couldn't duplicate parent rd_proc_pipe handle 0x1BC for forked
> child 1666 after exec, Win32 error 5
>
> [Win32 error 5 is ERROR_ACCESS_DENIED.]

It seems that the pinfo constructor failure happens in
cygheap_exec_info::reattach_children(). The latter is preceded by the following
comment:

  /* Reattach non-reaped subprocesses passed in from the cygwin process
     which previously operated under this pid.  FIXME: Is there a race here
     if the process exits during cygwin's exec handoff?  */

I tried running my test program under gdb with a breakpoint at
reattach_children, and the breakpoint was never hit. That gives an affirmative
answer to the question in the FIXME.

As a result, the exec'd program never becomes aware that it has a subprocess, so
it exits without resetting the subprocess's ppid to 1.

Is there someone out there familiar enough with Cygwin's exec to suggest a fix?
It would be a nice gift to Corinna to get this fixed before her return.

Ken