This is the mail archive of the cygwin mailing list for the Cygwin project.


Re: ssh-agent doesn't die


On 9/29/2019 4:05 PM, Ken Brown wrote:
> On 9/27/2019 10:12 AM, Ken Brown wrote:
>> On 9/27/2019 9:37 AM, Norton Allen wrote:
>>> On 9/26/2019 10:50 PM, Ken Brown wrote:
>>>>
>>>>> As a simple test example, consider:
>>>>>
>>>>> /bin/ssh-agent /bin/sleep 10
>>>>>
>>>>> While the sleep is still running, ps shows:
>>>>>
>>>>>           PID    PPID    PGID     WINPID   TTY         UID    STIME COMMAND
>>>>>          1694    1693    1694       1576  ?          22534 00:01:10 /usr/bin/ssh-agent
>>>>>          1653       1    1653      11740  cons1      22534 00:00:37 /usr/bin/bash
>>>>>          1693    1653    1693       1552  cons1      22534 00:01:10 /usr/bin/sleep
>>>>>
>>>>> One oddity is that ssh-agent is listed as a subprocess of sleep
>>>> ...but this isn't a bug.  ssh-agent forks, and then the parent execs the command.
>>>
>>> With the salient difference presumably being that the exec is done in the parent
>>> instead of the child as usual?
>>
>> Yes.  The idea is that 'ssh-agent command' should be more-or-less equivalent to
>> running 'command', with ssh-agent running as a subprocess.
>>
>> The ssh-agent subprocess periodically checks to see if its parent is still
>> alive, and it exits when the parent has died.  Someone should figure out why
>> this is not working on Cygwin.
> 
> As an aid to someone who might want to debug this (probably Corinna when she
> returns), I've created a test program agent.c (attached) that simulates the
> relevant part of ssh-agent:
> 
> 1. It forks a subprocess that periodically checks to see if its parent has died,
> and then exits.
> 
> 2. The parent execs "/usr/bin/sleep 1".
> 
> As with ssh-agent, the subprocess never detects that the parent has died, and so
> it never exits.
> 
> Running this program under strace shows the following error in the pinfo
> constructor:
> 
> pinfo::pinfo: couldn't duplicate parent rd_proc_pipe handle 0x1BC for forked child 1666 after exec, Win32 error 5
> 
> [Win32 error 5 is ERROR_ACCESS_DENIED.]
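
The agent.c attachment is not reproduced on this archive page.  For readers
who want to experiment, here is a rough sketch of a program with the behavior
described in steps 1 and 2 of the quoted message.  It is not the attached
file: the function names, the one-second polling interval, and the use of a
changed getppid() value as the "parent has died" test are all assumptions
made for illustration.

```c
/* Hypothetical sketch only; NOT the agent.c attached to the original mail. */
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Return 1 if the process that was our parent at fork time is gone.
   Assumption: when the parent dies the child is reparented, so getppid ()
   stops returning the original parent's pid.  */
int
parent_has_died (pid_t original_parent)
{
  return getppid () != original_parent;
}

/* Child side (step 1): poll until the parent is gone, then exit.  */
void
watch_parent (pid_t original_parent, unsigned int interval_secs)
{
  while (!parent_has_died (original_parent))
    sleep (interval_secs);
  _exit (0);
}

/* Fork the watcher subprocess, then exec argv in the *parent* (step 2).
   Returns only on failure, with -1.  */
int
fork_watcher_and_exec (char *const argv[])
{
  pid_t parent = getpid ();     /* recorded before fork so the child knows it */
  pid_t child = fork ();

  if (child == -1)
    {
      perror ("fork");
      return -1;
    }
  if (child == 0)
    watch_parent (parent, 1);   /* never returns */

  execv (argv[0], argv);        /* the parent keeps its pid across exec */
  perror ("execv");
  return -1;
}
```

A main() that calls fork_watcher_and_exec with the argument vector
{ "/usr/bin/sleep", "1", NULL } completes the program.  On a system where the
death of the exec'd parent is visible to the child, the watcher should exit
shortly after sleep finishes; the symptom reported in this thread is that on
Cygwin it keeps running.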

It seems that the pinfo constructor failure happens in 
cygheap_exec_info::reattach_children().  The latter is preceded by the following 
comment:

/* Reattach non-reaped subprocesses passed in from the cygwin process
    which previously operated under this pid.  FIXME: Is there a race here
    if the process exits during cygwin's exec handoff?  */

I ran my test program under gdb with a breakpoint at reattach_children, and
the breakpoint was never hit.  That suggests the answer to the question in the
FIXME is yes: there is a race.

As a result, the exec'd program never becomes aware that it has a subprocess, so 
it exits without resetting the subprocess's ppid to 1.

Is there someone out there familiar enough with Cygwin's exec to suggest a fix? 
It would be a nice gift to Corinna to get this fixed before her return.

Ken


