This is the mail archive of the cygwin mailing list for the Cygwin project.
Re: ssh-agent doesn't die
- From: Ken Brown <kbrown at cornell dot edu>
- To: "cygwin at cygwin dot com" <cygwin at cygwin dot com>
- Date: Fri, 4 Oct 2019 14:27:00 +0000
- Subject: Re: ssh-agent doesn't die
- References: <319e26c0-32f0-40b7-2137-c7de170a3486@rl.ac.uk> <5b225738-c2d7-fbfe-48a7-8c2a38c3398c@cornell.edu> <0ccd17b6-c22a-1a18-9409-1ebcfae60868@huarp.harvard.edu> <9f7a14bb-f81b-c566-bb84-8da7fc6d0fad@cornell.edu> <185c5774-dd8b-5488-b818-4cec5a24bf2d@cornell.edu>
On 9/29/2019 4:05 PM, Ken Brown wrote:
> On 9/27/2019 10:12 AM, Ken Brown wrote:
>> On 9/27/2019 9:37 AM, Norton Allen wrote:
>>> On 9/26/2019 10:50 PM, Ken Brown wrote:
>>>>
>>>>> As a simple test example, consider:
>>>>>
>>>>> /bin/ssh-agent /bin/sleep 10
>>>>>
>>>>> While the sleep is still running, ps shows:
>>>>>
>>>>>   PID  PPID  PGID  WINPID  TTY    UID    STIME     COMMAND
>>>>>  1694  1693  1694    1576  ?      22534  00:01:10  /usr/bin/ssh-agent
>>>>>  1653     1  1653   11740  cons1  22534  00:00:37  /usr/bin/bash
>>>>>  1693  1653  1693    1552  cons1  22534  00:01:10  /usr/bin/sleep
>>>>>
>>>>> One oddity is that ssh-agent is listed as a subprocess of sleep
>>>> ...but this isn't a bug. ssh-agent forks, and then the parent execs the command.
>>>
>>> With the salient difference presumably being that the exec is done in the parent
>>> instead of the child as usual?
>>
>> Yes. The idea is that 'ssh-agent command' should be more-or-less equivalent to
>> running 'command', with ssh-agent running as a subprocess.
>>
>> The ssh-agent subprocess periodically checks to see if its parent is still
>> alive, and it exits when the parent has died. Someone should figure out why
>> this is not working on Cygwin.
>
> As an aid to someone who might want to debug this (probably Corinna when she
> returns), I've created a test program agent.c (attached) that simulates the
> relevant part of ssh-agent:
>
> 1. It forks a subprocess that periodically checks to see if its parent has died,
> and then exits.
>
> 2. The parent execs "/usr/bin/sleep 1".
>
> As with ssh-agent, the subprocess never detects that the parent has died, and so
> it never exits.
>
> Running this program under strace shows the following error in the pinfo
> constructor:
>
> pinfo::pinfo: couldn't duplicate parent rd_proc_pipe handle 0x1BC for forked
> child 1666 after exec, Win32 error 5
>
> [Win32 error 5 is ERROR_ACCESS_DENIED.]

It seems that the pinfo constructor failure happens in
cygheap_exec_info::reattach_children(). The latter is preceded by the following
comment:

  /* Reattach non-reaped subprocesses passed in from the cygwin process
     which previously operated under this pid.  FIXME: Is there a race here
     if the process exits during cygwin's exec handoff? */

I tried running my test program under gdb with a breakpoint at
reattach_children, and the breakpoint was never hit. That gives an affirmative
answer to the question in the FIXME.

As a result, the exec'd program never becomes aware that it has a subprocess, so
it exits without resetting the subprocess's ppid to 1.

Is there someone out there familiar enough with Cygwin's exec to suggest a fix?
It would be a nice gift to Corinna to get this fixed before her return.

Ken