Protect fork() against dll- and exe-updates.

Corinna Vinschen corinna-cygwin@cygwin.com
Tue Dec 8 21:48:00 GMT 2015


Hi Michael,


sorry for the delay, but I have some other duties besides Cygwin and the
W10 1511 problem has thrown me back even more, so I only now find the
time to review your patch.

Subterfuge aside, I have a few questions and comments, see below.
Plesae note that I didn't apply the patch so far but just inspected it.

On Nov 16 19:07, Michael Haubenwallner wrote:
> On 11/16/2015 11:22 AM, Corinna Vinschen wrote:
> > I'm not yet clear on how this is going to handle binaries
> > installed to another drive...
> 
> For now, when NtSetInformationFile (FileLinkInformation) does fail
> for whatever reason (I cannot think of anything else than another
> drive), this file is not hardlinked and will be used from its original
> location. Must admit I haven't tested this scenario.
> 
> >> either as hardlink, copy, or symlink(?).
> 
> However, I can imagine two different cross-drive approaches:
> * Either create some cygfork-directory on each drive to hold the
>   hardlinks (still requires NTFS), and put symlinks into /var/run/
> * or do a file-copy using the pre-opened handle into /var/run/
>   (should work even with FAT, but would be slow).

Indeed.  Let's stick to the DLLs from the distro which are not supposed
to be on another drive.  /var/run and hardlinks should do.  This
functionality won't be supported on FAT and I don't see a reason to.

> BTW: MSDN document for FILE_INTERNAL_INFORMATION structure tells about
> IndexNumber as the member - and so does /usr/include/w32api/, but
> winsup/cygwin/ntdll.h does name it FileId - caused me some confusion...
> Is that intentionally?

Fixed now.

First off, before I dive into the patch:

Would you mind terribly to split this patch into a patchset so we
get a set of smaller patches as far as it makes sense?  I'm a bit
reluctant to apply such a big patch in one go.  I did this myself
a lot back in the pre-git CVS times, but the longer I work with git
the more I appreciate patches split into sets of smaller patches.
They are easier to review and *much* easier to handle when bisecting
the code in case of searching for a bug.

Another thing occured to me:  Doesn't using this stuff break the output
of /proc/<PID>/maps?

> +/* Use this buffer under loader lock conditions only. */
> +PWCHAR
> +dll::nt_max_path_buf ()
> +{
> +  static WCHAR NO_COPY buf[NT_MAX_PATH];
> +  return buf;
> +}

Shouldn't buf (with different name then perhaps) be a private static
member of dll_list instead?  dll::nt_max_path_buf() could become an
inline function then.

> +/* Return *nameptr probably prefixed with "\\??\\", but
> +   update *nameptr to stay without the "\\??\\" prefix. */
> +PWCHAR
> +dll::to_ntname (PWCHAR *nameptr, WCHAR ntnamebuf[NT_MAX_PATH])

Why does dll::to_ntname need a second parameter if it's always called
on the buffer returned by dll::nt_max_path_buf?

> +{
> +  /* avoid using path_conv here: cygheap might not be
> +     initialized when started from non-cygwin process,
> +     or still might be frozen in_forkee */
> +  PWCHAR ntname = *nameptr;
> +  if ((*nameptr)[1] == L':')

What if Cygwin has been installed on a network drive with no drive
letter attached?  In that case you'd have to deal with UNC paths,
but they are ignored here.

> +    {
> +      ntname = ntnamebuf;
> +      memmove (ntname + 4, *nameptr,
> +	sizeof (*nameptr) * min (NT_MAX_PATH - 4, wcslen (*nameptr) + 1));
> +      wcsncpy (ntname, L"\\??\\", 4);
> +      ntname[NT_MAX_PATH - 1] = L'\0';
> +      *nameptr = ntname + 4;
> +    }
> +  else
> +  if (!wcsncmp (*nameptr, L"\\??\\", 4))

Small style nit:  if else in a single line, please.

> +    *nameptr += 4;
> +  return ntname;
> +}
> + [...]
> +HANDLE
> +dll::ntopenfile (PWCHAR ntname, NTSTATUS *pstatus, ULONG openopts)
> +{
> +  NTSTATUS status;
> +  if (!pstatus)
> +    pstatus = &status;
> +
> +  UNICODE_STRING fn;
> +  RtlInitUnicodeString (&fn, ntname);
> +
> +  OBJECT_ATTRIBUTES oa;
> +  InitializeObjectAttributes (&oa, &fn, 0, NULL, NULL);
> +
> +  ACCESS_MASK access = FILE_READ_ATTRIBUTES;
> +  if (openopts & FILE_DELETE_ON_CLOSE)
> +    access |= DELETE;
> +  if (openopts & FILE_DIRECTORY_FILE)
> +    access |= FILE_LIST_DIRECTORY;
> +
> +  access |= SYNCHRONIZE;
> +  openopts |= FILE_SYNCHRONOUS_IO_NONALERT;
> +
> +  HANDLE fh = NULL;
> +  ULONG share = FILE_SHARE_VALID_FLAGS;
> +  IO_STATUS_BLOCK iosb;
> +  *pstatus = NtOpenFile (&fh, access, &oa, &iosb, share, openopts);
> +  debug_printf ("%y = NtOpenFile (%p, a %xh, sh %xh, o %xh, io %y, '%W')",
> +      *pstatus, fh, access, share, openopts, iosb.Status, fn.Buffer);
> +
> +  return fh;

Am I too paranoid?  AFAICS, NtOpenFile doesn't give a guarantee that
the handle stays unchanged if the function fails.  Wouldn't be something
like 

     return NT_SUCCESS (*pstatus) ? fh : NULL;

be safer here?

>    else
>      {
> +      size_t forknamesize = forkable_namesize (type, name, modname);
> +
>        /* FIXME: Change this to new at some point. */
> -      d = (dll *) cmalloc (HEAP_2_DLL, sizeof (*d) + (namelen * sizeof (*name)));
> +      d = (dll *) cmalloc (HEAP_2_DLL, sizeof (*d) +
> +	    ((namelen + 1 + forknamesize) * sizeof (*name)));

Minor style:

      d = (dll *) cmalloc (HEAP_2_DLL, sizeof (*d)
			   + ((namelen + 1 + forknamesize) * sizeof (*name)));

> @@ -236,13 +326,24 @@ dll_list::alloc (HINSTANCE h, per_process *p, dll_type type)
>        wcscpy (d->name, name);
>        d->modname = d->name + (modname - name);
>        d->handle = h;
> -      d->has_dtors = true;
> +      d->has_dtors = type > DLL_SELF;

A small comment at this point could be helpful in the long run.

>        d->p = p;
>        d->ndeps = 0;
>        d->deps = NULL;
>        d->image_size = ((pefile*)h)->optional_hdr ()->SizeOfImage;
>        d->preferred_base = (void*) ((pefile*)h)->optional_hdr()->ImageBase;
>        d->type = type;
> +      d->forkable_name = d->name + namelen + 1;
> +      *d->forkable_name = L'\0';
> +      d->fbi.FileAttributes = INVALID_FILE_ATTRIBUTES;
> +      LARGE_INTEGER zero = { 0 };
> +      d->fii.FileId = zero;

Oops, sorry, FileId -> NameIndex :}

> +      PWCHAR ntname = dll::to_ntname (&name, dll::nt_max_path_buf ());
> +      NTSTATUS status;
> +      d->fhandle = dll::ntopenfile (ntname, &status);
> +      if (!d->fhandle)
> +	system_printf ("Unable (ntstatus %y) to open file %W",
> +		       status, ntname);
>        append (d);
>        if (type == DLL_LOAD)
>  	loaded_dlls++;
> @@ -396,7 +497,7 @@ dll_list::detach (void *retaddr)
>    if (!myself || in_forkee)
>      return;
>    guard (true);
> -  if ((d = find (retaddr)))
> +  if ((d = find (retaddr)) && d->type != DLL_SELF)

dll_list::find is only ever used to find dlls with d->type != DLL_SELF.
Wouldn't it make sense to move this check into the method?

> diff --git a/winsup/cygwin/dll_init.h b/winsup/cygwin/dll_init.h
> index 8127d0b..12344de 100644
> --- a/winsup/cygwin/dll_init.h
> +++ b/winsup/cygwin/dll_init.h
> @@ -43,6 +43,7 @@ struct per_module
>  typedef enum
>  {
>    DLL_NONE,
> +  DLL_SELF, /* main-program.exe, cygwin1.dll */
>    DLL_LINK,
>    DLL_LOAD,
>    DLL_ANY
> @@ -61,7 +62,13 @@ struct dll
>    DWORD image_size;
>    void* preferred_base;
>    PWCHAR modname;
> -  WCHAR name[1];
> +  /* forkable { */

Please drop these bracketing comments, especially the braces in them.
If you want to emphasize them, just use an introductory comment and
an empty line at the end of the block.

> diff --git a/winsup/cygwin/fork.cc b/winsup/cygwin/fork.cc
> index 084f8f0..dd0f9a6 100644
> --- a/winsup/cygwin/fork.cc
> +++ b/winsup/cygwin/fork.cc
> @@ -356,8 +351,19 @@ frok::parent (volatile char * volatile stack_here)
>  
>    while (1)
>      {
> +      dlls.request_forkables ();

I'm aware that hold_everything has been called but, is that safe?
request_forkables calls update_forkables_needs which in turn uses the
static dll::nt_max_path_buf buffer...

> +/* Mangle full srcpath as one single filename into targetbuf,
> +   optionally telling the last path separator to allow for
> +   restoring the file's basename.
> +   Return value is the number of characters mangled. */
> +static int
> +mangle_as_filename (PWCHAR targetbuf, PCWCHAR srcpath, PWCHAR *lastpathsep)
> +{
> +  PWCHAR target = targetbuf;
> +  if (lastpathsep)
> +    *lastpathsep = NULL;
> +
> +  for (; *srcpath; ++srcpath)
> +    switch (*srcpath)
> +      {
> +	case L'\\':
> +	  if (lastpathsep)
> +	    *lastpathsep = target;
> +	  *(target++) = L',';
> +	  break;
> +	case L'?':
> +	case L':':
> +	  *(target++) = L'_';

This parens are not required.  Please remove them.

> +	  break;
> +	default:
> +	  *(target++) = *srcpath;
> +	  break;
> +      }
> +  return target - targetbuf;
> +}

So what happens with a path where the parent dir path is > NAME_MAX?
If I didn't miss something, this won't work for very long paths.

> +/* Create the lastsepcount directories found in dirname, where
> +   counting is done along path separators (including trailing ones).
> +   Returns true when these directories exist afterwards, false otherways.
> +   The dirname is used for the path-splitting. */
> +static
> +bool mkdirs (PWCHAR dirname, SECURITY_ATTRIBUTES *psec, int lastsepcount)
> +{
> +  bool success = true;
> +  int i = lastsepcount;
> +  for (--i; i > 0; --i)
> +    {
> +      PWCHAR lastsep = wcsrchr (dirname, L'\\');
> +      if (!lastsep)
> +        break;
> +      *lastsep = L'\0';
> +    }
> +
> +  for (++i; i <= lastsepcount; ++i)
> +    {
> +      if (success && (i == 0 || wcslen (wcsrchr (dirname, L'\\')) > 1))
> +	{
> +	  BOOL ret = CreateDirectoryW (dirname, psec);

NtCreateFile?

> +	  if (!ret && GetLastError () != ERROR_ALREADY_EXISTS)
> +	    success = false;
> +	  debug_printf ("%d = CreateDirectoryW (%W) %E", ret, dirname);
> +	}
> +      if (i < lastsepcount)
> +	dirname[wcslen (dirname)] = L'\\'; /* restore original value */
> +    }
> +  return success;
> +}
> [...]
> +/* Nominate the hardlink to an individual DLL inside dirx_name,
> +   that ends with the path separator (hence the "x" varname).
> +   With NULL as dirx_name, never nominate the hardlink any more.
> +   With "" as dirx_name, denominate the hardlink. */
> +void
> +dll::nominate_forkable (PCWCHAR dirx_name)
> +{
> +  if (!dirx_name)
> +    {
> +      debug_printf ("type %d disable %W", type, name);
> +      forkable_name = NULL; /* never create a hardlink for this dll */
> +    }
> +
> +  if (!forkable_name)
> +    return;
> +
> +  wcscpy (forkable_name, dirx_name);

Suggestion: Use

     PWCHAR p = wcpcpy (forkable_name, dirx_name);
     
You can use p later on...

> +
> +  if (!*forkable_name)
> +    return; /* denominate */
> +
> +  if (type < DLL_LOAD)
> +    wcscat (forkable_name, modname);

...here as in

       wcpcpy (p, modname);

> +  else
> +    {
> +      /* Avoid lots of extra directories for loaded dll's:
> +       * mangle full path into one single directory name,
> +       * just keep original filename intact. The original
> +       * filename is necessary to serve as linked
> +       * dependencies of dynamically loaded dlls. */
> +      PWCHAR lastpathsep;
> +      mangle_as_filename (forkable_name + wcslen (forkable_name), name,
> +			  &lastpathsep);

...and here as in

         mangle_as_filename (p, name, &lastpathsep);

This avoids calling wcslen twice.

> +      if (lastpathsep)
> +	*lastpathsep = L'\\';
> +    }
> +}
> +
> +/* Create the nominated hardlink for one indivitual dll,
> +   inside another subdirectory when dynamically loaded. */
> +bool
> +dll::create_forkable ()
> +{
> +  if (!forkable_name || !*forkable_name)
> +    return true; /* disabled */
> +
> +  if (!fhandle)
> +    return false;
> +
> +  PWCHAR last = NULL;
> +  bool success = true;
> +  if (type >= DLL_LOAD)
> +    {
> +      last = wcsrchr (forkable_name, L'\\');
> +      if (!last)
> +        return false;
> +      *last = L'\0';
> +      success = mkdirs (forkable_name, &sec_none_nih, 1);

What's the uppermost directory which will be created here?  Is it safe to
use sec_none_nih for all of them?

> +/* Into buf if not NULL, write the ntname of cygwin installation_root.
> +   Return the number of characters (that would be) written. */
> +int
> +dll_list::rootname (PWCHAR buf)
> +{
> +  PWCHAR cygroot = cygheap->installation_root;
> +  if (!buf)
> +    return wcslen (cygroot);
> +
> +  dll::to_ntname (&cygroot, dll::nt_max_path_buf ());
> +  wcscpy (buf, cygroot);
> +  return wcslen (buf);

     return wcpcpy (buf, cygroot) - buf;

Alternatively we could add a cygheap->installation_root_len member.

> +}
> +
> +/* Into buf if not NULL, write the string representation of current user sid.
> +   Return the number of characters (that would be) written. */
> +int
> +dll_list::sidname (PWCHAR buf)
> +{
> +  if (!buf)
> +    return 128;
> +
> +  UNICODE_STRING sid;
> +  WCHAR sidbuf[128+1];
> +  RtlInitEmptyUnicodeString (&sid, sidbuf, sizeof (sidbuf));
> +  RtlConvertSidToUnicodeString (&sid, cygheap->user.sid (), FALSE);
> +  wcscpy (buf, sid.Buffer);
> +  return wcslen (buf);

See above.

> +int
> +dll_list::ctimename (PWCHAR buf)
> +{
> +  if (!buf)
> +    return 16;
> +
> +  LARGE_INTEGER zero = { 0 };
> +  LARGE_INTEGER newest = { 0 };
> +  /* Need by-handle-file-information for _all_ loaded dlls,
> +     as most recent ctime forms the hardlinks directory. */
> +  dll *d = &dlls.start;
> +  while ((d = d->next))
> +    {
> +      if (!d->fhandle)
> +        continue;
> +      if (d->fbi.FileAttributes == INVALID_FILE_ATTRIBUTES)
> +	{
> +	  /* Relevant FILE_BASIC_INFORMATION members are
> +	     valid in child when cygheap-copied from parent,
> +	     so caching them is fine.
> +	     The FILE_BASIC_INFORMATION queried here also is
> +	     used in dll_list::update_forkables_needs below. */
> +	  NTSTATUS status;
> +	  IO_STATUS_BLOCK iosb;
> +	  status = NtQueryInformationFile (d->fhandle, &iosb,
> +					   &d->fbi, sizeof (d->fbi),
> +					   FileBasicInformation);
> +	  if (!NT_SUCCESS (status))
> +	    {
> +	      system_printf ("WARNING: %y = NtQueryInformationFile (%p,"
> +			     " BasicInfo, io.Status %y, %W)",
> +			     status, d->fhandle, iosb.Status, d->name);
> +	      d->fbi.FileAttributes = 0;
> +	      d->fbi.LastWriteTime = zero;
> +	    }
> +	}
> +
> +      if (d->fbi.LastWriteTime.QuadPart > newest.QuadPart)
> +	newest = d->fbi.LastWriteTime;

LastWriteTime?  This is supposed to check if a newer DLL replaced
an older in-use one, so shouldn't that be CreationTime?

> +/* Create the nominated forkable hardlinks and directories as necessary,
> +   as well as the .local file for dll-redirection. */
> +bool
> +dll_list::create_forkables ()
> +{
> +  bool success = true;
> +
> +  int lastsepcount = 1; /* we have trailing pathsep */
> +  for (namepart const *part = forkable_nameparts; part->text; ++part)
> +    if (part->create_dir)
> +      ++lastsepcount;
> +
> +  PWCHAR buf = dll::nt_max_path_buf ();
> +  wcsncpy (buf, forkables_dirx_name, NT_MAX_PATH);
> +
> +  if (!mkdirs (buf, &sec_none_nih, lastsepcount))

Again, sec_none_nih for the entire directory tree?

> +    success = false;
> +
> +  if (success)
> +    {
> +      /* create the DotLocal file as empty file */
> +      wcsncat (buf, main_executable->modname, NT_MAX_PATH);
> +      wcsncat (buf, L".local", NT_MAX_PATH);
> +      HANDLE hlocal = CreateFileW (buf, GENERIC_WRITE,
> +				   FILE_SHARE_READ, &sec_none_nih,
> +				   OPEN_ALWAYS, 0, NULL);

Please use NtCreateFile.  We're using CreateFile only for the console
and in one instance for pipes.


What's missing is user documentation for this feature.  A bit of
description what happens, what has to be done by the user, and what
limitations it has would be helpful.


Thanks,
Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Maintainer                 cygwin AT cygwin DOT com
Red Hat
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 819 bytes
Desc: not available
URL: <http://cygwin.com/pipermail/cygwin-developers/attachments/20151208/f0b15fe6/attachment.sig>


More information about the Cygwin-developers mailing list