This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.



Re: [RFA, doc RFA] Avoid calling gdb_realpath if basenames are different


> Date: Thu, 10 Nov 2011 15:58:46 -0800
> From: Doug Evans <dje@google.com>
> 
> 2011-11-10  Doug Evans  <dje@google.com>
> 
>         * NEWS: Mention new parameter basenames-may-differ.
>         * dwarf2read.c (dw2_lookup_symtab): Avoid calling gdb_realpath if
>         ! basenames_may_differ.
>         * psymtab.c (lookup_partial_symtab): Ditto.
>         * symtab.c (lookup_symtab): Ditto.
>         (basenames_may_differ): New global.
>         (_initialize_symtab): New parameter basenames-may-differ.
>         * symtab.h (basenames_may_differ): Declare.
> 
>         doc/
>         * gdb.texinfo (Files): Document basenames-may-differ.

Thanks.

> +set basenames-may-differ
> +show basenames-may-differ
> +  Set whether a source file may have multiple base names.
> +  A "base name" is the name of a file with the directory part removed.
> +  Example: The base name of "/home/user/hello.c" is "hello.c".
> +  When doing file name based lookups, gdb will canonicalize file names
> +  (e.g., expand symlinks) before comparing them, which is an expensive
> +  operation.
> +  If set, gdb will not assume a file is known by one base name, and thus
> +  it cannot optimize file name comparisons by skipping the canonicalization
> +  step if the base names are different.
> +  If not set, all source files must be known by one base name,
> +  and gdb will do file name comparisons more efficiently.

I suggest rearranging the text so that the parts describing what
happens when the option is set are kept together.  Like this:

  Set whether a source file may have multiple base names.
  (A "base name" is the name of a file with the directory part removed.
  Example: The base name of "/home/user/hello.c" is "hello.c".)
  If set, GDB will canonicalize file names (e.g., expand symlinks)
  before comparing them.  Canonicalization is an expensive operation,
  but it allows the same file to be known by more than one base name.
  If not set (the default), all source files are assumed to have just
  one base name, and GDB will do file name comparisons more efficiently.

OK?

> +When processing file names provided by the user,
> +@value{GDBN} will canonicalize them and remove symbolic links.
> +This ensures that @value{GDBN} will find the right file,
> +even if the debug information specifies an alternate path.
> +However, with large programs this canonicalization can noticeably slow
> +down @value{GDBN}.  To compensate, @value{GDBN} will try to avoid
> +this canonicalization wherever possible.  One way it can do so
> +is by first comparing the @samp{base name} of a file.
> +The @samp{base name} of a file is simply the file's name without
> +any directory information.  For example, the base name of
> +@file{/home/user/hello.c} is @file{hello.c}.
> +By doing this @value{GDBN} can skip, for example,
> +@file{/usr/include/stdio.h} without having to first canonicalize
> +and then compare the directory names.
> +This works great, except when the base name of a file
> +can have multiple names due to symbolic links.
> +For example, if @file{/home/user/bar.c} is a symbolic link to
> +@file{/home/user/foo.c} then @value{GDBN} cannot just look at
> +the base name of two files, it must canonicalize them, expand
> +all symbolic links, and @emph{then} compare the file names
> +to see if they match.
> +Fortunately, having one file known by two different base names
> +does not generally occur in practice.
> +Should it occur, however, @value{GDBN} provides an escape hatch
> +to allow this to work.
> +By setting @code{basenames-may-differ} to @code{true}
> +@value{GDBN} will always canonicalize file names before
> +comparing them, thus ensuring that one file known by multiple
> +base names is treated as the same file.

This is written mostly as an apology for having this option.  That is
the wrong angle for describing a feature in a user manual, because
users generally trust the developers to DTRT by default.  So I would
reword it like this:

  When processing file names provided by the user, @value{GDBN}
  frequently needs to compare them to the file names recorded in the
  program's debug info.  Normally, @value{GDBN} compares just the
  @dfn{base names} of the files as strings, which is reasonably fast
  even for very large programs.  (The base name of a file is the last
  portion of its name, after stripping all the leading directories.)
  This shortcut in comparison is based upon the assumption that files
  cannot have more than one base name.  This is usually true, but
  references to files that use symlinks or similar filesystem
  facilities violate that assumption.  If your program records files
  using such facilities, or if you provide file names to @value{GDBN}
  using symlinks etc., you can set @code{basenames-may-differ} to
  @code{true} to instruct @value{GDBN} to completely canonicalize each
  pair of file names it needs to compare.  This will make file-name
  comparisons accurate, but at the price of a significant slowdown.

Do you agree with this wording?
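
To make the trade-off concrete, here is a minimal sketch in C of the
comparison strategy discussed above.  It is not the code from the
patch: the helper names base_name_of and same_file are hypothetical,
the basenames_may_differ variable merely stands in for the proposed
setting, and POSIX realpath() is assumed as the canonicalization step
in place of whatever GDB actually calls.

```c
#define _XOPEN_SOURCE 700   /* for realpath (path, NULL) */
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the proposed setting; 0 corresponds to the default.  */
static int basenames_may_differ = 0;

/* Hypothetical helper: return the part of NAME after the last '/'.
   Only POSIX-style separators are handled in this sketch.  */
static const char *
base_name_of (const char *name)
{
  const char *slash = strrchr (name, '/');

  return slash != NULL ? slash + 1 : name;
}

/* Hypothetical helper: return 1 if A and B name the same file,
   0 otherwise.  */
static int
same_file (const char *a, const char *b)
{
  char *real_a, *real_b;
  int result;

  /* Identical strings need no further work.  */
  if (strcmp (a, b) == 0)
    return 1;

  /* The shortcut: if every file is assumed to have one base name,
     different base names mean different files, and the expensive
     canonicalization below can be skipped.  */
  if (!basenames_may_differ
      && strcmp (base_name_of (a), base_name_of (b)) != 0)
    return 0;

  /* The slow path: resolve symlinks, "." and ".." on both sides.
     realpath returns NULL if a name cannot be resolved, in which
     case the names are treated as different.  */
  real_a = realpath (a, NULL);
  real_b = realpath (b, NULL);
  result = (real_a != NULL && real_b != NULL
            && strcmp (real_a, real_b) == 0);
  free (real_a);
  free (real_b);
  return result;
}
```

With the variable left at 0, a call such as
same_file ("/usr/include/stdio.h", "/home/user/hello.c") returns
without touching the file system.  The cost is that a symlink like
/home/user/bar.c pointing to /home/user/foo.c is reported as a
different file until the variable is set to 1.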

