This is the mail archive of the
gdb-patches@sourceware.org
mailing list for the GDB project.
Re: [PATCH 7/8] Validate symbol file using build-id.
- From: Aleksandar Ristovski <aristovski at qnx dot com>
- To: Jan Kratochvil <jan dot kratochvil at redhat dot com>
- Cc: <gdb-patches at sourceware dot org>
- Date: Tue, 16 Apr 2013 11:15:00 -0400
- Subject: Re: [PATCH 7/8] Validate symbol file using build-id.
- References: <1365521265-28870-1-git-send-email-ARistovski at qnx dot com> <1365521265-28870-8-git-send-email-ARistovski at qnx dot com> <20130414141740 dot GE23227 at host2 dot jankratochvil dot net>
On 13-04-14 10:17 AM, Jan Kratochvil wrote:
On Tue, 09 Apr 2013 17:27:44 +0200, Aleksandar Ristovski wrote:
[...]
--- a/gdb/solib-svr4.c
+++ b/gdb/solib-svr4.c
[...]
@@ -871,6 +873,109 @@ svr4_keep_data_in_core (CORE_ADDR vaddr, unsigned long size)
return (name_lm >= vaddr && name_lm < vaddr + size);
}
+/* Validate SO by comparing build-id from the associated bfd and
+ corresponding build-id from target memory. */
+
+static int
+svr4_validate (const struct so_list *const so)
+{
+ gdb_byte *build_id;
+ size_t build_idsz;
+
+ gdb_assert (so != NULL);
+
+ if (so->abfd == NULL)
+ return 1;
+
+ if (!bfd_check_format (so->abfd, bfd_object)
+ || bfd_get_flavour (so->abfd) != bfd_target_elf_flavour
+ || elf_tdata (so->abfd)->build_id == NULL)
+ return 1;
+
+ build_id = so->build_id;
+ build_idsz = so->build_idsz;
+
+ if (build_id == NULL)
+ {
+ /* Get build_id from NOTE_GNU_BUILD_ID_NAME section.
+ This is a fallback mechanism for targets that do not
+ implement TARGET_OBJECT_SOLIB_SVR4. */
+
+ const asection *const asec
+ = bfd_get_section_by_name (so->abfd, NOTE_GNU_BUILD_ID_NAME);
+ ULONGEST bfd_sect_size;
+
+ if (asec == NULL)
+ return 1;
+
+ bfd_sect_size = bfd_get_section_size (asec);
+
+ if ((asec->flags & SEC_LOAD) == SEC_LOAD
+ && bfd_sect_size != 0
+ && strcmp (bfd_section_name (asec->bfd, asec),
+ NOTE_GNU_BUILD_ID_NAME) == 0)
+ {
+ const enum bfd_endian byte_order
+ = gdbarch_byte_order (target_gdbarch ());
+ Elf_External_Note *const note = xmalloc (bfd_sect_size);
+ gdb_byte *const note_raw = (void *) note;
+ struct cleanup *cleanups = make_cleanup (xfree, note);
+
+ if (target_read_memory (bfd_get_section_vma (so->abfd, asec)
+ + lm_addr_check (so, so->abfd),
I see it is the most easy one how to get the target VMA of the options we were
discussing.
I do not parse this unambiguously... did you mean this is the easiest
way of getting target VMA? If so, I agree.
+ note_raw, bfd_sect_size) == 0)
+ {
+ build_idsz
+ = extract_unsigned_integer ((gdb_byte *) note->descsz,
+ sizeof (note->descsz),
+ byte_order);
+
+ if (build_idsz == elf_tdata (so->abfd)->build_id->size)
+ {
+ const char gnu[4] = "GNU\0";
+
+ if (memcmp (note->name, gnu, 4) == 0)
+ {
+ ULONGEST namesz
+ = extract_unsigned_integer ((gdb_byte *) note->namesz,
+ sizeof (note->namesz),
+ byte_order);
+ CORE_ADDR build_id_offs;
+
+ /* Rounded to next sizeof (ElfXX_Word). */
+ namesz = ((namesz + (sizeof (note->namesz) - 1))
+ & ~((ULONGEST) (sizeof (note->namesz) - 1)));
Like in the previous patch the rounding should be 4 bytes according to the
standard. Not a sizeof of anything.
Yes, I re-read it, you are right: it is 4 byte aligned. Somehow I
remembered it as Word aligned.
+ build_id_offs = (sizeof (note->namesz)
+ + sizeof (note->descsz)
+ + sizeof (note->type) + namesz);
I find more simple/clear to use offsetof (which is even more correct due to
possible alignments) but I do not mind either way.
Ok, offsetof looks good.
+ build_id = xmalloc (build_idsz);
I would say that build_id should be protected by cleanups but there is
currently no risky function called between here and BUILD_ID comparison below
so I do not mind.
Ok, and yes, this is why I did not put any cleanups (explicit xfree
looked straight forward).
+ memcpy (build_id, note_raw + build_id_offs, build_idsz);
+ }
+ }
+ else
+ warning (_("Malformed %s note\n"), NOTE_GNU_BUILD_ID_NAME);
Again some better identification, vaddr, in fact also PID should be printed in
all these cases.
I think you missed updated patch:
http://sourceware.org/ml/gdb-patches/2013-04/msg00271.html
+
+ }
+ do_cleanups (cleanups);
+ }
+ }
+
+ if (build_id != NULL)
+ {
+ const int match
+ = elf_tdata (so->abfd)->build_id->size == build_idsz
+ && memcmp (build_id, elf_tdata (so->abfd)->build_id->data,
+ elf_tdata (so->abfd)->build_id->size) == 0;
GNU Coding Standards require parentheses on multiple-line expressions:
const int match
= (elf_tdata (so->abfd)->build_id->size == build_idsz
&& memcmp (build_id, elf_tdata (so->abfd)->build_id->data,
elf_tdata (so->abfd)->build_id->size) == 0);
Ok.
Also it would be apropriate to use a temporary variable
struct elf_build_id *somename = elf_tdata (so->abfd)->build_id;
than to repeat the long expression again and again.
Also such variable should be at the top of this function as there is already
accessed 'elf_tdata (so->abfd)->build_id' at the top of this function.
Ok.
+
+ if (build_id != so->build_id)
+ xfree (build_id);
+
+ return match;
+ }
+
+ return 1;
+}
+
/* Implement the "open_symbol_file_object" target_so_ops method.
If no open symbol file, attempt to locate and open the main symbol
@@ -999,6 +1104,9 @@ library_list_start_library (struct gdb_xml_parser *parser,
ULONGEST *lmp = xml_find_attribute (attributes, "lm")->value;
ULONGEST *l_addrp = xml_find_attribute (attributes, "l_addr")->value;
ULONGEST *l_ldp = xml_find_attribute (attributes, "l_ld")->value;
+ const struct gdb_xml_value *const att_build_id
+ = xml_find_attribute (attributes, "build-id");
+ const char *const hex_build_id = att_build_id ? att_build_id->value : NULL;
struct so_list *new_elem;
new_elem = XZALLOC (struct so_list);
@@ -1010,6 +1118,26 @@ library_list_start_library (struct gdb_xml_parser *parser,
strncpy (new_elem->so_name, name, sizeof (new_elem->so_name) - 1);
new_elem->so_name[sizeof (new_elem->so_name) - 1] = 0;
strcpy (new_elem->so_original_name, new_elem->so_name);
+ if (hex_build_id != NULL)
+ {
+ const size_t hex_build_id_len = strlen (hex_build_id);
+
+ if (hex_build_id_len > 0)
+ {
+ new_elem->build_id = xmalloc (hex_build_id_len / 2);
+ new_elem->build_idsz = hex2bin (hex_build_id, new_elem->build_id,
+ hex_build_id_len);
The hex2bin COUNT parameter is in bytes, not hex characters (divide it by 2).
Ok, thanks.
+ if (new_elem->build_idsz != (hex_build_id_len / 2))
Already written before:
Check it the opposite way:
if (2 * new_elem->build_idsz != hex_build_id_len)
as otherwise odd length of the "build-id" attribute string will not be caught
as invalid.
Ok, re-arranged so we only try to convert if hex_build_id_len is even
number.
+ {
+ warning (_("Gdbserver returned invalid hex encoded build_id '%s'"
Use "gdbserver", not capitalized.
+ "(%zu/%zu)\n"),
Already written before:
I would omit this (%zu/%zu) part. Primarily currently %z is not allowed as it
is not compatible with some OSes, there is a plan to import %z printf support
in gdb/gnulib/ but so far there wasn't a need for it.
Besides that you should describe what the two numbers mean otherwise they are
mostly useless. And after all when you already print the XML string the
numbers do not add any more info to it.
THis warning was removed in the updated patch:
http://sourceware.org/ml/gdb-patches/2013-04/msg00271.html
+ hex_build_id, hex_build_id_len, new_elem->build_idsz);
+ xfree (new_elem->build_id);
+ new_elem->build_id = NULL;
+ new_elem->build_idsz = 0;
Already written before:
I am not completely sure warning cannot throw, better to do these cleanups
before the warning call (moreover when %zu/%zu will no longer be printed).
Not applicable to the updated patch:
http://sourceware.org/ml/gdb-patches/2013-04/msg00271.html
...
+ warning (_("Shared object \"%s\" could not be validated "
+ "and will be ignored."), so->so_name);
Maybe the warning should by printed by svr4_validate stating the build-id does
not match (even possibly printing the both build-ids). And then sure one can
remove the warning here.
I was thinking about that, but then decided that svr4_validate should
simply provide an answer, not print warnings. User of the function
should warn if appropriate (as is done in the patch).
This is along the lines of (see below) using build-id for e.g. finding
separate debug info etc...
+ gdb_bfd_unref (so->abfd);
+ so->abfd = NULL;
+ return 0;
+ }
+
if (build_section_table (abfd, &so->sections, &so->sections_end))
{
error (_("Can't find the file sections in `%s': %s"),
@@ -551,6 +562,7 @@ free_so (struct so_list *so)
{
struct target_so_ops *ops = solib_ops (target_gdbarch ());
+ xfree (so->build_id);
free_so_symbols (so);
ops->free_so (so);
I already wrote:
The problem is there is:
xfree (so->build_id);
in free_so() but it should be in free_so_symbols instead. free_so_symbols is
called also from reload_shared_libraries_1 where so_list->abfd can change.
Then obviously one should also set so->build_id = NULL there.
I'm not sure I follow this. In cases where so->build_id is determined it
is good for the lifetime of the 'so', not symbols. It can not change for
the given 'so'.
I was thinking making the BUILD_ID data private so solib-svr4.c but that is
currently not possible, lm_info is private for solib-svr4.c but that has
lifetime of so_list, not the lifetime of so_list->abfd.
I was thinking slightly differently: build-id should be generalized to
represent "an ID". For svr4, it is build-id, for some other system it
might be something else, but if present, should represent the connection
between target binary, binary at hand and associated optional separate
debug info.
Thanks,
---
Aleksandar