This is the mail archive of the gdb-patches@sourceware.org mailing list for the GDB project.


[RFC/ia64] memory error when reading wrong core file


Hello,

We noticed a slight change of behavior in the scenario below.
The general idea of our testcase is to verify that GDB warns the user
when it notices that the user tries to debug a core file that has not
been produced by the same executable as the one already loaded:

    % ./crash
    zsh: 344 abort (core dumped)  ./crash
    % gdb call_crash    <<<--- executable name is different
    (gdb) core core     <<<--- so wrong core file used here

We expect GDB to print a warning in that case, since the user probably
picked the wrong core:

    (gdb) core core
    warning: core file may not match specified executable file.
    [...]

However, the testcase that we use at AdaCore also checks various
other things while it is exercising core file support. In particular,
it checks that GDB is able to print the name of the executable that
produced the core file. Here is the output produced by GDB as of a
couple of weeks ago:

    (gdb) core core
    warning: core file may not match specified executable file.
    [New Thread 5437]
    [traces about symbols being read from shared libs]
    Core was generated by `./crash'.
    Program terminated with signal 6, Aborted.
    #0  0xa000000000010640 in __kernel_syscall_via_break ()

Contrast this with what we get today, on ia64-linux:

    (gdb) core core
    warning: core file may not match specified executable file.
    [New Thread 5437]
    Cannot access memory at address 0x1000000000009

The change of behavior is related to a patch that changed the way
the solib base address gets computed (that was for PIE, patch 12/15
I believe).  Prior to that patch, the computed base address was zero.
But I believe that this was totally by accident: since the solib base
is computed using the .dynamic data of the executable, and because
there is a discrepancy between the executable and the core file,
the base address means nothing in any case.

What I think is happening in this case is that we are a little less
lucky than before and end up tripping a memory error, whereas before
we happened to compute a null base address, which itself allowed us to
fall back on:

  /* If we can't find the dynamic linker's base structure, this
     must not be a dynamically linked executable.  Hmm.  */
  if (! info->debug_base)
    return svr4_default_sos ();
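
To make the two paths concrete, here is a rough, self-contained model
of the decision as I understand it before the fix.  Everything in it
is hypothetical (the toy_ names, the enum, the "readable" range); it
is not GDB code.  The base and offset used in the usage note are
arbitrary values chosen only so that their sum matches the address in
the error message above.

```c
#include <stdint.h>

/* Hypothetical model; these names are illustrative, not GDB's.  */
enum sos_outcome
{
  SOS_DEFAULT,        /* Fell back on the default SO list.  */
  SOS_FROM_LINK_MAP,  /* Walked the link map found in memory.  */
  SOS_MEMORY_ERROR    /* Read failed; the "core" command is aborted.  */
};

/* Assumption: pretend the core file only contains [0x1000, 0x2000).  */
static int
toy_readable (uint64_t addr)
{
  return addr >= 0x1000 && addr < 0x2000;
}

/* Model of the flow without any error handling around the read.  */
enum sos_outcome
toy_current_sos_before_fix (uint64_t debug_base, uint64_t r_map_offset)
{
  /* A null base is the "lucky" case: use the default list.  */
  if (!debug_base)
    return SOS_DEFAULT;

  /* A bogus non-null base leads to a read of unmapped memory.  */
  if (!toy_readable (debug_base + r_map_offset))
    return SOS_MEMORY_ERROR;

  return SOS_FROM_LINK_MAP;
}
```

With this model, toy_current_sos_before_fix (0, 8) takes the fallback,
while toy_current_sos_before_fix (0x1000000000001, 8) ends in the
memory error, because 0x1000000000009 is not readable.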

This explains the change of behavior, which is not entirely unreasonable.
That being said, I think that the new behavior is less useful for the user.
It was nice that GDB was able to print the name of the executable,
particularly in this case where the user probably just picked the wrong
core file.

What I suggest is to catch all errors while trying to read the shared
library map, and to try to continue without it.  Something like the attached?

It gives me the following output:

    (gdb) core core
    warning: exec file is newer than core file.
    [New Thread 5437]
    warning: Can't read pathname for load map: Input/output error.
    Reading symbols from /lib/tls/libc.so.6.1...(no debugging symbols found)...done.
    Loaded symbols for /lib/tls/libc.so.6.1
    Reading symbols from /lib/ld-linux-ia64.so.2...(no debugging symbols found)...done.
    Loaded symbols for /lib/ld-linux-ia64.so.2
    Core was generated by `./crash'.
    Program terminated with signal 6, Aborted.
    #0  0xa000000000010640 in __kernel_syscall_via_break ()

It pretty much brings back the previous behavior.  Although we still
compute a non-null solib base address, solib_svr4_r_map now notices
the memory error while trying to load the map, and returns zero.
As a result, the caller (svr4_current_sos) finds that its list of
SOs is empty, and thus falls back again on svr4_default_sos.
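
For readers who want to play with the idea outside of GDB, the
catch-and-return-zero pattern can be sketched in plain C with
setjmp/longjmp.  This is only an illustration under stated
assumptions: the toy_ names, the error_handler variable and the
"readable" address range are all made up, and the real change relies
on GDB's own exception machinery rather than on this code.

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these names exist in GDB.  */
typedef uint64_t core_addr;

/* Innermost active error handler, or NULL if there is none.  */
static jmp_buf *error_handler = NULL;

/* Assumption: only [0x1000, 0x2000) is readable in the "core file".
   Any other address raises an error by jumping to the handler.  */
static core_addr
toy_read_pointer (core_addr addr)
{
  if (addr < 0x1000 || addr >= 0x2000)
    longjmp (*error_handler, 1);   /* "Cannot access memory"  */
  return 0x1800;                   /* Fake link map address.  */
}

/* Same shape as the proposed change: start from zero, attempt the
   read, and leave the result at zero if the read raised an error.  */
core_addr
toy_r_map (core_addr debug_base, core_addr r_map_offset)
{
  jmp_buf here;
  jmp_buf *saved = error_handler;
  volatile core_addr addr = 0;

  error_handler = &here;
  if (setjmp (here) == 0)
    addr = toy_read_pointer (debug_base + r_map_offset);
  error_handler = saved;           /* Restore the outer handler.  */

  return addr;
}
```

Here toy_r_map (0x1000000000001, 8) returns zero instead of aborting,
so a caller can treat it exactly like the "no link map" case, while
toy_r_map (0x1000, 8) returns the fake link map address 0x1800.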

gdb/ChangeLog:

        * solib-svr4.c (solib_svr4_r_map): Catch all exceptions while
        reading the inferior memory, and return zero if an exception
        was raised.

As mentioned on IRC, I could not test this on ia64-linux with the official
testsuite, as the testsuite run immediately crashes the two machines on
which I tried (after which I was firmly prohibited from making any additional
attempt on any other machines). I did test it on this platform (ia64-linux)
but with AdaCore's testsuite.  I also ran the testsuite on x86_64-linux.
I asked Jan, if he has a moment, to try to run the testsuite on his side.

Any thoughts on this?

-- 
Joel
diff --git a/gdb/solib-svr4.c b/gdb/solib-svr4.c
index e497364..38ae126 100644
--- a/gdb/solib-svr4.c
+++ b/gdb/solib-svr4.c
@@ -835,9 +835,15 @@ solib_svr4_r_map (struct svr4_info *info)
 {
   struct link_map_offsets *lmo = svr4_fetch_link_map_offsets ();
   struct type *ptr_type = builtin_type (target_gdbarch)->builtin_data_ptr;
+  CORE_ADDR addr = 0;
+  volatile struct gdb_exception ex;
 
-  return read_memory_typed_address (info->debug_base + lmo->r_map_offset,
-				    ptr_type);
+  TRY_CATCH (ex, RETURN_MASK_ERROR)
+    {
+      addr = read_memory_typed_address (info->debug_base + lmo->r_map_offset,
+                                        ptr_type);
+    }
+  return addr;
 }
 
 /* Find r_brk from the inferior's debug base.  */