This is the mail archive of the gdb@sourceware.cygnus.com mailing list for the GDB project.



Re: Regarding Range Table



[To gdb@sourceware readers: Sekaran is working on helping GDB debug
optimized code.  He would like to extend the way GDB represents
variables that live in different places at different times --- for
example, a variable that mostly lives on the stack, except in the body
of a particular loop, where it is brought into a register.

GDB represents such variables using multiple struct symbol objects,
one for each home the variable occupies.  The `ranges' element of the
`struct symbol' lists the addresses at which that `struct symbol' is
appropriate.

This is a weird representation, since you can have several struct
symbol objects corresponding to a single variable.  It would be more
intuitive to identify each variable with a single `struct symbol'
object, and have that object carry a mapping from code addresses to
homes.  But here we are.]



Sekaran Nanja <snanja@cup.hp.com> writes:
 
Jim> Would it be okay if we had this discussion on gdb@sourceware.cygnus.com?

Sekaran> I think it is OK.

Jim> It's fine with me to extend the range_list struct, as long as the new
Jim> semantics are very clear.  If they're not well-explained, or they're
Jim> muddy, then we won't be able to maintain it.
Jim>
Jim> So I need to understand exactly what you want to do to symbols and
Jim> their range lists, and how to interpret the resulting structures.

Sekaran> Please note that HP captures the following information for live
Sekaran> ranges of variables: Symbol Address (Register - but this can
Sekaran> possibly change for different code ranges)

Jim> This is just a mapping from code addresses onto variable homes, right?
Jim> To represent this information, no changes to GDB's symbol structures
Jim> are necessary, right?

Sekaran> We don't need to change the symbol struct but we need to
Sekaran> change range_list struct to support this.  Please note that
Sekaran> Register value for the assigned symbol may change in code
Sekaran> ranges and so we need to keep the address info. in
Sekaran> range_list. I am planning to change the struct range_list as
Sekaran> follows:
Sekaran> 
Sekaran> struct range_list {
Sekaran>         unsigned int  set_early : 1;
Sekaran>         unsigned int  set_late : 1;
Sekaran>         unsigned int  unknown : 1;
Sekaran>         unsigned int  reserved : 29;
Sekaran>
Sekaran>         enum address_class  aclass BYTE_BITFIELD;
Sekaran>         CORE_ADDR  symbol_address;
Sekaran>         int  comes_from_line;
Sekaran>         int  moved_to_line;
Sekaran>
Sekaran>         CORE_ADDR  start;
Sekaran>         CORE_ADDR  end;
Sekaran>         struct range_list  *next;
Sekaran> };

Sekaran> Set Early/ Set Late - Flag for critical point assignment (To provide
Sekaran> useful warning to the user) and possibly associated line numbers for
Sekaran> these.
 
Jim> I don't understand what this is.  Could you explain it in more
Jim> detail?

Sekaran> Please note that due to optimization, critical assignment
Sekaran> statement(s) can possibly be moved around.  In these cases, the
Sekaran> user needs to be warned regarding this optimization by
Sekaran> displaying a warning that either the assignment has
Sekaran> taken place earlier at a specific line or it is going to
Sekaran> happen at a specific line.  Please note that the following
Sekaran> newly added fields to range_list are used to support this
Sekaran> feature:
Sekaran>
Sekaran> set_early
Sekaran> set_late
Sekaran> comes_from_line
Sekaran> moved_to_line.


I am still not clear on what set_early, set_late, comes_from_line and
moved_to_line would mean, but I am guessing that they are meant to
handle cases like these:

- a variable which is live in the source code at a given point is not
  actually live in the optimized code which the user is debugging,
  because the compiler has moved an assignment to the variable from
  before the current source line to after the current source line.

- a variable is live in the source code at a given point, but all its
  uses have been moved above it, along with *another* assignment to
  the variable.  So the variable is live, but its value isn't going to
  go where you'd expect.

(Here, "before" and "after" refer to control flow, not to source
positions).

Is that correct?

GDB's representation for split live ranges is not very intuitive, and
I don't think you're using it correctly.

GDB uses a separate struct symbol for each home a variable might have.
A struct symbol's `ranges' list lists those address ranges over which
the variable home given in that struct symbol applies.

So, for example, if a variable `foo' usually lives on the stack, but
lives in register 5 from code addresses 0x1020 to 0x1030, and
register 6 from code addresses 0x1040 to 0x1050, then you would have:

- the primary struct symbol, whose aclass is LOC_LOCAL, and whose
  SYMBOL_VALUE is the offset from the frame base, whose `ranges' list
  is empty, and whose `aliases' list contains two struct symbols:

  - the first alias symbol, whose aclass is LOC_REGISTER and whose
    SYMBOL_VALUE is 5, and whose `ranges' list contains the single
    element, {start=0x1020, end=0x1030, next=0}

  - the second alias symbol, whose aclass is LOC_REGISTER and whose
    SYMBOL_VALUE is 6, and whose `ranges' list contains the single
    element, {start=0x1040, end=0x1050, next=0}.

All of these struct symbols have the name `foo', and all refer to the
same variable.  (I think this is crummy, but that's the way it is.)
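
To make the layout above concrete, here is a minimal C sketch of the
three `foo' symbols.  The struct definitions are simplified stand-ins,
not GDB's actual declarations.  The frame offset of -8, the inclusive
treatment of `end', and the helper names pc_in_ranges, symbol_for_pc
and build_foo_example are all assumptions made for illustration.

```c
#include <stddef.h>

typedef unsigned long CORE_ADDR;

enum address_class { LOC_LOCAL, LOC_REGISTER };

/* One address range over which a symbol's home applies. */
struct range_list {
  CORE_ADDR start;
  CORE_ADDR end;              /* assumed inclusive in this sketch */
  struct range_list *next;
};

struct alias_list;

/* Simplified stand-in for GDB's struct symbol. */
struct symbol {
  const char *name;
  enum address_class aclass;
  long value;                 /* frame offset or register number */
  struct range_list *ranges;  /* NULL on the primary symbol */
  struct alias_list *aliases; /* other homes of the same variable */
};

struct alias_list {
  struct symbol *sym;
  struct alias_list *next;
};

/* Return 1 if pc falls inside any range on the list. */
int pc_in_ranges(const struct range_list *r, CORE_ADDR pc)
{
  for (; r != NULL; r = r->next)
    if (pc >= r->start && pc <= r->end)
      return 1;
  return 0;
}

/* Hypothetical helper: pick the struct symbol whose home is valid at
   pc, falling back to the primary when no alias range matches. */
struct symbol *symbol_for_pc(struct symbol *primary, CORE_ADDR pc)
{
  struct alias_list *a;
  for (a = primary->aliases; a != NULL; a = a->next)
    if (pc_in_ranges(a->sym->ranges, pc))
      return a->sym;
  return primary;
}

/* Build the `foo' example: stack home plus two register aliases. */
struct symbol *build_foo_example(void)
{
  static struct range_list r5 = { 0x1020, 0x1030, NULL };
  static struct range_list r6 = { 0x1040, 0x1050, NULL };
  static struct symbol reg5 = { "foo", LOC_REGISTER, 5, &r5, NULL };
  static struct symbol reg6 = { "foo", LOC_REGISTER, 6, &r6, NULL };
  static struct alias_list a6 = { &reg6, NULL };
  static struct alias_list a5 = { &reg5, &a6 };
  /* -8 is a made-up frame offset. */
  static struct symbol primary = { "foo", LOC_LOCAL, -8, NULL, &a5 };
  return &primary;
}
```

With this layout, a lookup at 0x1025 yields the register 5 alias, a
lookup at 0x1048 yields the register 6 alias, and any other address
falls back to the primary stack symbol.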

In other words, when a variable has several homes, GDB doesn't
represent this using a single struct symbol with several elements in
its `ranges' list, each one specifying a particular home for that
range.  Instead, GDB represents this using multiple `struct symbol'
objects, each of which probably has a single element in its `ranges'
list.

The only case where a symbol's `ranges' list would have more than one
element is if the variable has the same home in several distinct
ranges.

An OP_VAR_VALUE node of a GDB `struct expression' will contain a
reference to the symbol object appropriate for the location at which
the expression will be evaluated --- it could be the primary or an
alias.  The expression evaluator assumes that the home in the struct
symbol is correct.

So instead of adding `aclass' and `symbol_address' to struct
range_list, it seems to me that you should instead build a separate
struct symbol for each home the variable might have.

It makes more sense to me to add your new info to the struct symbol
itself, so that value_of_variable can check it and complain
appropriately.  It would be nice to do it in a way which doesn't use
much space if we don't have this kind of information for the symbol.
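
One way this might look, as a rough sketch only: hang the new fields
off the symbol through a pointer that is null when the compiler
supplied no such information, so ordinary symbols pay for one pointer
and nothing more.  The struct opt_info type, the `opt' member and the
function describe_moved_assignment are hypothetical names, the struct
symbol here is a cut-down stand-in, and the reading of set_early and
set_late is my guess at the intended semantics.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical: optimization info, allocated only when present. */
struct opt_info {
  unsigned int set_early : 1;  /* guess: assignment already executed */
  unsigned int set_late : 1;   /* guess: assignment not yet executed */
  int comes_from_line;         /* source line the assignment came from */
  int moved_to_line;           /* source line it was moved to */
};

/* Cut-down stand-in for struct symbol; only the new member matters. */
struct symbol {
  const char *name;
  struct opt_info *opt;        /* NULL when there is no such info */
};

/* Write a warning into buf if the symbol's assignment was moved.
   Returns 1 if a warning was produced, 0 otherwise.  A caller such
   as value_of_variable could print the text before the value. */
int describe_moved_assignment(const struct symbol *sym,
                              char *buf, size_t len)
{
  const struct opt_info *oi = sym->opt;

  if (oi == NULL)
    return 0;
  if (oi->set_early) {
    snprintf(buf, len,
             "warning: assignment to `%s' from line %d has already "
             "been executed", sym->name, oi->comes_from_line);
    return 1;
  }
  if (oi->set_late) {
    snprintf(buf, len,
             "warning: assignment to `%s' was moved to line %d and "
             "has not been executed yet", sym->name, oi->moved_to_line);
    return 1;
  }
  return 0;
}
```

Since each home already has its own struct symbol, the info attached
to a given symbol would describe the variable only over that symbol's
ranges.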

What do you think?
