This is the mail archive of the gdb-patches@sources.redhat.com mailing list for the GDB project.



Re: [PATCH] Add support for tracking/evaluating dwarf2 location expressions


Jim Blandy <jimb@zwingli.cygnus.com> writes:

> Daniel Berlin <dberlin@redhat.com> writes:
> > This patch adds a place in the symbol structure to store the dwarf2
> > location (an upcoming dwarf2read huge rewrite patch uses it), an
> > evaluation machine to findvar.c, and has read_var_value use the dwarf2
> > location if it's available.
> 
> I think this is the wrong approach to the right idea.
> 
> I think it would be a great idea to give GDB more powerful ways to
> describe the locations of variables, structure fields, and so on.
> This would be a clean alternative to including special-purpose,
> ABI-specific logic in GDB.
Yup.

> 
> For example, one of the nice properties of the way GDB's `struct type'
> represents a structure type in the present code is that it's mostly
> ABI-independent.  Since the debug info spells out the exact position
> of each structure member, GDB doesn't have to know the ABI's structure
> layout rules, alignment requirements, and so on.
> 
> However, if we want to support virtual base classes, it's not clear
> how to extend that structure appropriately.  In the V2 C++ ABI, each
> object contained pointers to its virtual bases.  In the V3 C++ ABI, an
> object's virtual table provides offsets to its virtual bases.  At the
> moment, our `solution' is to provide ABI-specific code to find your
> virtual bases.
> 
> If GDB supported a fully general location expression language, like
> Dwarf 2 location expressions, then `struct type' could simply provide
> a location expression to find each member or base class of the object.
> In most cases, these location expressions would be trivial, but we
> could easily encode both the V2 and V3 C++ ABI behaviors in this
> manner.  It would then become the compiler's responsibility to
> describe the ABI it was using.  We could take a similar approach to
> finding virtual functions.
> 
> 
> That said, I think the core of GDB should be independent of the
> debugging information format.  If GDB does adopt a fully general
> location expression language, then it should have a self-contained
> description, which can be designed to suit GDB's needs.  Then we can
> translate Dwarf2 expressions into that form.

Sure.

However, IMHO, the location expression language that dwarf2 uses is
self-contained and suited to GDB's needs. Perfectly, in fact. It
provides no more and no less than is necessary to describe our locations,
except where it adds opcodes to save a little space in real use (see
below), and it's trivial to convert other expression languages into it.
The location list is simply a list of ranges and location expressions,
which is what we would have come up with anyway, judging from GDB's
structures.
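
To make "a list of ranges and location expressions" concrete, here is a
minimal sketch in C. The names (loc_expr, loc_range, loc_list_find) are
hypothetical and come from neither GDB nor the patch, and treating each
range as half-open, [low_pc, high_pc), is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical: one encoded location expression (bytes plus length).  */
struct loc_expr
{
  unsigned int size;
  const unsigned char *data;
};

/* Hypothetical: the expression that is valid while low_pc <= pc < high_pc.  */
struct loc_range
{
  uint64_t low_pc;
  uint64_t high_pc;
  struct loc_expr expr;
};

/* Return the expression covering PC, or NULL if no range in LIST does.  */
static const struct loc_expr *
loc_list_find (const struct loc_range *list, size_t n, uint64_t pc)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (pc >= list[i].low_pc && pc < list[i].high_pc)
      return &list[i].expr;
  return NULL;
}
```

A variable that lives in a register for part of a function and on the
stack for the rest would simply get two entries in such a list.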

In fact, I'd wager that if we tried, we'd come up with something almost
exactly like dwarf2 location expressions:

We would want some kind of simple, stack-based language, with normal
binary and unary arithmetic operations and normal stack operations
(rotate, pop, drop, etc).
Past these, we'd want:
- A way to dereference the value on top of the stack as if it were a
  memory address.
- A way to load constant values onto the stack.
- A way to load register values onto the stack.
- An efficient encoding, able to handle differing sizes for data types
  on different platforms.

Well, there ya go, you now have dwarf2 location expressions, possibly
with different opcode names, and maybe a few more or a few less
arithmetic operations, and a few more or a few less built in opcodes
for common operations, to save space or not save space (the base
register+offset opcodes, the literal 1-31 opcodes, the register 1-31
opcodes, the fbreg opcode).
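
As a rough illustration of how small such a language is, here is a toy
evaluator in C. It is only a sketch: the opcode names, the toy_target
structure, and the word-addressed fake memory are all made up for this
example and are not the DWARF 2 opcodes or anything from GDB.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical opcodes for illustration; not the DWARF 2 values.  */
enum toy_op { TOY_LIT, TOY_BREG, TOY_PLUS, TOY_DEREF };

struct toy_insn
{
  enum toy_op op;
  int reg;          /* register number, used by TOY_BREG only */
  int64_t operand;  /* constant for TOY_LIT, offset for TOY_BREG */
};

/* Fake target state standing in for a real frame.  */
struct toy_target
{
  uint64_t regs[4];
  uint64_t mem[16]; /* word-addressed, to keep the sketch short */
};

/* Run PROG and store the top of stack in *OUT.  Return 0 on success,
   -1 on stack underflow/overflow, a bad register, or a bad address.  */
static int
toy_eval (const struct toy_insn *prog, size_t n,
          const struct toy_target *t, uint64_t *out)
{
  uint64_t stack[16];
  size_t sp = 0;
  size_t i;

  for (i = 0; i < n; i++)
    {
      uint64_t result;

      switch (prog[i].op)
        {
        case TOY_LIT:   /* push a constant */
          result = (uint64_t) prog[i].operand;
          break;
        case TOY_BREG:  /* push register + offset */
          if (prog[i].reg < 0 || prog[i].reg >= 4)
            return -1;
          result = t->regs[prog[i].reg] + (uint64_t) prog[i].operand;
          break;
        case TOY_PLUS:  /* pop two entries, push their sum */
          if (sp < 2)
            return -1;
          result = stack[sp - 2] + stack[sp - 1];
          sp -= 2;
          break;
        case TOY_DEREF: /* pop an address, push the word stored there */
          if (sp < 1 || stack[sp - 1] >= 16)
            return -1;
          result = t->mem[stack[--sp]];
          break;
        default:
          return -1;
        }
      if (sp >= sizeof (stack) / sizeof (*stack))
        return -1;
      stack[sp++] = result;
    }
  if (sp == 0)
    return -1;
  *out = stack[sp - 1];
  return 0;
}
```

With register 2 holding 8 and mem[5] holding 12, the program "breg2 -3;
deref; lit 30; plus" leaves 42 on top of the stack.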

Why bother not just using something we already have a standard for, if
it's going to be 99.9% the same?

For fun, I converted the stabs reader to generate these expressions,
in fact.  Didn't take long at all.

For proof it's self-contained, it's the exception handling format used
by gcc 2.95 and 3 (on a non-SJLJ platform, of course).

libgcc in 2.95 (it was moved to libsupc++ in 3.0,  so that it only
gets linked to c++ programs), in fact, contains the evaluator, and
it's linked with every gcc compiled program.

The only thing DWARF2'ish about the location expressions is that they
are defined in the DWARF2 spec.  They depend in no way on dwarf2 at
all. 

In fact, here's an evaluator for them for GDB.
Feel free to point out any dwarf2-specific parts. struct dwarf_block,
which, besides a prototype for evaluate_dwarf2_locdesc, is the only
thing in dwarf2eval.h, is just:

struct dwarf_block
{
        unsigned int size;
        unsigned char *data;
};

Hardly dwarf2 specific; it's just the chunk of location expression
data, and the size, so we know where it ends.

LEB128 (little endian base 128) is also used by Intel's IA64 unwind
format, which uses slightly different, incompatible opcodes to save
space, so LEB128 isn't a dwarf2-specific thing either.

This code handles the complete language.



#include "defs.h"
#include "symtab.h"
#include "frame.h"
#include "value.h"
#include "gdbcore.h"
#include "elf/dwarf2.h"
#include "dwarf2eval.h"

/* Decode the unsigned LEB128 constant at BUF into the variable pointed to
   by R, and return the new value of BUF.  */

static unsigned char *
read_uleb128 (unsigned char *buf, ULONGEST *r)
{
  unsigned shift = 0;
  ULONGEST result = 0;

  while (1)
    {
      unsigned char byte = *buf++;
      result |= (ULONGEST) (byte & 0x7f) << shift;
      if ((byte & 0x80) == 0)
	break;
      shift += 7;
    }
  *r = result;
  return buf;
}

/* Decode the signed LEB128 constant at BUF into the variable pointed to
   by R, and return the new value of BUF.  */

static unsigned char *
read_sleb128 (unsigned char *buf, LONGEST *r)
{
  unsigned shift = 0;
  LONGEST result = 0;
  unsigned char byte;
  
  while (1)
    {
      byte = *buf++;
      result |= (LONGEST) (byte & 0x7f) << shift;
      shift += 7;
      if ((byte & 0x80) == 0)
	break;
    }
  if (shift < (sizeof (*r) * 8) && (byte & 0x40) != 0)
    result |= - ((LONGEST) 1 << shift);
  
  *r = result;
  return buf;
}
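
/* Evaluate the location expression in [OP_PTR, OP_END) for VAR in the
   context of FRAME, with INITIAL as the first stack entry, and return
   the value left on top of the stack.  If CURRVAL is non-NULL and the
   expression uses DW_OP_reg0..DW_OP_reg31, the register contents are
   also stored into *CURRVAL, which is then marked not_lval.  */

static CORE_ADDR
execute_stack_op (struct symbol *var, unsigned char *op_ptr,
		  unsigned char *op_end, struct frame_info *frame,
		  CORE_ADDR initial, value_ptr *currval)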
{
  CORE_ADDR stack[64];	/* ??? Assume this is enough. */
  int stack_elt;
  stack[0] = initial;
  stack_elt = 1;
  while (op_ptr < op_end)
    {
      enum dwarf_location_atom op = *op_ptr++;
      ULONGEST result, reg;
      LONGEST offset;
      switch (op)
	{
	case DW_OP_lit0:
	case DW_OP_lit1:
	case DW_OP_lit2:
	case DW_OP_lit3:
	case DW_OP_lit4:
	case DW_OP_lit5:
	case DW_OP_lit6:
	case DW_OP_lit7:
	case DW_OP_lit8:
	case DW_OP_lit9:
	case DW_OP_lit10:
	case DW_OP_lit11:
	case DW_OP_lit12:
	case DW_OP_lit13:
	case DW_OP_lit14:
	case DW_OP_lit15:
	case DW_OP_lit16:
	case DW_OP_lit17:
	case DW_OP_lit18:
	case DW_OP_lit19:
	case DW_OP_lit20:
	case DW_OP_lit21:
	case DW_OP_lit22:
	case DW_OP_lit23:
	case DW_OP_lit24:
	case DW_OP_lit25:
	case DW_OP_lit26:
	case DW_OP_lit27:
	case DW_OP_lit28:
	case DW_OP_lit29:
	case DW_OP_lit30:
	case DW_OP_lit31:
	  result = op - DW_OP_lit0;
	  break;
	  
	case DW_OP_addr:
	  result = (CORE_ADDR) extract_unsigned_integer (op_ptr, TARGET_PTR_BIT / TARGET_CHAR_BIT);
	  op_ptr += TARGET_PTR_BIT / TARGET_CHAR_BIT;
	  break;
	  
	case DW_OP_const1u:
	  result = extract_unsigned_integer (op_ptr, 1);
	  op_ptr += 1;
	  break;
	case DW_OP_const1s:
	  result = extract_signed_integer (op_ptr, 1);
	  op_ptr += 1;
	  break;
	case DW_OP_const2u:
	  result = extract_unsigned_integer (op_ptr, 2);
	  op_ptr += 2;
	  break;
	case DW_OP_const2s:
	  result = extract_signed_integer (op_ptr, 2);
	  op_ptr += 2;
	  break;
	case DW_OP_const4u:
	  result = extract_unsigned_integer (op_ptr, 4);
	  op_ptr += 4;
	  break;
	case DW_OP_const4s:
	  result = extract_signed_integer (op_ptr, 4);
	  op_ptr += 4;
	  break;
	case DW_OP_const8u:
	  result = extract_unsigned_integer (op_ptr, 8);
	  op_ptr += 8;
	  break;
	case DW_OP_const8s:
	  result = extract_signed_integer (op_ptr, 8);
	  op_ptr += 8;
	  break;
	case DW_OP_constu:
	  op_ptr = read_uleb128 (op_ptr, &result);
	  break;
	case DW_OP_consts:
	  op_ptr = read_sleb128 (op_ptr, &offset);
	  result = offset;
	  break;

	case DW_OP_reg0:
	case DW_OP_reg1:
	case DW_OP_reg2:
	case DW_OP_reg3:
	case DW_OP_reg4:
	case DW_OP_reg5:
	case DW_OP_reg6:
	case DW_OP_reg7:
	case DW_OP_reg8:
	case DW_OP_reg9:
	case DW_OP_reg10:
	case DW_OP_reg11:
	case DW_OP_reg12:
	case DW_OP_reg13:
	case DW_OP_reg14:
	case DW_OP_reg15:
	case DW_OP_reg16:
	case DW_OP_reg17:
	case DW_OP_reg18:
	case DW_OP_reg19:
	case DW_OP_reg20:
	case DW_OP_reg21:
	case DW_OP_reg22:
	case DW_OP_reg23:
	case DW_OP_reg24:
	case DW_OP_reg25:
	case DW_OP_reg26:
	case DW_OP_reg27:
	case DW_OP_reg28:
	case DW_OP_reg29:
	case DW_OP_reg30:
	case DW_OP_reg31:	
	  {
	    /* Allocate a buffer to store the register data */
	    char *buf = (char*) alloca (MAX_REGISTER_RAW_SIZE);	   	    

	    get_saved_register (buf, NULL, NULL, frame, op - DW_OP_reg0, NULL);
	    result = extract_address (buf, REGISTER_RAW_SIZE (op - DW_OP_reg0));
	    if (currval != NULL)
	      {
		store_typed_address (VALUE_CONTENTS_RAW (*currval), SYMBOL_TYPE (var), result);
		VALUE_LVAL (*currval) = not_lval;
	      }
	  }
	  break;
	case DW_OP_regx:   
	  {
	    char *buf = (char*) alloca (MAX_REGISTER_RAW_SIZE);    

	    op_ptr = read_uleb128 (op_ptr, &reg);
	    get_saved_register (buf, NULL, NULL, frame, reg, NULL);
	    result = extract_address (buf, REGISTER_RAW_SIZE (reg));
	  }
	  break;
	case DW_OP_breg0:
	case DW_OP_breg1:
	case DW_OP_breg2:
	case DW_OP_breg3:
	case DW_OP_breg4:
	case DW_OP_breg5:
	case DW_OP_breg6:
	case DW_OP_breg7:
	case DW_OP_breg8:
	case DW_OP_breg9:
	case DW_OP_breg10:
	case DW_OP_breg11:
	case DW_OP_breg12:
	case DW_OP_breg13:
	case DW_OP_breg14:
	case DW_OP_breg15:
	case DW_OP_breg16:
	case DW_OP_breg17:
	case DW_OP_breg18:
	case DW_OP_breg19:
	case DW_OP_breg20:
	case DW_OP_breg21:
	case DW_OP_breg22:
	case DW_OP_breg23:
	case DW_OP_breg24:
	case DW_OP_breg25:
	case DW_OP_breg26:
	case DW_OP_breg27:
	case DW_OP_breg28:
	case DW_OP_breg29:
	case DW_OP_breg30:
	case DW_OP_breg31:
	  {
	    char *buf = (char*) alloca (MAX_REGISTER_RAW_SIZE);
	    
	    op_ptr = read_sleb128 (op_ptr, &offset);
	    get_saved_register (buf, NULL, NULL, frame, op - DW_OP_breg0, NULL);
	    result = extract_address (buf, REGISTER_RAW_SIZE (op - DW_OP_breg0));
	    result += offset;
	  }
	  break;
	case DW_OP_bregx:
	  {
	    char *buf = (char *) alloca (MAX_REGISTER_RAW_SIZE);
	   
	    op_ptr = read_uleb128 (op_ptr, &reg);
	    op_ptr = read_sleb128 (op_ptr, &offset);
	    get_saved_register (buf, NULL, NULL, frame, reg, NULL);
	    result = extract_address (buf, REGISTER_RAW_SIZE (reg));
	    result += offset;
	  }
	  break;
	case DW_OP_fbreg:
	  {
	    struct symbol *framefunc;
	    unsigned char *datastart;
	    unsigned char *dataend;
	    struct dwarf_block *theblock;

	    framefunc = get_frame_function (frame);
	    op_ptr = read_sleb128 (op_ptr, &offset);
	    theblock = SYMBOL_FRAME_LOC_EXPR (framefunc);
	    datastart = theblock->data;
	    dataend = theblock->data + theblock->size;
	    result = execute_stack_op (var, datastart, dataend, frame, 0, NULL) + offset; 
	  }
	  break;
	case DW_OP_dup:
	  if (stack_elt < 1)
	    abort ();
	  result = stack[stack_elt - 1];
	  break;
	  
	case DW_OP_drop:
	  if (--stack_elt < 0)
	    abort ();
	  goto no_push;
	  
	case DW_OP_pick:
	  offset = *op_ptr++;
	  if (offset >= stack_elt - 1)
	    abort ();
	  result = stack[stack_elt - 1 - offset];
	  break;
	  
	case DW_OP_over:
	  if (stack_elt < 2)
	    abort ();
	  result = stack[stack_elt - 2];
	  break;

	case DW_OP_rot:
	  {
	    CORE_ADDR t1, t2, t3;
	    
	    if (stack_elt < 3)
	      abort ();
	    t1 = stack[stack_elt - 1];
	    t2 = stack[stack_elt - 2];
	    t3 = stack[stack_elt - 3];
	    stack[stack_elt - 1] = t2;
	    stack[stack_elt - 2] = t3;
	    stack[stack_elt - 3] = t1;
	    goto no_push;
	  }
	  
	case DW_OP_deref:
	case DW_OP_deref_size:
	case DW_OP_abs:
	case DW_OP_neg:
	case DW_OP_not:
	case DW_OP_plus_uconst:
	  /* Unary operations.  */
	  if (--stack_elt < 0)
	    abort ();
	  result = stack[stack_elt];
	  
	  switch (op)
	    {
	    case DW_OP_deref:
	      {
		result = (CORE_ADDR) read_memory_unsigned_integer (result, TARGET_PTR_BIT / TARGET_CHAR_BIT);
	      }
	      break;
	      
	    case DW_OP_deref_size:
	      {
		switch (*op_ptr++)
		  {
		  case 1:
		    result = read_memory_unsigned_integer (result, 1);
		    break;
		  case 2:
		    result = read_memory_unsigned_integer (result, 2);
		    break;
		  case 4:
		    result = read_memory_unsigned_integer (result, 4);
		    break;
		  case 8:
		    result = read_memory_unsigned_integer (result, 8);
		    break;
		  default:
		    abort ();
		  }
	      }
	      break;
	      
	    case DW_OP_abs:
	      if ((LONGEST) result < 0)
		result = -result;
	      break;
	    case DW_OP_neg:
	      result = -result;
	      break;
	    case DW_OP_not:
	      result = ~result;
	      break;
	    case DW_OP_plus_uconst:
	      op_ptr = read_uleb128 (op_ptr, &reg);
	      result += reg;
	      break;
	    }
	  break;
	  
	case DW_OP_and:
	case DW_OP_div:
	case DW_OP_minus:
	case DW_OP_mod:
	case DW_OP_mul:
	case DW_OP_or:
	case DW_OP_plus:
	case DW_OP_shl:
	case DW_OP_shr:
	case DW_OP_shra:
	case DW_OP_xor:
	case DW_OP_le:
	case DW_OP_ge:
	case DW_OP_eq:
	case DW_OP_lt:
	case DW_OP_gt:
	case DW_OP_ne:
	  {
	    /* Binary operations.  */
	    CORE_ADDR first, second;
	    if ((stack_elt -= 2) < 0)
	      abort ();
	    second = stack[stack_elt];
	    first = stack[stack_elt + 1];
	    
	    switch (op)
	      {
	      case DW_OP_and:
		result = second & first;
		break;
	      case DW_OP_div:
		result = (LONGEST)second / (LONGEST)first;
		break;
	      case DW_OP_minus:
		result = second - first;
		break;
	      case DW_OP_mod:
		result = (LONGEST)second % (LONGEST)first;
		break;
	      case DW_OP_mul:
		result = second * first;
		break;
	      case DW_OP_or:
		result = second | first;
		break;
	      case DW_OP_plus:
		result = second + first;
		break;
	      case DW_OP_shl:
		result = second << first;
		break;
	      case DW_OP_shr:
		result = second >> first;
		break;
	      case DW_OP_shra:
		result = (LONGEST)second >> first;
		break;
	      case DW_OP_xor:
		result = second ^ first;
		break;
	      case DW_OP_le:
		result = (LONGEST)first <= (LONGEST)second;
		break;
	      case DW_OP_ge:
		result = (LONGEST)first >= (LONGEST)second;
		break;
	      case DW_OP_eq:
		result = (LONGEST)first == (LONGEST)second;
		break;
	      case DW_OP_lt:
		result = (LONGEST)first < (LONGEST)second;
		break;
	      case DW_OP_gt:
		result = (LONGEST)first > (LONGEST)second;
		break;
	      case DW_OP_ne:
		result = (LONGEST)first != (LONGEST)second;
		break;
	      }
	  }
	  break;
	  
	case DW_OP_skip:
	  offset = extract_signed_integer (op_ptr, 2);
	  op_ptr += 2;
	  op_ptr += offset;
	  goto no_push;
	  
	case DW_OP_bra:
	  if (--stack_elt < 0)
	    abort ();
	  offset = extract_signed_integer (op_ptr, 2);
	  op_ptr += 2;
	  if (stack[stack_elt] != 0)
	    op_ptr += offset;
	  goto no_push;
	  
	case DW_OP_nop:
	  goto no_push;
	  
	default:
	  abort ();
	}
      
      /* Most things push a result value.  STACK_ELT always indexes the
	 first free slot, so the top of the stack is at STACK_ELT - 1.  */
      if ((size_t) stack_elt >= sizeof(stack)/sizeof(*stack))
	abort ();
      stack[stack_elt++] = result;
    no_push:;
    }
  
  /* We were executing this program to get a value.  It should be
     at top of stack.  */
  if (stack_elt - 1 < 0)
    abort ();
  return stack[stack_elt - 1];
}

/* Evaluate a location description, given in THEBLOCK, in the
   context of frame FRAME. */
value_ptr 
evaluate_loc_desc (struct symbol *var, struct frame_info *frame, 
			 struct dwarf_block *theblock, struct type *type)
{
  CORE_ADDR result;
  value_ptr retval;
  retval = allocate_value(type);
  VALUE_LVAL (retval) = lval_memory;
  VALUE_BFD_SECTION (retval) = SYMBOL_BFD_SECTION (var);
  result = execute_stack_op (var, theblock->data, theblock->data+theblock->size, frame, 0, &retval);
  if (VALUE_LVAL (retval) == lval_memory)
    {
      VALUE_LAZY (retval) = 1;
      VALUE_ADDRESS (retval) = result;
    }
  return retval;
}
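
For anyone who wants to check the LEB128 decoding outside of GDB, here
is a standalone sketch of the same two readers using plain C99 types. The
function names are hypothetical, and uint64_t/int64_t stand in for
ULONGEST/LONGEST, which is an assumption about their width.

```c
#include <stdint.h>

/* Decode the unsigned LEB128 value at BUF into *R and return the
   address just past it.  Bits beyond 64 are silently dropped.  */
static const unsigned char *
uleb128_decode (const unsigned char *buf, uint64_t *r)
{
  uint64_t result = 0;
  unsigned shift = 0;
  unsigned char byte;

  do
    {
      byte = *buf++;
      if (shift < 64)
        result |= (uint64_t) (byte & 0x7f) << shift;
      shift += 7;
    }
  while (byte & 0x80);
  *r = result;
  return buf;
}

/* Decode the signed LEB128 value at BUF into *R and return the
   address just past it.  Bit 6 of the last byte is the sign bit.  */
static const unsigned char *
sleb128_decode (const unsigned char *buf, int64_t *r)
{
  uint64_t result = 0;
  unsigned shift = 0;
  unsigned char byte;

  do
    {
      byte = *buf++;
      if (shift < 64)
        result |= (uint64_t) (byte & 0x7f) << shift;
      shift += 7;
    }
  while (byte & 0x80);
  /* Sign-extend if the value did not fill all 64 bits.  */
  if (shift < 64 && (byte & 0x40))
    result |= ~(uint64_t) 0 << shift;
  *r = (int64_t) result;
  return buf;
}
```

For example, the bytes e5 8e 26 decode as unsigned 624485, c0 bb 78
decode as signed -123456, and the single byte 6c decodes as signed -20,
so "DW_OP_fbreg -20" is just the two bytes 91 6c.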



-- 
"What do batteries run on?"
-Steven Wright

