This is the mail archive of the archer@sourceware.org mailing list for the Archer project.

Re: Calculating array length

From: Jan Kratochvil <jan dot kratochvil at redhat dot com>
To: Joost van der Sluis <joost at cnoc dot nl>
Cc: archer at sourceware dot org
Date: Sun, 7 Jun 2009 19:49:24 +0200
Subject: Re: Calculating array length
References: <1244370173.22994.14.camel@wsjoost> <20090607144745.GA21154@host0.dyn.jankratochvil.net> <1244390772.8081.39.camel@wsjoost>

On Sun, 07 Jun 2009 18:06:12 +0200, Joost van der Sluis wrote:
> My problem is that val_print_array_elements (in valprint.c) derives the
> number of elements by dividing the TYPE_LENGTH(type) by
> TYPE_LENGTH(eltype). (Both lengths are calculated by a call to
> check_typedef which calls type_length_get). This is clearly wrong when
> those lengths are calculated as done above.
> 
> It's solved very easily by removing that calculation and letting the bounds
> determine the size (that code is already there in an else branch).

Yes, I think the right fix is to use DW_AT_upper_bound/DW_AT_lower_bound
instead, which should always work.  Byte sizes should not be involved in
finding out the number of array elements.
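
A minimal sketch of the two ways to count elements, in plain C rather than
actual GDB code.  The function names are made up for illustration, and the
bounds are assumed to have been read already from DW_AT_lower_bound and
DW_AT_upper_bound:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not a GDB function.  Element count derived from
   the bounds alone; an upper bound below the lower bound is treated
   here as an empty array.  */
static size_t
count_from_bounds (long lower, long upper)
{
  if (upper < lower)
    return 0;
  return (size_t) (upper - lower) + 1;
}

/* Hypothetical helper showing the division approach for comparison.
   It is only right when the element's reported length equals the
   spacing between elements.  */
static size_t
count_from_lengths (size_t array_len, size_t elt_len)
{
  assert (elt_len > 0);
  return array_len / elt_len;
}
```

As an invented illustration: for bounds 0..2 the first function gives 3
regardless of sizes, while a 24-byte array whose element type reports
a length of 5 gives 24 / 5 = 4 with the second one.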

> How can I make it to use FULL_SPAN in this case?

There is no need to use TYPE_LENGTH in this code so no need to deal with
FULL_SPAN, right?

> > Subroutine `sub' will print:
> >            2
> >            4
> > Subroutine `sub' knows only about a table with 2 rows and 1 column.  To make
> > it work with the original array `a' memory layout without any copy, the
> > pointers to the array are set up as:
> >   array start: row 1 column 2 (element content `2')
> >   rows, therefore number of elements of p: 2
> >   columns, therefore number of elements of p row: 1
> >   element size of p (one row byte length): sizeof (integer) * 1
> >   element size of p row (one element byte length): sizeof (integer)
> >   byte stride of p (offset to the next row): sizeof (integer) * 2
> >   byte stride of p row: sizeof (integer)
> 
> This is not true. As I understand the DWARF-3 spec, the stride
> defines the size which is used to store the entry in the array, when it
> is not the same as its element's length.
> 
> i.e. it is not the offset to the next row, it is the size of each row. So
> the last entry should also have this size.

IMO there is a contradiction in DWARF.  It talks just about the position:
	DW_AT_byte_stride [...] attribute which specifies the separation
	between successive elements along the dimension
but also about the "allocated size":
	If the amount of storage allocated to hold each element of an object
	of the given array type is different from the amount of storage that
	is normally allocated to hold an individual object of the indicated
	element type, then the array type entry has a DW_AT_bit_stride
	attribute, whose value (see Section 2.19) is the size in bits of each
	element of the array.

As the byte stride is the only DWARF way to express the subwindow of the
array in my Fortran example, under the "allocated size" reading the compiler
would have to allocate unused memory after the end of the array just to make
debugging possible.  Runtime must not be degraded just for debugging purposes.
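
The two readings of the stride can be compared with a small sketch.  This is
plain C for illustration only, not GDB code, and all function names are
hypothetical:

```c
#include <assert.h>
#include <stddef.h>

/* Byte offset of element INDEX (0-based) from the array start when
   successive elements are BYTE_STRIDE bytes apart.  */
static size_t
element_offset (size_t index, size_t byte_stride)
{
  return index * byte_stride;
}

/* "Allocated size" reading: every element, including the last one,
   occupies a whole stride.  */
static size_t
full_span (size_t count, size_t byte_stride)
{
  return count * byte_stride;
}

/* "Separation" reading: the stride only spaces the element starts, so
   the last element needs just its own size.  */
static size_t
minimal_span (size_t count, size_t byte_stride, size_t elt_size)
{
  assert (elt_size <= byte_stride);
  if (count == 0)
    return 0;
  return (count - 1) * byte_stride + elt_size;
}
```

Assuming a 4-byte integer, the `p' view from the Fortran example has 2
elements, a byte stride of 8 and an element size of 4.  The first reading
covers 16 bytes, which includes the X past the end of the array; the second
covers 12 bytes.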

> > If we always used FULL_SPAN true then GDB would transfer in this case
> > a contiguous memory block with content {2,3,4,X}.  But X is after the end of
> > the array and for very large arrays (thousands of elements or elements of size
> > in kilobytes) memory for X may no longer be mapped and GDB would fail
> > retrieving the memory of the variable to be printed.  (+It would also be
> > less efficient.)
> 
> I don't know anything about Fortran, but as far as I can see it has to
> define a new 1-dimensional array with 2 items which is passed to sub.
> Then it has to generate new debug-information which contain the
> information for that array. 

gfortran generates only the new debug information and a new so-called
"descriptor" in memory (*).  The array element data are not copied, as
modifications must remain visible after returning from the callee (and
copying the memory would also be very inefficient).

(*) When the callee accepts variable array bounds such as (:,:).  As I see now,
    I made the example _too_ simplified, so the array got passed as a copy
    instead of sharing the memory through a different descriptor.  Such memory
    sharing can still be seen in `gdb.fortran/dynamic.f90'.

> Consider a (static, so fixed-size) pascal/fpc-array whose elements are
> ansistrings. Ansistrings in Pascal are stored as pointers to an array of
> chars. These ansistrings are in Dwarf-3 defined using the dw_at_location
> attribute to point to the real data in the array.
> 
> So, the length of the Ansistring-type has to return the length of the
> actual stored string. This is _not_ the length of the pointer which
> points to the actual data. But when you calculate the array-size, you
> have to use the size of a pointer. That's why you have to store that
> size in the stride...
> 
> Example: 
> 
> array[0..2] of string;
> 0x00: pointer1 -> 0x534643: 'string 1'      (size=8)
> 0x08: pointer2 -> 0x734644: 'str 2'         (size=5)
> 0x10: pointer3 -> 0x334554: 'long string 3' (size=13)

Understood your explanation, thanks.

I guess there should be a new DWARF attribute, DW_AT_data_byte_size, to be
used in such a case together with DW_AT_byte_size.  DW_AT_data_byte_size would
specify the variable size (8, 5, 13) and DW_AT_byte_size would specify the
descriptor size (8 bytes in this case, assuming you used a 64-bit arch here).

As there is currently no DW_AT_data_byte_size attribute, GDB should use
FULL_SPAN when it is unsure, and apply the non-FULL_SPAN tail optimization (to
avoid accessing unmapped memory for large Fortran arrays) when it cannot do
harm.  The two cases can be told apart: it cannot do harm when the _element_
does not use DW_AT_data_location.
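
A rough sketch of that decision, written as standalone C and not taken from
the VLA patch.  The function and its parameters are hypothetical; in
particular elt_has_data_location stands for "the element type carries
DW_AT_data_location":

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not a GDB function.  Returns how many bytes to
   read from the inferior for COUNT elements placed BYTE_STRIDE apart.  */
static size_t
bytes_to_fetch (size_t count, size_t byte_stride, size_t elt_size,
                bool elt_has_data_location)
{
  if (count == 0)
    return 0;

  /* Unsure case: ELT_SIZE may describe data stored elsewhere (such as
     the string behind a pointer), so read a full stride per element.  */
  if (elt_has_data_location)
    return count * byte_stride;

  /* Safe case: the last element needs only its own size, so nothing
     past the end of the array is read.  */
  assert (elt_size <= byte_stride);
  return (count - 1) * byte_stride + elt_size;
}
```

For the Pascal example (3 elements, stride 8, last string size 13, data
location in use) this gives 24 bytes, i.e. just the pointers.  For the
Fortran `p' view with an assumed 4-byte integer it gives 12 bytes.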

> Second problem is: which data should be copied to the inferior when you
> read this array? Well the answer is simple: only the pointers. So the
> compiler adds the stride-option to the debug-information, and gdb simply
> has to copy count*stride bytes to the inferior.

Yes, as in this case DW_AT_data_location is in effect.

> Thereafter val_print_array_elements has to evaluate each element, by
> setting object-address to the right pointer, and then evaluate the length
> of the string and copy it to the inferior...

Yes, this is described in DWARF for DW_OP_push_object_address
	This object may correspond to [...] a component of an array [...]
	whose address has been dynamically determined by an earlier step
	during user expression evaluation.

> I have this all working,

Would you submit the patch to integrate it into archer-jankratochvil-vla?
Thanks.

> the only problems I still have are with all the
> calls to all sorts of properties of the type, while the object-address is
> pointing to something different, so that the sizes don't match.

I agree it is "unfortunate" that GDB currently does not distinguish much
between the object-address and data-address and uses them interchangeably.

There should be both VALUE_ADDRESS and some "VALUE_DATA_ADDRESS" kept around.

To make the VLA patch maintainable separately there is currently a function
object_address_get_data which is called at the right moment when GDB starts to
deal with the data themselves instead of the value object in general.
object_address_get_data currently switches the VALUE_ADDRESS content+meaning
to that hypothetical VALUE_DATA_ADDRESS.

Thanks,
Jan

