This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Fix sleb128 encoding for difference expressions


Hi,

This is a more conservative version of the patch, which *only* affects
assembly of .sleb128 directives.

On Mon, 12 Nov 2012 08:01:48 +0000
Jan Beulich <JBeulich@suse.com> wrote:

> >>> On 09.11.12 at 18:57, Julian Brown <julian@codesourcery.com>
> >>> wrote:
> > On Thu, 8 Nov 2012 14:02:19 +0000 Jan Beulich <JBeulich@suse.com>
> > wrote:
> >> >>> On 08.11.12 at 14:05, Julian Brown <julian@codesourcery.com>
> >> I could see your change being correct for the <symbol> -
> >> <symbol> case, but certainly not for <constant> - <constant>
> >> (<symbol> - <constant> being questionable).
> > 
> > I'm not sure I understand this comment. Note that <constant> -
> > <constant> is just as broken at the moment for .sleb128...
> 
> It may be as broken, but that's not my point. The point is that,
> without type information, there's no way for you to guess whether
> a difference between constants in particular (but possibly not
> limited to this) is meant to be signed or unsigned, as you can't
> guess whether or not the programmer intended to make use of
> wraparound (I assume you realize that one can detect the
> width the assembler uses for calculations, and adjust code - e.g.
> through macros - accordingly).

I see. I understand your concern, and agree that breaking wraparound
semantics for most assembled constants would be a bad thing -- but I'd
strongly contest that the current semantics for overflowed numbers
with .sleb128 are useful for anything at all, and (at least this new
version of) my patch doesn't affect anything else.

> So before proposing any solutions, I think you first need to
> propose a scheme how things are intended to work. That's
> particularly being complicated by the fact that .sleb128 can
> encode arbitrarily large numbers, whereas calculations are
> being done at fixed widths (i.e. you specifically can't use the top
> bit of a number to tell its signed-ness - that's why I believe the
> X_unsigned flag got introduced in the first place).

The semantics introduced by my patch are quite simple in fact:

* Most assembled values (apart from the arguments of .sleb128
  directives) are left alone.

* Expressions involving + and - are evaluated using an extra bit:
  signedness or unsignedness is irrelevant at this stage, since we're
  using two's-complement binary, so only the interpretation of the bits
  matters, not the operations themselves [*].

* Operands (symbols, constants) are positive word-sized values (i.e.
  they have the "extra bit" clear).

* Unary negation operates on all bits of the value, including the extra
  bit.

* Arguments of .sleb128 directives are always sign-extended, using the
  new extra bit (i.e. the "word size + 1 bit" value is sign-extended).

Ideally .sleb128 expressions would be evaluated in arbitrary precision,
but that's just not practical as the code is currently written, and
would probably be overkill in general anyway. My patch, providing just a
little more precision where necessary, ought to cover all useful cases.

I understand that the X_unsigned flag was also intended to solve this
problem, but it clearly doesn't -- I'm sure you've seen the comment:

     FIXME: This field is not set very reliably.  */
  unsigned int X_unsigned : 1;

in fact the only thing which sets (clears it, rather) it is unary
negation. That works (for .sleb128) in the simple case,

  .sleb128 -(1<<63)

but not for any of the cases previously discussed, nor for e.g.:

  .sleb128 --(1<<63)

Nevertheless, any current code which relies on the current strange
behaviour of X_unsigned will continue to do the same thing (though not
for .sleb128 directives, of course).

> And if this scheme involves _any_ changes to the generic
> expression evaluation code, you would also have to make clear
> why that would be correct (specifically not causing regressions)
> for all other contexts using that code.

The behaviour of the X_unsigned flag and all of the bits of
X_add_number are unchanged -- quite deliberately, so that the patch is
as conservative as possible. The calculation of X_extrabit can be done
independently, without affecting any of the lower-order bits.

Thanks,

Julian

[*] Other operators (multiplication, bitwise operations, etc.) ignore
the X_extrabit field, as they ignore the X_unsigned field at present --
so it's probably still possible to construct expressions which behave
strangely with .sleb128 with my patch applied, using those operators.
The patch covers the actual real-world case we encountered, though.

ChangeLog

    gas/
    * read.c (convert_to_bignum): Add sign parameter. Use it
    instead of X_unsigned to determine sign of resulting bignum.
    (emit_expr): Pass extra argument to convert_to_bignum.
    (emit_leb128_expr): Use X_extrabit instead of X_unsigned. Pass
    X_extrabit to convert_to_bignum.
    (parse_bitfield_cons): Set X_extrabit.
    * expr.c (make_expr_symbol, expr_build_uconstant, operand):
    Initialise X_extrabit field as appropriate.
    (add_to_result): New.
    (subtract_from_result): New.
    (expr): Use above.
    * expr.h (expressionS): Add X_extrabit field.

    gas/testsuite/
    * gas/all/sleb128-2.s: New test.
    * gas/all/sleb128-3.s: Likewise.
    * gas/all/sleb128-4.s: Likewise.
    * gas/all/sleb128-5.s: Likewise.
    * gas/all/sleb128-6.s: Likewise.
    * gas/all/sleb128-7.s: Likewise.
    * gas/all/sleb128-2.d: New.
    * gas/all/sleb128-3.d: New.
    * gas/all/sleb123-4.d: New.
    * gas/all/sleb123-5.d: New.
    * gas/all/sleb123-6.d: New.
    * gas/all/sleb123-7.d: New.
    * gas/all/gas.exp (sleb128-2, sleb128-3, sleb128-4, sleb128-5)
    (sleb128-6, sleb128-7): Run new tests.
Index: gas/read.c
===================================================================
--- gas/read.c	(revision 394269)
+++ gas/read.c	(working copy)
@@ -1312,10 +1312,10 @@ read_a_source_file (char *name)
 }
 
 /* Convert O_constant expression EXP into the equivalent O_big representation.
-   Take the sign of the number from X_unsigned rather than X_add_number.  */
+   Take the sign of the number from SIGN rather than X_add_number.  */
 
 static void
-convert_to_bignum (expressionS *exp)
+convert_to_bignum (expressionS *exp, int sign)
 {
   valueT value;
   unsigned int i;
@@ -1328,8 +1328,8 @@ convert_to_bignum (expressionS *exp)
     }
   /* Add a sequence of sign bits if the top bit of X_add_number is not
      the sign of the original value.  */
-  if ((exp->X_add_number < 0) != !exp->X_unsigned)
-    generic_bignum[i++] = exp->X_unsigned ? 0 : LITTLENUM_MASK;
+  if ((exp->X_add_number < 0) == !sign)
+    generic_bignum[i++] = sign ? LITTLENUM_MASK : 0;
   exp->X_op = O_big;
   exp->X_add_number = i;
 }
@@ -4390,7 +4390,7 @@ emit_expr (expressionS *exp, unsigned in
   if (op == O_constant && nbytes > sizeof (valueT))
     {
       extra_digit = exp->X_unsigned ? 0 : -1;
-      convert_to_bignum (exp);
+      convert_to_bignum (exp, !exp->X_unsigned);
       op = O_big;
     }
 
@@ -4684,6 +4684,7 @@ parse_bitfield_cons (exp, nbytes)
       exp->X_add_number = value;
       exp->X_op = O_constant;
       exp->X_unsigned = 1;
+      exp->X_extrabit = 0;
     }
 }
 
@@ -5243,12 +5244,12 @@ emit_leb128_expr (expressionS *exp, int 
     }
   else if (op == O_constant
 	   && sign
-	   && (exp->X_add_number < 0) != !exp->X_unsigned)
+	   && (exp->X_add_number < 0) == !exp->X_extrabit)
     {
       /* We're outputting a signed leb128 and the sign of X_add_number
 	 doesn't reflect the sign of the original value.  Convert EXP
 	 to a correctly-extended bignum instead.  */
-      convert_to_bignum (exp);
+      convert_to_bignum (exp, exp->X_extrabit);
       op = O_big;
     }
 
Index: gas/testsuite/gas/all/sleb128-3.d
===================================================================
--- gas/testsuite/gas/all/sleb128-3.d	(revision 0)
+++ gas/testsuite/gas/all/sleb128-3.d	(revision 0)
@@ -0,0 +1,7 @@
+#objdump : -s -j .data
+#name : .sleb128 tests (3)
+
+.*: .*
+
+Contents of section \.data:
+ 0000 9c7f2a00 00000000 00000000 00000000  .*
Index: gas/testsuite/gas/all/sleb128-5.d
===================================================================
--- gas/testsuite/gas/all/sleb128-5.d	(revision 0)
+++ gas/testsuite/gas/all/sleb128-5.d	(revision 0)
@@ -0,0 +1,7 @@
+#objdump : -s -j .data
+#name : .sleb128 tests (5)
+
+.*: .*
+
+Contents of section \.data:
+ 0000 012a0000 00000000 00000000 00000000  .*
Index: gas/testsuite/gas/all/gas.exp
===================================================================
--- gas/testsuite/gas/all/gas.exp	(revision 394269)
+++ gas/testsuite/gas/all/gas.exp	(working copy)
@@ -352,6 +352,12 @@ if { ![istarget "bfin-*-*"] } then {
     run_dump_test assign
 }
 run_dump_test sleb128
+run_dump_test sleb128-2
+run_dump_test sleb128-3
+run_dump_test sleb128-4
+run_dump_test sleb128-5
+run_dump_test sleb128-6
+run_dump_test sleb128-7
 
 # .byte is 32 bits on tic4x, and .p2align isn't supported on tic54x
 # .space is different on hppa*-hpux.
Index: gas/testsuite/gas/all/sleb128-7.d
===================================================================
--- gas/testsuite/gas/all/sleb128-7.d	(revision 0)
+++ gas/testsuite/gas/all/sleb128-7.d	(revision 0)
@@ -0,0 +1,7 @@
+#objdump : -s -j .data
+#name : .sleb128 tests (7)
+
+.*: .*
+
+Contents of section \.data:
+ 0000 cb012ac5 012acb01 2ac5012a 00000000  .*
Index: gas/testsuite/gas/all/sleb128-3.s
===================================================================
--- gas/testsuite/gas/all/sleb128-3.s	(revision 0)
+++ gas/testsuite/gas/all/sleb128-3.s	(revision 0)
@@ -0,0 +1,4 @@
+.data
+bar:
+.sleb128 100 - 200
+.byte 42
Index: gas/testsuite/gas/all/sleb128-5.s
===================================================================
--- gas/testsuite/gas/all/sleb128-5.s	(revision 0)
+++ gas/testsuite/gas/all/sleb128-5.s	(revision 0)
@@ -0,0 +1,13 @@
+.text
+.globl foo
+foo:
+.L1:
+.byte 0
+.byte 0
+.byte 0
+.L2:
+
+.data
+bar:
+.sleb128 .L1 - .L2 + 4
+.byte 42
Index: gas/testsuite/gas/all/sleb128-7.s
===================================================================
--- gas/testsuite/gas/all/sleb128-7.s	(revision 0)
+++ gas/testsuite/gas/all/sleb128-7.s	(revision 0)
@@ -0,0 +1,19 @@
+.text
+.globl foo
+foo:
+.L1:
+.byte 0
+.byte 0
+.byte 0
+.L2:
+
+.data
+bar:
+.sleb128 200+(.L2 - .L1)
+.byte 42
+.sleb128 200+(.L1 - .L2)
+.byte 42
+.sleb128 (.L2 - .L1)+200
+.byte 42
+.sleb128 (.L1 - .L2)+200
+.byte 42
Index: gas/testsuite/gas/all/sleb128-2.d
===================================================================
--- gas/testsuite/gas/all/sleb128-2.d	(revision 0)
+++ gas/testsuite/gas/all/sleb128-2.d	(revision 0)
@@ -0,0 +1,7 @@
+#objdump : -s -j .data
+#name : .sleb128 tests (2)
+
+.*: .*
+
+Contents of section \.data:
+ 0000 7d2a0000 00000000 00000000 00000000  .*
Index: gas/testsuite/gas/all/sleb128-4.d
===================================================================
--- gas/testsuite/gas/all/sleb128-4.d	(revision 0)
+++ gas/testsuite/gas/all/sleb128-4.d	(revision 0)
@@ -0,0 +1,7 @@
+#objdump : -s -j .data
+#name : .sleb128 tests (4)
+
+.*: .*
+
+Contents of section \.data:
+ 0000 83808080 082a0000 00000000 00000000  .*
Index: gas/testsuite/gas/all/sleb128-6.d
===================================================================
--- gas/testsuite/gas/all/sleb128-6.d	(revision 0)
+++ gas/testsuite/gas/all/sleb128-6.d	(revision 0)
@@ -0,0 +1,9 @@
+#objdump : -s -j .data
+#name : .sleb128 tests (6)
+
+.*: .*
+
+Contents of section \.data:
+ 0000 83808080 80808080 80012afd ffffffff  .*
+ 0010 ffffffff 002a8380 80808080 80808001  .*
+ 0020 2afdffff ffffffff ffff002a 00000000  .*
Index: gas/testsuite/gas/all/sleb128-2.s
===================================================================
--- gas/testsuite/gas/all/sleb128-2.s	(revision 0)
+++ gas/testsuite/gas/all/sleb128-2.s	(revision 0)
@@ -0,0 +1,13 @@
+.text
+.globl foo
+foo:
+.L1:
+.byte 0
+.byte 0
+.byte 0
+.L2:
+
+.data
+bar:
+.sleb128 .L1 - .L2
+.byte 42
Index: gas/testsuite/gas/all/sleb128-4.s
===================================================================
--- gas/testsuite/gas/all/sleb128-4.s	(revision 0)
+++ gas/testsuite/gas/all/sleb128-4.s	(revision 0)
@@ -0,0 +1,13 @@
+.text
+.globl foo
+foo:
+.L1:
+.byte 0
+.byte 0
+.byte 0
+.L2:
+
+.data
+bar:
+.sleb128 .L2 - .L1 + (1 << 31)
+.byte 42
Index: gas/testsuite/gas/all/sleb128-6.s
===================================================================
--- gas/testsuite/gas/all/sleb128-6.s	(revision 0)
+++ gas/testsuite/gas/all/sleb128-6.s	(revision 0)
@@ -0,0 +1,19 @@
+.text
+.globl foo
+foo:
+.L1:
+.byte 0
+.byte 0
+.byte 0
+.L2:
+
+.data
+bar:
+.sleb128 1<<63+(.L2 - .L1)
+.byte 42
+.sleb128 1<<63+(.L1 - .L2)
+.byte 42
+.sleb128 (.L2 - .L1)+1<<63
+.byte 42
+.sleb128 (.L1 - .L2)+1<<63
+.byte 42
Index: gas/expr.c
===================================================================
--- gas/expr.c	(revision 394269)
+++ gas/expr.c	(working copy)
@@ -90,6 +90,7 @@ make_expr_symbol (expressionS *expressio
       zero.X_op = O_constant;
       zero.X_add_number = 0;
       zero.X_unsigned = 0;
+      zero.X_extrabit = 0;
       clean_up_expression (&zero);
       expressionP = &zero;
     }
@@ -161,6 +162,7 @@ expr_build_uconstant (offsetT value)
   e.X_op = O_constant;
   e.X_add_number = value;
   e.X_unsigned = 1;
+  e.X_extrabit = 0;
   return make_expr_symbol (&e);
 }
 
@@ -732,6 +734,7 @@ operand (expressionS *expressionP, enum 
      something like ``.quad 0x80000000'' is not sign extended even
      though it appears negative if valueT is 32 bits.  */
   expressionP->X_unsigned = 1;
+  expressionP->X_extrabit = 0;
 
   /* Digits, assume it is a bignum.  */
 
@@ -1026,6 +1029,8 @@ operand (expressionS *expressionP, enum 
 		   This is compatible with other people's
 		   assemblers.  Sigh.  */
 		expressionP->X_unsigned = 0;
+		if (expressionP->X_add_number)
+		  expressionP->X_extrabit ^= 1;
 	      }
 	    else if (c == '~' || c == '"')
 	      expressionP->X_add_number = ~ expressionP->X_add_number;
@@ -1078,6 +1083,7 @@ operand (expressionS *expressionP, enum 
 		expressionP->X_add_number = i >= expressionP->X_add_number;
 		expressionP->X_op = O_constant;
 		expressionP->X_unsigned = 1;
+		expressionP->X_extrabit = 0;
 	      }
 	  }
 	else if (expressionP->X_op != O_illegal
@@ -1717,6 +1723,42 @@ operatorf (int *num_chars)
   /* NOTREACHED  */
 }
 
+/* Implement "word-size + 1 bit" addition for
+   {resultP->X_extrabit:resultP->X_add_number} + {rhs_highbit:amount}.  This
+   is used so that the full range of unsigned word values and the full range of
+   signed word values can be represented in an O_constant expression, which is
+   useful e.g. for .sleb128 directives.  */
+
+static void
+add_to_result (expressionS *resultP, offsetT amount, int rhs_highbit)
+{
+  valueT ures = resultP->X_add_number;
+  valueT uamount = amount;
+
+  resultP->X_add_number += amount;
+
+  resultP->X_extrabit ^= rhs_highbit;
+
+  if (ures + uamount < ures)
+    resultP->X_extrabit ^= 1;
+}
+
+/* Similarly, for subtraction.  */
+
+static void
+subtract_from_result (expressionS *resultP, offsetT amount, int rhs_highbit)
+{
+  valueT ures = resultP->X_add_number;
+  valueT uamount = amount;
+
+  resultP->X_add_number -= amount;
+
+  resultP->X_extrabit ^= rhs_highbit;
+
+  if (ures < uamount)
+    resultP->X_extrabit ^= 1;
+}
+
 /* Parse an expression.  */
 
 segT
@@ -1829,7 +1871,7 @@ expr (int rankarg,		/* Larger # is highe
 	  && (md_register_arithmetic || resultP->X_op != O_register))
 	{
 	  /* X + constant.  */
-	  resultP->X_add_number += right.X_add_number;
+	  add_to_result (resultP, right.X_add_number, right.X_extrabit);
 	}
       /* This case comes up in PIC code.  */
       else if (op_left == O_subtract
@@ -1847,10 +1889,11 @@ expr (int rankarg,		/* Larger # is highe
 				       symbol_get_frag (right.X_add_symbol),
 				       &frag_off))
 	{
-	  resultP->X_add_number -= right.X_add_number;
-	  resultP->X_add_number -= frag_off / OCTETS_PER_BYTE;
-	  resultP->X_add_number += (S_GET_VALUE (resultP->X_add_symbol)
-				    - S_GET_VALUE (right.X_add_symbol));
+	  offsetT symval_diff = S_GET_VALUE (resultP->X_add_symbol)
+				- S_GET_VALUE (right.X_add_symbol);
+	  subtract_from_result (resultP, right.X_add_number, right.X_extrabit);
+	  subtract_from_result (resultP, frag_off / OCTETS_PER_BYTE, 0);
+	  add_to_result (resultP, symval_diff, symval_diff < 0);
 	  resultP->X_op = O_constant;
 	  resultP->X_add_symbol = 0;
 	}
@@ -1858,7 +1901,7 @@ expr (int rankarg,		/* Larger # is highe
 	       && (md_register_arithmetic || resultP->X_op != O_register))
 	{
 	  /* X - constant.  */
-	  resultP->X_add_number -= right.X_add_number;
+	  subtract_from_result (resultP, right.X_add_number, right.X_extrabit);
 	}
       else if (op_left == O_add && resultP->X_op == O_constant
 	       && (md_register_arithmetic || right.X_op != O_register))
@@ -1867,7 +1910,7 @@ expr (int rankarg,		/* Larger # is highe
 	  resultP->X_op = right.X_op;
 	  resultP->X_add_symbol = right.X_add_symbol;
 	  resultP->X_op_symbol = right.X_op_symbol;
-	  resultP->X_add_number += right.X_add_number;
+	  add_to_result (resultP, right.X_add_number, right.X_extrabit);
 	  retval = rightseg;
 	}
       else if (resultP->X_op == O_constant && right.X_op == O_constant)
@@ -1907,7 +1950,9 @@ expr (int rankarg,		/* Larger # is highe
 	      /* Constant + constant (O_add) is handled by the
 		 previous if statement for constant + X, so is omitted
 		 here.  */
-	    case O_subtract:		resultP->X_add_number -= v; break;
+	    case O_subtract:
+	      subtract_from_result (resultP, v, 0);
+	      break;
 	    case O_eq:
 	      resultP->X_add_number =
 		resultP->X_add_number == v ? ~ (offsetT) 0 : 0;
@@ -1951,10 +1996,11 @@ expr (int rankarg,		/* Larger # is highe
 	  resultP->X_op = op_left;
 	  resultP->X_op_symbol = right.X_add_symbol;
 	  if (op_left == O_add)
-	    resultP->X_add_number += right.X_add_number;
+	    add_to_result (resultP, right.X_add_number, right.X_extrabit);
 	  else if (op_left == O_subtract)
 	    {
-	      resultP->X_add_number -= right.X_add_number;
+	      subtract_from_result (resultP, right.X_add_number,
+				    right.X_extrabit);
 	      if (retval == rightseg
 		  && SEG_NORMAL (retval)
 		  && !S_FORCE_RELOC (resultP->X_add_symbol, 0)
@@ -1974,6 +2020,7 @@ expr (int rankarg,		/* Larger # is highe
 	  resultP->X_op = op_left;
 	  resultP->X_add_number = 0;
 	  resultP->X_unsigned = 1;
+	  resultP->X_extrabit = 0;
 	}
 
       if (retval != rightseg)
Index: gas/expr.h
===================================================================
--- gas/expr.h	(revision 394269)
+++ gas/expr.h	(working copy)
@@ -136,6 +136,11 @@ typedef struct expressionS {
      when performing arithmetic on these values).
      FIXME: This field is not set very reliably.  */
   unsigned int X_unsigned : 1;
+  /* This is used to implement "word size + 1 bit" arithmetic, so that e.g.
+     expressions used with .sleb128 directives can use the full range available
+     for an unsigned word, but can also properly represent all values of a
+     signed word.  */
+  unsigned int X_extrabit : 1;
 
   /* 7 additional bits can be defined if needed.  */
 

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]