This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

[RFC] Saturating using arithmetic in address calculation.

From: OndÅej BÃlka <neleai at seznam dot cz>
To: Paul Eggert <eggert at cs dot ucla dot edu>
Cc: "Joseph S. Myers" <joseph at codesourcery dot com>, libc-alpha at sourceware dot org
Date: Sat, 26 Oct 2013 17:07:55 +0200
Subject: [RFC] Saturating using arithmetic in address calculation.
Authentication-results: sourceware.org; auth=none
References: <20131025084008 dot GA17443 at domone> <526A319B dot 80909 at suse dot com> <Pine dot LNX dot 4 dot 64 dot 1310251530510 dot 12877 at digraph dot polyomino dot org dot uk> <526A92D7 dot 6040501 at cs dot ucla dot edu>

On Fri, Oct 25, 2013 at 08:48:39AM -0700, Paul Eggert wrote:
> Joseph S. Myers wrote:
> > Note that, for efficiency, in such an (inline / macro) wrapper you want to 
> > distinguish the argument counting the number of elements (not constant) 
> > and the argument counting the size of an element (almost always constant, 
> > so SIZE_MAX / size gets folded to a constant) - and make sure each call 
> > passes the arguments in the right order, which may not always be the order 
> > the existing multiplication is written in.
> 
> An example of exactly that is the xnmalloc function in gnulib's lib/xalloc.h,
> which uses the macro xalloc_oversized(N, S) to detect arithmetic overflow
> efficiently when allocating an array of N objects, each of size S.
> Please feel free to adapt or copy that code into glibc.
> 
> http://git.savannah.gnu.org/cgit/gnulib.git/tree/lib/xalloc.h
> http://git.savannah.gnu.org/cgit/gnulib.git/tree/lib/xalloc-oversized.h

After pondering that a bit I come to that what is really needed is
computing size in saturated arithmetic. Overhead is only a jump on
overflow so we save at least one comparison instruction.

Also this would handle malloc (x * y + z) 

Also there would be less branching needed as in overflow case we call
malloc (-1) which fails.

For that performance a assembly version/gcc builtin would be needed. A
portable implementation would have comparable performance with manual
checks.

Example how would look like is this:

diff --git a/string/strcoll_l.c b/string/strcoll_l.c
index 4ee101a..76d0d9f 100644
--- a/string/strcoll_l.c
+++ b/string/strcoll_l.c
@@ -524,17 +524,11 @@ STRCOLL (const STRING_TYPE *s1, const STRING_TYPE *s2, __locale_t l)
   memset (&seq1, 0, sizeof (seq1));
   seq2 = seq1;
 
-  size_t size_max = SIZE_MAX / (sizeof (int32_t) + 1);
+  size_t size_needed = mul_s (s1len + s2len, sizeof (int32_t) + 1);
 
-  if (MIN (s1len, s2len) > size_max
-      || MAX (s1len, s2len) > size_max - MIN (s1len, s2len))
+  if (!__libc_use_alloca (size_needed))
     {
-      /* If the strings are long enough to cause overflow in the size request,
-         then skip the allocation and proceed with the non-cached routines.  */
-    }
-  else if (! __libc_use_alloca ((s1len + s2len) * (sizeof (int32_t) + 1)))
-    {
-      seq1.idxarr = (int32_t *) malloc ((s1len + s2len) * (sizeof (int32_t) + 1));
+      seq1.idxarr = (int32_t *) malloc (size_needed);
 
       /* If we failed to allocate memory, we leave everything as NULL so that
 	 we use the nocache version of traversal and comparison functions.  */

Follow-Ups:
- Re: [RFC] Saturating using arithmetic in address calculation.
  - From: Paul Eggert
- Re: [RFC] Saturating using arithmetic in address calculation.
  - From: Joseph S. Myers

References:
- [PATCH] Use calloc (x, y) instead malloc (x * y).
  - From: OndÅej BÃlka
- Re: [PATCH] Use calloc (x, y) instead malloc (x * y).
  - From: Andreas Jaeger
- Re: [PATCH] Use calloc (x, y) instead malloc (x * y).
  - From: Joseph S. Myers
- Re: [PATCH] Use calloc (x, y) instead malloc (x * y).
  - From: Paul Eggert

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]