This is the mail archive of the cygwin-patches mailing list for the Cygwin project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: memmem issues


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

According to Eric Blake on 12/19/2007 2:22 PM:
> memmem isn't standardized, but even so, it has some bugs based on comparison 
> with strstr.
> 
> memmem(haystack,len,"",0) should return haystack, not NULL.  This should be an 
> easy one-line fix.
> 
> memmem currently has O(m*n) worst-case complexity (quadratic, when haystack and 
> needle are both long).  But with the Knuth-Morris-Pratt algorithm and a memory 
> allocation of size n, this could be trimmed to O(m+n) (linear).

Here's a patch:

2007-12-20  Eric Blake  <ebb9@byu.net>

	* libc/memmem.cc (memmem): Fix bug when searching for empty
	string.  Document O(n^2) bug.

- --
Don't work too hard, make some time for fun as well!

Eric Blake             ebb9@byu.net
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHaekM84KuGfSFAYARArgXAJ9lz/Rk3GnD4i5q4Jy557TzVQoGoQCfb61v
4TUb18LiAQIdCasBx9nwFdw=
=5Vbo
-----END PGP SIGNATURE-----
Index: libc/memmem.cc
===================================================================
RCS file: /cvs/src/src/winsup/cygwin/libc/memmem.cc,v
retrieving revision 1.1
diff -u -p -r1.1 memmem.cc
--- libc/memmem.cc	8 Nov 2005 22:08:39 -0000	1.1
+++ libc/memmem.cc	20 Dec 2007 03:56:35 -0000
@@ -45,8 +45,8 @@ memmem (const void *l, size_t l_len,
   const char *cs = (const char *)s;
 
   /* we need something to compare */
-  if (l_len == 0 || s_len == 0)
-    return NULL;
+  if (s_len == 0)
+    return l;
 
   /* "s" must be smaller or equal to "l" */
   if (l_len < s_len)
@@ -59,6 +59,11 @@ memmem (const void *l, size_t l_len,
   /* the last position where its possible to find "s" in "l" */
   last = (char *) cl + l_len - s_len;
 
+  /* FIXME - this algorithm is worst-case O(l_len*s_len), but using
+     Knuth-Morris-Pratt would be O(l_len+s_len) at the expense of a
+     memory allocation of s_len bytes.  Consider rewriting this to
+     swap over the KMP if the first few iterations fail, and back to
+     this if KMP can't allocate enough memory.  */
   for (cur = (char *) cl; cur <= last; cur++)
     if (cur[0] == cs[0] && memcmp (cur, cs, s_len) == 0)
       return cur;

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]