Bug in POSIX.2 regex word boundary matching

Corinna Vinschen corinna-cygwin@cygwin.com
Tue Mar 14 10:35:00 GMT 2006


On Mar 14 19:37, dominique.pelle@free.fr wrote:
> Bleah. #include statements were missing in my
> previously posted sample test case.  Here
> is the test case again with #include statements
> this time:
> 
> $ cat regex-bug.c
> 
> #include <stdio.h>
> #include <regex.h>
> #include <stdlib.h>
> 
> int main(void)
> {
>   regex_t    r;
>   regmatch_t pmatch[2];
> 
>   if (regcomp(&r, "\\bfoobar\\b", REG_EXTENDED) != 0) {
>     fprintf(stderr, "regcomp failed\n");
>     exit(EXIT_FAILURE);
>   }
> 
>   /* I'd expect above regex to match following string */
>   if (regexec(&r, "test foobar test", 2, pmatch, 0) == 0) {
>     fprintf(stderr, "OK (match)\n");  /* expected behavior */
>   } else {
>     fprintf(stderr, "FAIL (mismatch)\n"); /* unexpected!? */
>   }
>   return 0;
> }
> 
> $ gcc regex-bug.c
> $ ./a.out
> 
> Outcome on Cygwin ................ FAIL (mismatch)
> 
> Outcome on Linux (Ubuntu-5.10) ... OK (match)

Linux uses glibc's GNU regex library, which supports extensions known
from Perl, such as \b and \w.  Cygwin's regex is Henry Spencer's
implementation, which does not know these extensions.  Note that the
POSIX standard for regular expressions does not include these
extensions; see
http://www.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap09.html
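
If the test has to behave the same on both systems, one option is to
spell out the word boundary using only POSIX ERE syntax.  The following
is a minimal sketch, not a drop-in replacement for \b: the helper name
contains_foobar_word and the macro FOOBAR_WORD_RE are made up for this
example, and "word character" is assumed to mean [[:alnum:]_].

```c
#include <sys/types.h>
#include <stddef.h>
#include <regex.h>

/* Stand-in for "\\bfoobar\\b" using only POSIX ERE constructs: the word
   must be preceded by the start of the string or a non-word character,
   and followed by a non-word character or the end of the string.
   Assumption: a "word character" is [[:alnum:]_]. */
#define FOOBAR_WORD_RE "(^|[^[:alnum:]_])foobar([^[:alnum:]_]|$)"

/* Hypothetical helper, not part of any library.
   Returns 1 on match, 0 on mismatch, -1 if regcomp fails. */
int contains_foobar_word(const char *s)
{
  regex_t r;
  int     rc;

  /* REG_NOSUB: only a yes/no answer is needed, no match offsets. */
  if (regcomp(&r, FOOBAR_WORD_RE, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;

  rc = regexec(&r, s, 0, NULL, 0);
  regfree(&r);
  return rc == 0 ? 1 : 0;
}
```

Calling contains_foobar_word("test foobar test") should report a match
with either library.  Unlike \b, this pattern consumes the neighbouring
character, so if match offsets are needed the word itself would have to
go in its own parenthesized subexpression.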


Corinna

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat
