Cygwin bash regexp matching doesn't treat "\b" properly
Eric Blake
ebb9@byu.net
Wed Nov 25 16:06:00 GMT 2009
Dave Korn <dave.korn.cygwin <at> googlemail.com> writes:
>
> $ [[ "foo" =~ [[:\<:]]foo[[:\>:]] ]]; echo $?
> 0
>
> (Note that I had to backslash-escape the < and > there. In other contexts
> that might not be needed.)
But here's something weird with how bash manages quoting inside [[ ]]. If you
add a subexpression, you no longer need to quote < or >:
$ [[ foo =~ ([[:<:]]foo[[:>:]]) ]]; echo $?
0
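One way to sidestep the quoting question altogether is to keep the pattern in a variable: an unquoted variable on the right of =~ is used as the regex, and the [[ ]] parser never sees the < or >. A minimal sketch follows. The helper name matches is made up for illustration, and whether the [[:<:]] pattern itself is accepted still depends on the local regex(3).

```shell
#!/bin/bash
# matches STRING REGEX: hypothetical helper, not from the thread.
# The regex arrives as an ordinary argument, so < and > need no backslashes.
matches() {
    local str=$1 re=$2
    # $re is left unquoted on purpose: quoting it would force a literal match.
    [[ $str =~ $re ]]
}

# The status is 0 where regex(3) knows [[:<:]] and [[:>:]]; bash returns 2
# for a pattern the library rejects, which lands in the else branch here.
if matches foo '[[:<:]]foo[[:>:]]'; then
    echo "word-boundary classes supported"
else
    echo "no match, or pattern rejected by this regex(3)"
fi
```

The same helper works for any pattern containing shell metacharacters, since only the single-quoted assignment has to get past the shell parser.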
With further experimentation, it turns out that cygwin's regex(3) does not
understand [[:<:][:>:]] as a character class that accepts either direction of
word boundary (for shame). So, modulo the difference in the number of
subexpressions, the closest representation of \b becomes:
([[:<:]]|[[:>:]])
and an expression to match words that either end in a or begin with b would be:
$ [[ ' b ' =~ ([a ]([[:<:]]|[[:>:]])[b ]) ]]; echo $?
0
$ [[ ' ab ' =~ ([a ]([[:<:]]|[[:>:]])[b ]) ]]; echo $?
1
which looks so much shorter as ([a ]\b[b ])
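For scripts that must run against more than one regex(3), a rough stand-in for \b that uses only POSIX ERE syntax is to require a non-word character or a string edge on each side of the word. This is only a sketch under stated assumptions: the helper name has_word is invented here, the word is assumed to contain nothing but word characters, and unlike a real boundary this form consumes the neighbouring character.

```shell
#!/bin/bash
# has_word TEXT WORD: succeed if WORD occurs in TEXT as a whole word.
# Hypothetical helper, not from the thread. WORD is assumed to hold only
# letters, digits and underscores, so it needs no regex escaping.
has_word() {
    local text=$1 word=$2
    # A non-word character or a string edge on each side stands in for \b.
    local re="(^|[^[:alnum:]_])${word}([^[:alnum:]_]|\$)"
    [[ $text =~ $re ]]
}

if has_word 'a foo b' foo; then
    echo "foo found as a whole word"
fi
```

Because the neighbouring characters are part of the match, BASH_REMATCH[0] may include them, so this suits yes/no tests better than extraction.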
--
Eric Blake