This is the mail archive of the cygwin mailing list for the Cygwin project.



RE: grep -P regexp problem


Andriy Sen wrote:
> Below is an example of the problem.
>
> G:\>cat test.s
> a
> 1
> 
> G:\>cat test.s | grep -P "[^0]1"
> a
> 1

This is not Cygwin-specific, so it is really off-topic for this list.
That being said...

grep -P treats the whole input as a single string, and outputs the
line (or lines) containing the match for the pattern.  [^0] matches 
ANYTHING except 0, including linefeeds.

In your case, the [^0] is matching the linefeed preceding the 1.  That
linefeed is considered part of the line "a\n", so that line is
included in the output.  In other words, although it looks like two
matches were output, there is in fact only one match ("\n1"), and the
lines it touches are printed together: "a\n1\n"

Assuming you wish to match single lines containing a character other 
than 0 followed by a 1, you probably want the pattern to be '[^0\n]1'
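
To see the character-class behaviour in isolation, here is a minimal
sketch in Python.  Python's re module is only a stand-in for grep -P
here (an assumption on my part that it is close enough for this
purpose); the point is just that a negated class such as [^0] accepts a
linefeed unless the linefeed is excluded explicitly.

```python
import re

# The two-line input from the original report, as one string.
text = "a\n1\n"

# [^0] is a negated class: it accepts any character other than "0",
# and a linefeed is just another character.
loose = re.search(r"[^0]1", text)
print(repr(loose.group()))

# Adding \n to the class excludes the linefeed, so nothing in this
# input can match any more.
strict = re.search(r"[^0\n]1", text)
print(strict)
```

The first search finds the linefeed followed by the 1; the second finds
nothing, because no character other than 0 or a linefeed precedes a 1.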

It's probably a bit clearer if the test file is a bit bigger:

$ echo -e 'a\n1\n2\n3\n4\n1\n2\n21\n' > test.txt
$ grep -P '[^0]1' test.txt 
a
1
4
1
21

This output comes from 3 matches, which span the lines "a\n1\n",
"4\n1\n" and "21\n", whereas:

$ grep -P '[^0\n]1' test.txt 
21

only matches single lines with a 1 that follows anything but 0.
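
The behaviour described above can be modelled roughly as follows.  This
is a sketch of the explanation, not of how grep is implemented: the
helper name lines_touched is made up for illustration, and Python's re
module stands in for the PCRE engine.  It searches the whole input as
one string and then reports every line that overlaps a match.

```python
import re

def lines_touched(pattern, text):
    """Return the lines of text that overlap any match of pattern.

    Hypothetical helper: models "search the whole input as one string,
    then print the line(s) containing each match".
    """
    # Record the (start, end) offsets of each line, linefeed included.
    spans = []
    pos = 0
    for line in text.splitlines(keepends=True):
        spans.append((pos, pos + len(line), line))
        pos += len(line)

    # Offsets of every non-overlapping match in the whole input.
    matches = [m.span() for m in re.finditer(pattern, text)]

    # A line is reported if any match overlaps its offsets.
    return [
        line.rstrip("\n")
        for start, end, line in spans
        if any(ms < end and me > start for ms, me in matches)
    ]

# Same content that the echo -e command above writes to test.txt.
text = "a\n1\n2\n3\n4\n1\n2\n21\n\n"

print(lines_touched(r"[^0]1", text))
print(lines_touched(r"[^0\n]1", text))
```

With the first pattern the model reports a, 1, 4, 1 and 21, the same
lines as the grep output above; with the second it reports only 21.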


Phil


