1.3.18: BUG: Piping DOS files to grep (v2.5) doesn't work properly

Igor Pechtchanski pechtcha@cs.nyu.edu
Thu Jan 16 07:32:00 GMT 2003


On Wed, 15 Jan 2003, Stacey Sheldon wrote:

> Mailing list search didn't find this, nor does it appear
> in the FAQ... hopefully this isn't old news to all of you.
>
> Files read from a pipe are treated differently by grep
> than files read directly.  This results in some unexpected
> (by me) behaviour when using grep on files which use
> a DOS line-end (cr/nl).  This looks like a bug to me.
>
> I'd expect the following commands to have equivalent
> results:
>
>   grep myregex blah
>   grep myregex < blah
>   cat blah | grep myregex
>
> They are equivalent when the regular file blah uses
> Unix line ends, but they differ for a file blahdos which
> uses DOS line ends.  It appears to me as though grep
> is treating its input as binary when reading from a pipe,
> but correctly using "undossify_input()" in other cases.
>
> Here is an example.  I've created two files, blah (nl line-end)
> and blahdos (cr/nl line-end).
>
>    $ cat blah
>    foobarTest
>    $ od -Ax -a blah
>    000000   f   o   o   b   a   r   T   e   s   t  nl
>    00000b
>    $ od -Ax -a blahdos
>    000000   f   o   o   b   a   r   T   e   s   t  cr  nl
>    00000c
>
> These files should match the regex 'Test$' in all cases,
> but grep on blahdos fails for this case:
>
>    $ cat blahdos | grep 'Test$'
>    $
>
> And here's why (note the -v to invert the match so we have
> something to look at):
>
>    $ cat blahdos | grep -v 'Test$' | od -Ax -a
>    000000   f   o   o   b   a   r   T   e   s   t  cr  nl
>    00000c
>
> There's still a cr/nl on the output which wouldn't be there if
> grep had interpreted its input as having DOS line ends.  Here's
> what a successful grep of the UNIX line end file looks like:
>
>    $ cat blah | grep 'Test$' | od -Ax -a
>    000000   f   o   o   b   a   r   T   e   s   t  nl
>    00000b
>
> In fact, if I read the blahdos file in any other way except through
> a pipe, it successfully matches (note the stripped out cr on the output):
>
>    $ grep 'Test$' blahdos | od -Ax -a
>    000000   f   o   o   b   a   r   T   e   s   t  nl
>    00000b
>    $ grep 'Test$' < blahdos | od -Ax -a
>    000000   f   o   o   b   a   r   T   e   s   t  nl
>    00000b
>
> Just in case you might think that this has something to do with cat
> (I did), here's the output of cat for each file:
>
>    $ cat blah | od -Ax -a
>    000000   f   o   o   b   a   r   T   e   s   t  nl
>    00000b
>    $ cat blahdos | od -Ax -a
>    000000   f   o   o   b   a   r   T   e   s   t  cr  nl
>    00000c
>
> Using head instead of cat gives the same results as well, just to
> completely remove cat from the picture.
>
> I'm currently running these versions of tools on win2k:
>   cygwin     1.3.18-1
>   textutils  2.0.21 (cat, od, head)
>   grep       2.5
>   bash       2.05b.0(8)-release
>
> I also tried this out with cygwin 1.3.17-1 with identical results.
>
> If you need any further information, please cc me directly since I
> don't read the mailing lists very often.
>
> Stacey.

Stacey,

This is not a bug.  This is expected behavior.  For details, read
<http://cygwin.com/cygwin-ug-net/using-cygwinenv.html>.
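
If changing the settings described on that page is not convenient, one
workaround is to strip the carriage returns yourself before grep sees
them.  A minimal sketch, assuming a POSIX shell with mktemp, tr and grep
available; the strip_cr helper and the scratch directory are illustrative
names, not anything grep or Cygwin provides:

```shell
# Recreate the two sample files from the report in a scratch directory.
workdir=$(mktemp -d)
printf 'foobarTest\n'   > "$workdir/blah"
printf 'foobarTest\r\n' > "$workdir/blahdos"

# Hypothetical helper (not part of grep or Cygwin): delete every CR byte
# from stdin so that '$' can match immediately before the newline.
strip_cr() {
    tr -d '\r'
}

# Count matching lines for the Unix file and for the CR-stripped DOS file,
# both read through a pipe as in the original report.
unix_hits=$(cat "$workdir/blah" | grep -c 'Test$')
dos_hits=$(cat "$workdir/blahdos" | strip_cr | grep -c 'Test$')
echo "unix=$unix_hits dos=$dos_hits"
```

Note that tr -d '\r' removes every CR in the stream, not only those at
line ends, so this is only suitable for plain text input.
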
	Igor
-- 
				http://cs.nyu.edu/~pechtcha/
      |\      _,,,---,,_		pechtcha@cs.nyu.edu
ZZZzz /,`.-'`'    -.  ;-;;,_		igor@watson.ibm.com
     |,4-  ) )-,_. ,\ (  `'-'		Igor Pechtchanski
    '---''(_/--'  `-'\_) fL	a.k.a JaguaR-R-R-r-r-r-.-.-.  Meow!

Oh, boy, virtual memory! Now I'm gonna make myself a really *big* RAMdisk!
  -- /usr/games/fortune

