Additional carriage return added by cygwin commands to DOS text files

Vincent Rivière vincent.riviere@freesbee.fr
Wed Oct 7 16:24:00 GMT 2009


ttjqryfbndgdx wrote:
> Note that I don't have the issue with cat.
> bash-3.2$ cat test1 > test2
> bash-3.2$ xxd test2
> 0000000: 6161 610d 0a62 6262 0d0a                 aaa..bbb..

"cat" treats both its input and its output as binary.
So "cat a > b" is always equivalent to "cp a b".

Now suppose you think that cat should treat the files as text, telling
Cygwin to remove the CRs on input and add them back on output. Then
there is an error on input (the CRs are not removed) and an error on
output (they are not added). The two errors cancel each other out, so
the result is still correct.
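
The byte-for-byte copy can be checked directly. The following is only a
minimal sketch: the temporary directory and the cat_result variable are
illustrative names, and on a system with no text-mode translation at all
the result is trivially the same.

```shell
# Sketch: check that "cat a > b" copies the bytes unchanged.
# The directory and variable names are illustrative.
dir=$(mktemp -d)
printf 'aaa\r\nbbb\r\n' > "$dir/test1"   # DOS text file: CR LF line endings

cat "$dir/test1" > "$dir/test2"

# cmp -s exits with status 0 only when both files are byte-for-byte identical.
if cmp -s "$dir/test1" "$dir/test2"; then
    cat_result="identical"
else
    cat_result="different"
fi
echo "$cat_result"
```

If cat translated in only one direction, cmp would report a difference.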

> I don't have it with sort used alone :
> bash-3.2$ /usr/bin/sort test1 > test2
> bash-3.2$ xxd test2
> 0000000: 6161 610d 0a62 6262 0d0a                 aaa..bbb..

"sort" opens both its input and its output as text. It is what I call a
"good text filter", like "more".

> But get it when using sort in a pipe with cat :
> bash-3.2$ cat test1 | /usr/bin/sort > test2
> bash-3.2$ xxd test2
> 0000000: 6161 610d 0d0a 6262 620d 0d0a            aaa...bbb...

"cat" opens test1 in binary: error on input.
The unexpected CRs go into cat's memory, then into the pipe, then into
sort's memory, then into the output file, where additional CRs are
inserted because sort uses text-mode output.
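
The doubling can be reproduced by hand. This is an emulation under
stated assumptions, not Cygwin's actual code: text_out is a hypothetical
helper that stands in for text-mode output, and the sketch is meant to
be run where no automatic translation happens (for example on Linux, or
on a binary-mode mount).

```shell
# Emulation only: text_out is a hypothetical helper, not a Cygwin command.
# It mimics text-mode output by ending every line with CR LF.
text_out() {
    awk '{ printf "%s\r\n", $0 }'
}

dir=$(mktemp -d)
printf 'aaa\r\nbbb\r\n' > "$dir/test1"   # DOS text file: CR LF line endings

# cat reads in binary mode, so the CRs stay in the data all the way
# through the pipe and through sort; text_out then adds a second CR.
cat "$dir/test1" | LC_ALL=C sort | text_out > "$dir/test2"

hex_bad=$(od -An -tx1 "$dir/test2" | tr -d ' \n')
echo "$hex_bad"    # 6161610d0d0a6262620d0d0a
```

Each line now ends with 0d 0d 0a, which matches the xxd output quoted
above.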

> But using more instead of cat solves the issue :
> bash-3.2$ more test1 | /usr/bin/sort > test2
> bash-3.2$ xxd test2
> 0000000: 6161 610d 0a62 6262 0d0a                 aaa..bbb..

"more" behaves the same as "sort": it is a good text filter.

test1 is opened in text mode by more, so the CRs are automatically
stripped. The correct data, free of CRs, goes through "more" memory,
the pipe, then "sort" memory.
Then test2 is opened for output in text mode and the CRs automagically
reappear.
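
The same path can be emulated by hand. Again this is only a sketch
under stated assumptions: text_in and text_out are hypothetical helpers
standing in for text-mode input and output, and the commands are meant
to be run where no automatic translation happens.

```shell
# Emulation only: text_in and text_out are hypothetical helpers,
# not Cygwin commands.
text_in() {
    # Mimic text-mode input: drop the CR that precedes each LF.
    awk '{ sub(/\r$/, ""); print }'
}
text_out() {
    # Mimic text-mode output: end every line with CR LF.
    awk '{ printf "%s\r\n", $0 }'
}

dir=$(mktemp -d)
printf 'aaa\r\nbbb\r\n' > "$dir/test1"   # DOS text file: CR LF line endings

# The CRs are stripped before the pipe, so sort only ever sees LF-terminated
# lines; text_out restores exactly one CR per line on the way out.
text_in < "$dir/test1" | LC_ALL=C sort | text_out > "$dir/test2"

hex_good=$(od -An -tx1 "$dir/test2" | tr -d ' \n')
echo "$hex_good"   # 6161610d0a6262620d0a
```

The output has a single 0d 0a per line, like the original file.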

The key thing to understand is that when text files are opened in
text mode (as they always should be), programs never see the CRs in
memory. They are automatically stripped or appended by Cygwin when
reading from or writing to real files. Note that pipes (unlike real
files) always carry binary data: no translation happens there, so text
that was read correctly travels through them without CRs.

No mystery (but hard to understand at first).

-- 
Vincent Rivière
