grep treating my text files as binary!

Thomas Wolff towo@towo.net
Thu Dec 25 18:41:00 GMT 2014


On 25.12.2014 at 00:16, zzapper wrote:
> Eric Blake <eblake@redhat.com> wrote in
> news:549B4258.5050509@redhat.com:
>
>
>> You upgraded grep.  This is an intentional change in behavior in the
>> newest grep.  Work around it by using 'grep -a' or 'LC_ALL=C grep'.
Eric had further written:
> Basically, the POSIX definition of a binary file includes any file that
> is encoded incorrectly for the current locale, and since your current
> locale is (probably) UTF-8 encoding, any file (such as note.html) that
> assumes some other encoding (probably Latin-1 8-bit encoding) will be
> treated as binary unless you request -a or change locales.
zzapper:
> Thanks Eric, just surprised not to see more people bleating about this
> - it resisted my Googling skills!
I had in fact already complained about this nonsense on the grep bug
mailing list, and Eric responded there; my further reply is still
pending, so let me put it here for now.
I have read the POSIX definition of "binary file" that was quoted in the
grep bug report, and if I remember correctly (or however it is being
summarised here...) it does not mention character encoding or locale.
In any case the argument is quite artificial, since the new behaviour
hits many files that are in fact text files.
It is therefore very undesirable from any reasonable user's point of
view, which should be the guideline for software design rather than
dogmatic locale theories. I maintain that this is a serious flaw in grep,
and I hope it will be reverted.
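
To make the effect concrete, here is a minimal sketch in Python of the
kind of check Eric describes: whether a file's bytes are validly encoded
for a given character set. It is only an approximation for illustration,
not grep's actual code; the helper name is_valid_in_encoding and the
sample text are made up for this example.

```python
# Sketch only: approximates "is this file validly encoded for the locale?"
# The helper name and sample data are hypothetical, not taken from grep.

def is_valid_in_encoding(data: bytes, encoding: str) -> bool:
    """Return True if all of data decodes without error under encoding."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True

# An ordinary text line stored as Latin-1: the e-acute is the single byte 0xE9.
latin1_line = "café\n".encode("latin-1")

# Under UTF-8, 0xE9 starts a multi-byte sequence that the newline does not
# continue, so the line is not valid UTF-8, although it is valid Latin-1.
valid_utf8 = is_valid_in_encoding(latin1_line, "utf-8")
valid_latin1 = is_valid_in_encoding(latin1_line, "latin-1")
print(valid_utf8, valid_latin1)
```

A line like this is plain text to any human reader, yet it fails the
UTF-8 check, which is why such a file gets reported as binary in a UTF-8
locale. The workarounds Eric mentions ('grep -a' or 'LC_ALL=C grep') avoid
that outcome.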
------
Thomas





