stat file -- cygwin vs. Windows size?

Gary R. Van Sickle
Fri Jun 24 19:12:00 GMT 2005

> > >My suspicion is that stat is counting cr-lf as two characters but the
> > >input routines are treating these as one.
> > >
> > >If the file has about 20 lines, then that's 20 missing characters???
> >
> > Yes, this is right.  And yes, this could be the cause of the situation
> > you're noticing.
>
> Is there a standard Cygwin 'idiom' or function for dealing
> with this mismatch, or should I just re-invent the wheel.

As to the former, no, not Cygwin specifically.  The problem appears to be
that SpamAssassin is making the incorrect but all-too-common assumption that
"text file" == "file of 8-bit ASCII characters with '\n' EOL characters".
This is as incorrect as thinking "picture file" == "JPEG file".
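
To make the mismatch concrete, here is a minimal Python sketch (not anything
from SpamAssassin or Cygwin).  The 20-line sample file is made up for
illustration, and newline=None is used only to imitate what a text-mode read
does to CR LF.  It compares the size stat reports with the number of
characters a translating read hands back:

```python
import os
import tempfile

# Hypothetical sample: 20 lines, each terminated by CR LF.
data = b"".join(b"line %d\r\n" % i for i in range(20))

with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(data)
    path = f.name

# stat counts every byte on disk, including each CR.
size_on_disk = os.stat(path).st_size

# newline=None translates CR LF to a single "\n" on read, which is
# roughly what a text-mode read does.
with open(path, "r", newline=None, encoding="ascii") as f:
    chars_read = len(f.read())

os.remove(path)

missing = size_on_disk - chars_read
print(size_on_disk, chars_read, missing)
```

The difference is one character per line, which matches the "20 lines, 20
missing characters" observation quoted above.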

Cygwin does have a number of features to "band-aid" much of this broken Unix
code, primarily the "text mode mount" feature, but these are just that: a
band-aid, not a fix for the root problem.  In your case (and in fact in a
similar case in mutt), a text mode mount can't solve the problem at all.  As
others have indicated, the real and true solution here is to open the file in
binary mode and handle the various EOL character combinations in the
SpamAssassin code.  Which, yeah, is unfortunately reinventing a wheel which
should have
been "permanently reinvented" in the last century.  But hey, it's only the
first few years of the 21st century, maybe by the 22nd we'll have this whole
CRLF/LF/CR/LFCR thing sorted out.
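
For what it's worth, the "open in binary mode and handle the EOLs yourself"
approach can be sketched roughly as follows.  This is Python rather than
SpamAssassin's own code, and split_lines and read_lines are hypothetical
helper names, so treat it as an illustration of the idea and not a patch:

```python
import re
from typing import List

# Two-byte terminators are listed first so that CR LF (or LF CR) is
# consumed as one line ending rather than two.
_EOL = re.compile(rb"\r\n|\n\r|\n|\r")


def split_lines(data: bytes) -> List[bytes]:
    """Split raw bytes into lines, accepting CRLF, LFCR, LF, or CR."""
    lines = _EOL.split(data)
    # A trailing terminator leaves one empty element at the end; drop it.
    if lines and lines[-1] == b"":
        lines.pop()
    return lines


def read_lines(path: str) -> List[bytes]:
    """Read a file in binary mode so byte counts agree with stat."""
    with open(path, "rb") as f:
        return split_lines(f.read())
```

Because the file is read as bytes, the length of what you read always agrees
with what stat reports, and the EOL handling is explicit.  One caveat: in a
file that mixes conventions, an LF followed by a CR is inherently ambiguous
(one LFCR terminator, or an LF line ending followed by an empty CR-terminated
line); this sketch treats it as a single terminator.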

Gary R. Van Sickle

