This is the mail archive of the cygwin@sourceware.cygnus.com mailing list for the Cygwin project.



Re: ASCII and BINARY files. Why?


Barry Fishman <bfishman@nirvanah.corp.es.com> wrote:

> Now back to the ASCII/BINARY discussion.  I think we need to follow
> the principle of least astonishment.  When one opens a file and
> sees ^M 's at the end of each line, you can tell right away what is
> going on.

Humans can see that, but the gnu-win32 DLL software cannot.  Does it
strip the ^Ms or not?  It needs to be told somehow.  This is the crux
of the problem.  Probabilistic text detection (like Perl's -T test)
is completely unacceptable in an OS-emulation package -- the
least astonishment rule dictates that much.

What we are really wrestling with is how do we tell the DLL whether or
not to strip CRs.  Many of us think we just tell it not to strip them
ever.  I liked Jeff Epler's suggestion in message ID
<Mutt.19970130231214.jepler@craie.inetnebr.com> of how the user can
configure the behavior with glob lists and file prefixes/suffixes (if
we were to agree on that, we could move on to arguing about what the
default configuration should be! :-)
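
To make the glob-list idea concrete, here is a minimal sketch of how a
per-file decision might look.  It is not taken from Jeff's message or from
the gnu-win32 sources: the function name wants_text_mode and the idea of
passing a NULL-terminated pattern list are assumptions made for
illustration only.  It relies on the POSIX fnmatch() call.

```c
#include <fnmatch.h>
#include <stddef.h>

/* Hypothetical helper (not part of any real DLL): returns 1 if name
   matches any glob in the NULL-terminated list, meaning the file should
   be opened in text mode; returns 0 otherwise (binary mode). */
int wants_text_mode(const char *name, const char *const *globs)
{
    size_t i;

    for (i = 0; globs[i] != NULL; i++)
        if (fnmatch(globs[i], name, 0) == 0)   /* 0 means "matched" */
            return 1;
    return 0;
}
```

With a user-supplied list such as { "*.txt", "*.c", NULL }, a file named
README.txt would be treated as text and prog.exe as binary.  An empty list
gives binary mode for everything.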

> What is difficult is having to spend hours patching each application
> to get around unexpected problems with seek addresses and files that
> don't match their expected sizes.

Exactly.  Why make hundreds of changes to separate applications if you
can avoid all that hassle by reversing the simple decision to default
to text-mode?
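
The size mismatch is easy to see in a small sketch.  The helper below only
simulates what a text-mode read does to a buffer (each CR LF pair becomes a
single LF); the name textmode_read is hypothetical and this is not the
DLL's actual code.

```c
#include <stddef.h>

/* Hypothetical helper: mimics a text-mode read by turning each CR LF
   pair into a single LF.  A CR not followed by LF is kept.  Returns the
   number of bytes written to out, which must have room for n bytes. */
size_t textmode_read(const char *in, size_t n, char *out)
{
    size_t i, j = 0;

    for (i = 0; i < n; i++) {
        if (in[i] == '\r' && i + 1 < n && in[i + 1] == '\n')
            continue;           /* drop the CR; the LF is copied next */
        out[j++] = in[i];
    }
    return j;
}
```

Feed it the 10 bytes "one\r\ntwo\r\n" and it returns 8.  An application
that calls stat() sees 10 bytes, reads 8, and any offset it computed from
the data it read no longer points at the same place in the file on disk.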

> I think ANSI created the problem by having the binary/text decision made
> by the application, and not a property of the file.

ANSI C came after both the UNIX and DOS filesystems.  It had no choice
but to accept the underlying behavior of the filesystems (since it was
a language standard not an OS standard).  And if POSIX had tried to
introduce typed files, they'd still be arguing over it.

> Do NT file systems record this information?

No, but they support a file-forking feature called multiple data
streams in which a single file can have multiple separately-named
parts.  An extra data stream could hold meta-information about the
file (such as whether the file contains text or binary data).  But
what about data flowing through pipes?  There's no way to type that
data since it never hits the filesystem.  What if UNIX-style named
pipes are one day supported by gnu-win32 -- does the type of the named
pipe file affect the type of the data flowing through the pipe?  My
brain itches just thinking of the can of worms this opens.  Let this
idea die now ...
--
Francis Litterio                     PGP Key Fingerprint:
franl@world.std.com                  02 37 DF 6C 66 43 CD 2C
http://world.std.com/~franl/         10 C8 B5 8B 57 34 F3 21


