behaviour of textutil sort has changed

Peter Ring PRI@cddk.dk
Thu Aug 31 07:52:00 GMT 2000


The behaviour of sort in the Cygwin port of textutils seems to have
changed slightly. I can't say for sure when it happened; I'm trying to
find out whether it is related to the pesky EOL issue.

For the record, I'm running on Windows NT 4.0; the installation was
updated quite recently, all mount points are binary, and CYGWIN is set
to 'tty binmode ntea ntsec'.

Previously, given an input file 'sortexample.in' with LF as EOL:

d5669267~1980_REM
c56aa142~1980_REM~001__Om_revisionsmeddelelser
00a06b52~1980_REM~001__Om_revisionsmeddelelser~Hoved
0001def3~1980_REM~001__Om_revisionsmeddelelser~001
711f9bb6~1980_REM~002__Fonde
00a06b52~1980_REM~002__Fonde~Hoved
0001def3~1980_REM~002__Fonde~001
0001def0~1980_REM~002__Fonde~002
0001def1~1980_REM~002__Fonde~003
0001def6~1980_REM~002__Fonde~004
0001def7~1980_REM~002__Fonde~005
4e4ce819~1979_SD-CIR
7aa92869~1979_SD-CIR~001__Trækgrundlaget
00a06b52~1979_SD-CIR~001__Trækgrundlaget~Hoved
0001def3~1979_SD-CIR~001__Trækgrundlaget~001

this command:

  sort -t~ -k2,2 -k3,3 -k4,4 -o sortexample.out sortexample.in

would produce the following output in sortexample.out:

4e4ce819~1979_SD-CIR
7aa92869~1979_SD-CIR~001__Trækgrundlaget
0001def3~1979_SD-CIR~001__Trækgrundlaget~001
00a06b52~1979_SD-CIR~001__Trækgrundlaget~Hoved
d5669267~1980_REM
c56aa142~1980_REM~001__Om_revisionsmeddelelser
0001def3~1980_REM~001__Om_revisionsmeddelelser~001
00a06b52~1980_REM~001__Om_revisionsmeddelelser~Hoved
711f9bb6~1980_REM~002__Fonde
0001def3~1980_REM~002__Fonde~001
0001def0~1980_REM~002__Fonde~002
0001def1~1980_REM~002__Fonde~003
0001def6~1980_REM~002__Fonde~004
0001def7~1980_REM~002__Fonde~005
00a06b52~1980_REM~002__Fonde~Hoved

BTW, I get this same output if I run the example on a Linux box.

Now the Cygwin sort produces this output instead:

0001def3~1979_SD-CIR~001__Trækgrundlaget~001
00a06b52~1979_SD-CIR~001__Trækgrundlaget~Hoved
7aa92869~1979_SD-CIR~001__Trækgrundlaget
4e4ce819~1979_SD-CIR
0001def3~1980_REM~001__Om_revisionsmeddelelser~001
00a06b52~1980_REM~001__Om_revisionsmeddelelser~Hoved
c56aa142~1980_REM~001__Om_revisionsmeddelelser
0001def3~1980_REM~002__Fonde~001
0001def0~1980_REM~002__Fonde~002
0001def1~1980_REM~002__Fonde~003
0001def6~1980_REM~002__Fonde~004
0001def7~1980_REM~002__Fonde~005
00a06b52~1980_REM~002__Fonde~Hoved
711f9bb6~1980_REM~002__Fonde
d5669267~1980_REM

It is as if something is now silently appended to the end of each line
(so that '1979_SD-CIRsomething' compares larger than '1979_SD-CIR').
Previously I would get this latter output only if the input had CR-LF
as EOL (which, BTW, is also what happens on the Linux box if there's a
CR before the LF).
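
To illustrate the suspicion, here is a minimal Python sketch of how a
trailing CR would change the order. It is only a model of the command
above, not the actual sort code: the helper names sort_key and
model_sort are made up for this example, and it assumes plain
character-by-character comparison with missing fields treated as empty.

```python
# Hypothetical model of `sort -t~ -k2,2 -k3,3 -k4,4`; not GNU sort's real code.

def sort_key(line, sep="~"):
    """Return fields 2-4 of `line`, padding missing fields with ''."""
    fields = line.split(sep)
    return tuple(fields[i] if i < len(fields) else "" for i in (1, 2, 3))

def model_sort(lines):
    """Sort lines by fields 2, 3, 4 using plain code-point comparison."""
    return sorted(lines, key=sort_key)

# A few lines taken from sortexample.in.
sample = [
    "d5669267~1980_REM",
    "c56aa142~1980_REM~001__Om_revisionsmeddelelser",
    "0001def3~1980_REM~001__Om_revisionsmeddelelser~001",
    "4e4ce819~1979_SD-CIR",
]

# LF input: the last field of each line ends where the line ends.
lf_order = model_sort(sample)

# CR-LF input: the CR stays glued to the last field of every line, so a
# short line's key ('1980_REM\r') compares larger than the same field on
# a longer line ('1980_REM').
crlf_order = [line.rstrip("\r")
              for line in model_sort([line + "\r" for line in sample])]

print(lf_order)
print(crlf_order)
```

In this model the LF input puts each short "parent" line before its
longer children, while the CR-LF input puts it after them, which is the
same pattern as in the two outputs above.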

Before I start hacking sort, I'd like to know if this is an intended
change.

Kind regards,
Peter Ring
