Encoding of German 'umlauts' - please explain

Ronald Fischer ronaldf@eml.cc
Thu Sep 24 08:20:00 GMT 2009


Maybe someone could enlighten me about the following:

In Cygwin bash I see:

$ echo ü | od -cx
0000000 374  \n
        0afc
0000002

That means the German letter ü is encoded as 0xFC. If I do the same in the CMD shell
(the 'od' used here comes from the GNU Utilities for Windows), I see:

  echo ü | od -cx
0000000 201      \r  \n
        2081 0a0d
0000004

That is, ü is encoded as 0x81. Why is this different?
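
To read the two dumps byte by byte: 'od -x' prints 16-bit words, so on a
little-endian machine (an assumption here, though it is the usual case for
Windows on x86) the bytes appear swapped within each word. The following is
only a sketch for decoding the output above; the helper name words_to_bytes
is made up for illustration.

```python
import struct

def words_to_bytes(hex_words):
    """Turn 'od -x' style 16-bit words back into bytes.

    Assumes the words were printed on a little-endian host.
    """
    return b"".join(struct.pack("<H", int(w, 16)) for w in hex_words.split())

bash_bytes = words_to_bytes("0afc")      # word printed under Cygwin bash
cmd_bytes = words_to_bytes("2081 0a0d")  # words printed under cmd.exe

print(bash_bytes)  # b'\xfc\n'
print(cmd_bytes)   # b'\x81 \r\n'
```

Read this way, the bash output is 0xFC followed by a newline, while the CMD
output is 0x81 followed by a space, a carriage return, and a newline.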

I am aware that, for historical reasons, different encodings exist (the old
DOS encoding, the Windows ANSI encoding, etc.). I wouldn't have expected those
differences, however, when comparing bash.exe vs. cmd.exe.
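
For reference, here is a small sketch that encodes ü in a few single-byte
code pages to see which ones yield 0xFC and which yield 0x81. The list of
candidates (Latin-1, Windows-1252, CP437, CP850) is my assumption about what
"Windows ANSI" and "old DOS" typically mean on a Western European system; it
does not show which code page either shell is actually using.

```python
# Candidate code pages: an assumption, not read from either shell.
CANDIDATES = ["latin-1", "cp1252", "cp437", "cp850"]

def encode_in(ch, codecs=CANDIDATES):
    """Map each codec name to the bytes it produces for the character ch."""
    return {name: ch.encode(name) for name in codecs}

# "\u00fc" is the letter u with diaeresis (ü), written as an escape so the
# script does not depend on the encoding of its own source file.
table = encode_in("\u00fc")

for name, raw in table.items():
    print(f"{name:8} 0x{raw[0]:02X}")

# The -c column of od is octal: 374 and 201 are the same two byte values.
assert 0o374 == 0xFC and 0o201 == 0x81
```

With these candidates, the two "ANSI"-style code pages give 0xFC and the two
DOS/OEM code pages give 0x81, which matches the two values seen above.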


