bash and the current locale implementation

Corinna Vinschen corinna-cygwin@cygwin.com
Thu Oct 1 17:13:00 GMT 2009


On Oct  1 15:45, Corinna Vinschen wrote:
> On Oct  1 13:09, Andy Koppe wrote:
> > 2009/10/1 Corinna Vinschen
> > > The charset of the console is
> > > determined by the environment setting of the current application.
> > 
> > Right, but I think it should be determined once at the start of a
> > process. If a process later changes its env variables, that's its own
> > business. It should not have an effect on filenames or console I/O,
> > because it wouldn't on Linux either.
> > 
> > The assumption on Linux is that the environment variables at process
> > startup tell an application what encoding is used in filenames and in
> > its terminal. An application might be calling setlocale() with
> > different charsets a lot of times, e.g. to convert between different
> > encodings, but it would not expect that to have an effect on filenames
> > or the terminal.
> 
> Hmm, that's a convincing argument.  Without changing bash, this would
> also result in tcsh behaving the same as bash and dash.

Ok, I have a small patch which changes the current implementation
so that a switch to another Cygwin-internal charset only works at
process startup.  No setenv/setlocale combination from the application
itself will change the internally used charset.
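
To illustrate the intended semantics, here is a minimal sketch of a
"latch once at startup" charset.  It is not the Cygwin code: the names
internal_charset, charset_from_locale and latch_internal_charset are
hypothetical, and the parsing of the locale string is deliberately
simplified (no alias handling, no codepage check).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the DLL-internal charset (not Cygwin's real state). */
char internal_charset[32] = "ASCII";
int charset_latched = 0;

/* Copy the codeset part of a locale string such as "de_DE.UTF-8@euro"
   into out.  Falls back to "ASCII" when no codeset is given. */
void
charset_from_locale (const char *loc, char *out, size_t n)
{
  assert (out && n > 0);
  const char *dot = loc ? strchr (loc, '.') : NULL;
  if (!dot)
    {
      snprintf (out, n, "%s", "ASCII");
      return;
    }
  size_t len = strcspn (dot + 1, "@");	/* stop before any "@modifier" */
  if (len >= n)
    len = n - 1;
  memcpy (out, dot + 1, len);
  out[len] = '\0';
}

/* Evaluate LC_ALL, LC_CTYPE, LANG (in that order) on the first call only.
   Later calls return the latched value, whatever the environment says. */
const char *
latch_internal_charset (void)
{
  if (!charset_latched)
    {
      const char *loc = getenv ("LC_ALL");
      if (!loc || !*loc)
	loc = getenv ("LC_CTYPE");
      if (!loc || !*loc)
	loc = getenv ("LANG");
      charset_from_locale (loc, internal_charset, sizeof internal_charset);
      charset_latched = 1;
    }
  return internal_charset;
}
```

In this sketch, a setenv ("LANG", ...) made after the first call to
latch_internal_charset () has no effect on the returned charset, which
mirrors what the patch is meant to achieve for the internal charset.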

Basically that's what you already see in bash without the change:
`export LANG=de' only has an effect on child processes.
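
The child-process half of that can be checked with a small sketch like
the one below.  It assumes a POSIX environment with popen() and a
/bin/sh that provides printf; lang_seen_by_child is a hypothetical
helper name, not part of Cygwin or newlib.

```c
#define _POSIX_C_SOURCE 200809L	/* for popen/pclose under strict -std modes */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Start a shell child and store the value of $LANG as the child sees it
   in buf.  Returns 0 on success, -1 if the child could not be run. */
int
lang_seen_by_child (char *buf, size_t n)
{
  assert (buf && n > 0);
  FILE *p = popen ("printf '%s' \"$LANG\"", "r");
  if (!p)
    return -1;
  size_t got = fread (buf, 1, n - 1, p);
  buf[got] = '\0';
  return pclose (p) == 0 ? 0 : -1;
}
```

Calling setenv ("LANG", "de", 1) in the parent and then
lang_seen_by_child () shows the child picking up the new value at its
own startup, while the parent's internal charset stays as it was.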

Is that, finally, the behaviour we want?

Patch below.  With this change the `extern "C" char *setlocale()'
wrapper in Cygwin is just a pass-through to _setlocale_r, so it can
eventually go away and setlocale can be re-enabled in newlib.  If
that's ok with everybody, I'll check it in tomorrow.


Corinna


Index: dcrt0.cc
===================================================================
RCS file: /cvs/src/src/winsup/cygwin/dcrt0.cc,v
retrieving revision 1.363
diff -u -p -r1.363 dcrt0.cc
--- dcrt0.cc	28 Sep 2009 10:43:49 -0000	1.363
+++ dcrt0.cc	1 Oct 2009 17:11:48 -0000
@@ -762,6 +762,8 @@ dll_crt0_0 ()
 void
 dll_crt0_1 (void *)
 {
+  extern char *initial_setlocale ();
+
   if (dynamically_loaded)
     sigproc_init ();
   check_sanity_and_sync (user_data);
@@ -940,7 +942,7 @@ dll_crt0_1 (void *)
      LoadLibrary serialization. */
   ld_preload ();
   /* Set internal locale to the environment settings. */
-  setlocale (LC_CTYPE, "");
+  initial_setlocale ();
   /* Reset application locale to "C" per POSIX */
   _setlocale_r (_REENT, LC_CTYPE, "C");
   if (user_data->main)
Index: syscalls.cc
===================================================================
RCS file: /cvs/src/src/winsup/cygwin/syscalls.cc,v
retrieving revision 1.538
diff -u -p -r1.538 syscalls.cc
--- syscalls.cc	30 Sep 2009 02:11:05 -0000	1.538
+++ syscalls.cc	1 Oct 2009 17:11:50 -0000
@@ -4209,20 +4209,25 @@ internal_setlocale ()
   setenv ("PATH", c_path, 1);
 }
 
-extern "C" char *
-setlocale (int category, const char *locale)
+char *
+initial_setlocale ()
 {
   char old[(LC_MESSAGES + 1) * (ENCODING_LEN + 1/*"/"*/ + 1)];
-  if (locale && !wincap.has_always_all_codepages ())
-    stpcpy (old, _setlocale_r (_REENT, category, NULL));
-  char *ret = _setlocale_r (_REENT, category, locale);
-  if (ret && locale)
+  if (!wincap.has_always_all_codepages ())
+    stpcpy (old, _setlocale_r (_REENT, LC_CTYPE, NULL));
+  char *ret = _setlocale_r (_REENT, LC_CTYPE, "");
+  if (ret)
     {
       if (!(ret = check_codepage (ret)))
-	_setlocale_r (_REENT, category, old);
-      else if (!*locale && strcmp (cygheap->locale.charset,
-				   __locale_charset ()) != 0)
+	_setlocale_r (_REENT, LC_CTYPE, old);
+      else if (strcmp (cygheap->locale.charset, __locale_charset ()) != 0)
 	internal_setlocale ();
     }
   return ret;
 }
+
+extern "C" char *
+setlocale (int category, const char *locale)
+{
+  return _setlocale_r (_REENT, category, locale);
+}

-- 
Corinna Vinschen                  Please, send mails regarding Cygwin to
Cygwin Project Co-Leader          cygwin AT cygwin DOT com
Red Hat


