This is the mail archive of the libc-help@sourceware.org mailing list for the glibc project.



Re: [RFC]setlocale() race condition




I think you have all the right pieces in place. Please feel free to post a "final" version to libc-help, I can give it a once over, and then you can post to libc-alpha. The point of libc-help is to start the feedback process earlier and with better results :-)

Cheers,
Carlos.


Carlos,
This is the same as my earlier mail, with minor changes.
Does it look better?


This problem was noticed with the glibc shipped with a distro, an older version (glibc 2.5-12). It showed up after about 8 hours of testing, when the application crashed with SIGSEGV. My attempts to reproduce the problem with mainline glibc were not successful, but I still believe the problem exists in mainline glibc and would like to know your thoughts on this issue.
-----------------------------------------------------------------
Explanation:
This problem was noticed during PHP engine development. The current implementation calls setlocale() every time a page is requested, and the problem shows up during a stress test of this PHP engine, where setlocale() is called on multiple threads. The exact API calls are as follows:
....
setlocale(LC_ALL, "C");
....
setlocale(LC_CTYPE, "");
....
setlocale(LC_CTYPE, "C");
....
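For anyone who wants to try reproducing this, the call pattern above can be sketched as a small stress harness. This is only a sketch under assumptions: the names locale_worker and run_setlocale_stress, the thread count and the iteration count are all made up for illustration, and the second call is assumed to use LC_CTYPE (LC_TYPE is not a standard category). It is not the PHP engine's code, and on a glibc without the race it should simply run to completion.

```c
#include <locale.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical parameters; the original failure needed ~8 hours of load. */
#define STRESS_THREADS 4
#define STRESS_ITERATIONS 1000

/* Each thread repeats the three setlocale() calls from the report. */
static void *locale_worker (void *arg)
{
  (void) arg;
  for (int i = 0; i < STRESS_ITERATIONS; ++i)
    {
      setlocale (LC_ALL, "C");
      setlocale (LC_CTYPE, "");   /* locale taken from the environment */
      setlocale (LC_CTYPE, "C");
    }
  return NULL;
}

/* Starts the workers, waits for them, and returns 0 if every thread
   started and the process can still switch to the "C" locale.  */
static int run_setlocale_stress (void)
{
  pthread_t threads[STRESS_THREADS];
  int started = 0;

  for (int i = 0; i < STRESS_THREADS; ++i)
    {
      if (pthread_create (&threads[i], NULL, locale_worker, NULL) != 0)
        break;
      ++started;
    }
  for (int i = 0; i < started; ++i)
    pthread_join (threads[i], NULL);

  if (started != STRESS_THREADS)
    return -1;
  return setlocale (LC_ALL, "C") != NULL ? 0 : -1;
}
```

Calling run_setlocale_stress() from main() and building with -pthread is enough to run it. A clean run does not prove the race is absent, since the window is small.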
It was observed that after ~8 hours of testing, the application crashed in the strcmp() call made from setlocale(). When I analyzed the dump, it showed that the _nl_global_locale.__names[category] pointer was corrupted.
Code analysis showed a window for a race: one thread calls strcmp() (within setlocale()) with the current value of _nl_global_locale.__names[category] passed as an argument, while another thread goes ahead and frees the string that _nl_global_locale.__names[category] points to.


_nl_global_locale.__names[category] is protected by the lock __libc_setlocale_lock, but the lock is taken only while writing to the data and not while reading from the global variable.
This lock is taken in the freelocale() and setlocale() functions.


Though setlocale() is not on the POSIX.1 list of async-signal-safe functions in section 2.4.3 (http://www.opengroup.org/onlinepubs/009695399/functions/xsh_chap02_04.html#tag_02_04),
it still needs to be thread-safe according to section 2.9.1 in
http://www.opengroup.org/onlinepubs/009695399/functions/xsh_chap02_09.html


-----------------------------------------------------------------
Testing:
No regressions were noticed during testing. Testing was done on an x86-64 box, with the application built as 32-bit.
-----------------------------------------------------------------
Fix:
A similar fix resolved the problem with the distro glibc, where __libc_lock_lock() is used instead of __libc_rwlock_rdlock().



Thanks Yeehaw

Index: libc/locale/setlocale.c
===================================================================
--- libc.orig/locale/setlocale.c	2008-03-31 06:07:03.000000000 +0530
+++ libc/locale/setlocale.c	2008-06-12 08:59:45.000000000 +0530
@@ -234,10 +234,14 @@
   if (locale == NULL)
     return (char *) _nl_global_locale.__names[category];
 
+  __libc_rwlock_rdlock (__libc_setlocale_lock);
   if (strcmp (locale, _nl_global_locale.__names[category]) == 0)
+	{
+      __libc_rwlock_unlock (__libc_setlocale_lock);
     /* Changing to the same thing.  */
     return (char *) _nl_global_locale.__names[category];
-
+	}
+      __libc_rwlock_unlock (__libc_setlocale_lock);
   /* We perhaps really have to load some data.  So we determine the
      path in which to look for the data now.  The environment variable
      `LOCPATH' must only be used when the binary has no SUID or SGID
