This is the mail archive of the libc-help@sourceware.org mailing list for the glibc project.



Re: questions about bug 4737 (fork is not async-signal-safe)



I think I found the cause of the corrupted _IO_list_all problem.


It has nothing to do with the earlier mentioned discussion on
libc-hacker
(http://www.sourceware.org/ml/libc-hacker/2007-02/msg00009.html);
that discussion concerns an application software design problem
that can cause deadlocks.

Nor does it have anything to do with calling fork from within a
signal handler.

The problem is dprintf/vdprintf.
If a multi-threaded application calls fork and dprintf from different
threads at about the same time, the child process can crash
in fresetlockfiles.

dprintf adds a temporary struct _IO_FILE_plus (tmpfil), whose
_lock member is NULL, to the global _IO_list_all.
Here's the code I'm talking about:

int
_IO_vdprintf (d, format, arg)
     int d;
     const char *format;
     _IO_va_list arg;
{
  struct _IO_FILE_plus tmpfil;
  struct _IO_wide_data wd;
  int done;

#ifdef _IO_MTSAFE_IO
  tmpfil.file._lock = NULL;
#endif
  _IO_no_init (&tmpfil.file, _IO_USER_LOCK, 0, &wd, &_IO_wfile_jumps);
  _IO_JUMPS (&tmpfil) = &_IO_file_jumps;
  INTUSE(_IO_file_init) (&tmpfil);
#if  !_IO_UNIFIED_JUMPTABLES
  tmpfil.vtable = NULL;
#endif
  if (INTUSE(_IO_file_attach) (&tmpfil.file, d) == NULL)
    {
      INTUSE(_IO_un_link) (&tmpfil);
      return EOF;
    }
  tmpfil.file._IO_file_flags =
    (_IO_mask_flags (&tmpfil.file, _IO_NO_READS,
                     _IO_NO_READS+_IO_NO_WRITES+_IO_IS_APPENDING)
     | _IO_DELETE_DONT_CLOSE);

  done = INTUSE(_IO_vfprintf) (&tmpfil.file, format, arg);

  _IO_FINISH (&tmpfil.file);

  return done;
}
"glibc-2.7/libio/iovdprintf.c"

Once _IO_file_init returns, tmpfil has been added to _IO_list_all
and list_all_lock has been released.
If another thread calls fork at this point (before tmpfil
has been removed from _IO_list_all), the child process
crashes in fresetlockfiles.

It crashes because fresetlockfiles re-initializes the file locks
by writing through each file's _lock pointer (setting the lock to
a default "_IO_lock_initializer" value). But the _lock member of
the "struct _IO_FILE_plus" coming from dprintf is NULL, so the
child dereferences a NULL pointer.
Here's the code I'm talking about:

static void
fresetlockfiles (void)
{
  _IO_ITER i;

  for (i = _IO_iter_begin(); i != _IO_iter_end(); i = _IO_iter_next(i))
    _IO_lock_init (*((_IO_lock_t *) _IO_iter_file(i)->_lock));
}
"glibc-2.7/nptl/sysdeps/unix/sysv/linux/fork.c"
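
The failure mode can be modeled outside glibc. The sketch below is not
glibc code: model_lock_t, struct model_file and model_reset_locks are
hypothetical stand-ins for _IO_lock_t, the entries of _IO_list_all and
fresetlockfiles. It only illustrates that a reset loop has to skip list
entries whose lock pointer is NULL, which the glibc-2.7 loop quoted
above does not do. This is one possible guard, not necessarily the way
glibc itself would address it.

```c
#include <stddef.h>

/* Hypothetical stand-in for _IO_lock_t; the fields are illustrative. */
typedef struct { int owner; int count; } model_lock_t;

/* Hypothetical stand-in for an entry on _IO_list_all. */
struct model_file
{
  model_lock_t *lock;        /* NULL for a dprintf-style temporary */
  struct model_file *next;
};

/* Re-initialize every lock on the list, as the child does after fork.
   Entries without a lock are skipped; without this check the loop
   would write through a NULL pointer for the temporary entry.
   Returns the number of locks that were reset.  */
static int
model_reset_locks (struct model_file *head)
{
  int n = 0;
  struct model_file *f;

  for (f = head; f != NULL; f = f->next)
    {
      if (f->lock == NULL)
        continue;
      f->lock->owner = 0;
      f->lock->count = 0;
      n++;
    }
  return n;
}
```

With a three-entry list whose middle entry has lock == NULL, the
function resets the two real locks and leaves the temporary alone.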


The chance of hitting this problem is very small. Another thread
must call fork while dprintf or vdprintf (_IO_vdprintf) is between
adding tmpfil and removing it. This is a very tiny window.

I checked the source code of glibc-latest and it seems
the problem is still there.

In any case, I could simply work around our problem by avoiding
dprintf (we now use sprintf + write(2)).
So now we can happily continue using glibc-2.7 on our
32-bit PowerPC platform :-)
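
A workaround along these lines might look like the sketch below. It is
an illustration, not our actual code: the helper name fd_printf and the
512-byte buffer are assumptions, and it uses vsnprintf rather than
sprintf so that the stack buffer cannot overflow. Because no temporary
FILE object is created, nothing is added to _IO_list_all.

```c
#include <errno.h>
#include <stdarg.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical dprintf replacement: format into a stack buffer, then
   send it to FD with write(2).  Output longer than the buffer is
   truncated.  Returns the number of bytes written, or -1 on error.  */
static int
fd_printf (int fd, const char *format, ...)
{
  char buf[512];             /* size chosen for illustration only */
  va_list ap;
  int len;
  size_t off = 0;

  va_start (ap, format);
  len = vsnprintf (buf, sizeof buf, format, ap);
  va_end (ap);
  if (len < 0)
    return -1;
  if ((size_t) len >= sizeof buf)
    len = (int) sizeof buf - 1;      /* output was truncated */

  /* write(2) may write fewer bytes than requested, so loop.  */
  while (off < (size_t) len)
    {
      ssize_t n = write (fd, buf + off, (size_t) len - off);
      if (n < 0)
        {
          if (errno == EINTR)
            continue;
          return -1;
        }
      off += (size_t) n;
    }
  return len;
}
```

A call such as fd_printf (fd, "%s=%d\n", name, value) then takes the
place of the corresponding dprintf call.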

---
Norbert van Bolhuis.
AimValley B.V.

