This is the mail archive of the libc-alpha@sourceware.cygnus.com mailing list for the glibc project.



Re: Linux nanosleep investigation.



I've forwarded the initial bug reports to Andrea Arcangeli
<andrea@suse.de> who told me that he's looking into it.  I've CC'ed
him on this email.

Andrea, can you comment on this?

Thanks,
Andreas

>>>>> Kaz Kylheku writes:

 > I have been doing some digging in the nanosleep system call and the
 > timespec <-> jiffies conversion functions.

 > I have the following findings to report:

 > (1) timespec_to_jiffies rounds up to the next whole jiffy.

 > (2) If HZ is not a divisor of one billion, then the timespec value
 >     { 0, 999999999 } leads to a larger jiffies value than { 1, 0 }.
 >     For example, if HZ is 1024, then { 0, 999999999 } converts to 1025
 >     jiffies, whereas { 1, 0 } converts to 1024 jiffies.

 > (3) Converting from jiffy to timespec and back to jiffy recovers the
 >     original jiffy value. This is good!

 > (4) If nanosleep calls the scheduler and wakes up within the same time tick
 >     period, then the remaining time does not decrease (or is
 >     even inflated by the rounding).

 > (5) The sys_nanosleep system call adds one jiffy to any non-zero timeout
 >     value. See the line which reads: 

 > 	expire = timespec_to_jiffies(&t) + (t.tv_sec || t.tv_nsec);

 >     This one extra jiffy contributes to the repeated inflation of the remaining
 >     time reported by Kevin Hendricks in his problem report. That's because the
 >     remaining time is computed from this adjusted expire time.
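
Findings (1), (2) and (5) can be checked with a small user-space sketch.
This is not the kernel source: it assumes the conversion divides by the
truncated value 1000000000 / HZ and rounds up, which is consistent with the
numbers quoted above. The names struct ts, ts_to_jiffies and
nanosleep_expire are hypothetical, and hz is passed as a parameter only so
that different tick rates can be compared.

```c
/* Stand-in for struct timespec, to keep the sketch free of system headers. */
struct ts { long tv_sec; long tv_nsec; };

/* Assumed round-up conversion from a time value to jiffies. */
static unsigned long ts_to_jiffies(const struct ts *t, long hz)
{
    /* Truncates whenever hz does not divide one billion (e.g. 1024). */
    long nsec_per_jiffy = 1000000000L / hz;

    /* Adding (divisor - 1) before dividing rounds the result up. */
    long nsec = t->tv_nsec + nsec_per_jiffy - 1;

    return (unsigned long)hz * (unsigned long)t->tv_sec
           + (unsigned long)(nsec / nsec_per_jiffy);
}

/* Models the quoted sys_nanosleep line: one extra jiffy for any
   non-zero request. */
static unsigned long nanosleep_expire(const struct ts *t, long hz)
{
    return ts_to_jiffies(t, hz) + (t->tv_sec || t->tv_nsec);
}
```

Under these assumptions, ts_to_jiffies gives 1025 for { 0, 999999999 } and
1024 for { 1, 0 } when hz is 1024, while both give 100 when hz is 100.
nanosleep_expire turns a 1 ns request at hz = 100 into 2 jiffies.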

 > Out of these findings, (4) is the obvious cause of our problems.  There is no
 > way to fix it, and hence the naive __libc_nanosleep(&rem, &rem); algorithm
 > cannot possibly work. (Why didn't I see this obvious fact before?) It is an
 > inherent problem in all relative waits against a quantized clock: short waits
 > look like they are zero length waits. Our sampling of the jiffies value fails
 > to catch transitions; this aliasing makes it look like time is standing still,
 > or moving very slowly.

 > If a thread is flooded with signals while waiting in pthread_cond_timedwait,
 > the only way it will eventually return is if it catches enough clock tick
 > transitions while executing the nanosleep system call. That's assuming
 > that (5) is fixed: (5) means that when the thread is flooded with signal
 > wakeups, it keeps incrementing the remaining time by 1 jiffy on each call.
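
The signal-flood scenario can be modelled in a few lines. This is a toy
model, not kernel code: it works directly in jiffies (relying on the
round-trip property in finding (3)), and the names remaining_after_interrupt
and naive_retry are hypothetical. An interruption with elapsed_ticks == 0
stands for a wakeup inside the same tick period.

```c
/* Remaining jiffies reported by one sleep that is interrupted after
   elapsed_ticks clock ticks. extra_jiffy != 0 models the "+1" of
   finding (5). */
static unsigned long remaining_after_interrupt(unsigned long requested,
                                               unsigned long elapsed_ticks,
                                               int extra_jiffy)
{
    unsigned long timeout = requested;

    if (extra_jiffy && requested != 0)
        timeout += 1;                 /* one jiffy added to non-zero waits */

    return elapsed_ticks >= timeout ? 0 : timeout - elapsed_ticks;
}

/* Naive retry loop: feed the reported remainder straight back in,
   with every interruption arriving before the next tick. */
static unsigned long naive_retry(unsigned long requested,
                                 unsigned interruptions,
                                 int extra_jiffy)
{
    unsigned long rem = requested;
    unsigned i;

    for (i = 0; i < interruptions; i++)
        rem = remaining_after_interrupt(rem, 0, extra_jiffy);

    return rem;
}
```

In this model, 1000 same-tick interruptions of a 10-jiffy wait leave 10
jiffies remaining without the extra jiffy, and 1010 jiffies with it.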

 > My conclusion: until Linus provides us with a sys_nanosleep_abs() call, we must
 > call gettimeofday() before each nanosleep and recompute the remaining time
 > from the absolute deadline.
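
A minimal sketch of that workaround, assuming the caller already holds an
absolute deadline as a struct timeval. The function name sleep_until is
hypothetical; gettimeofday and nanosleep are the standard POSIX calls. The
remaining time reported by nanosleep is deliberately ignored, so neither the
rounding nor the extra jiffy can accumulate across retries.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/time.h>
#include <time.h>

/* Sleep until the wall-clock time *deadline.
   Returns 0 once the deadline has been reached, -1 on an unexpected error. */
static int sleep_until(const struct timeval *deadline)
{
    for (;;) {
        struct timeval now;
        struct timespec rem;
        long sec, usec;

        if (gettimeofday(&now, NULL) != 0)
            return -1;

        /* Recompute the remaining time from the clock on every pass. */
        sec = (long)(deadline->tv_sec - now.tv_sec);
        usec = (long)(deadline->tv_usec - now.tv_usec);
        if (usec < 0) {
            sec -= 1;
            usec += 1000000L;
        }
        if (sec < 0 || (sec == 0 && usec == 0))
            return 0;                 /* deadline reached or passed */

        rem.tv_sec = sec;
        rem.tv_nsec = usec * 1000L;

        /* An interrupted sleep (EINTR) is not an error: loop and re-read
           the clock instead of trusting the reported remainder. */
        if (nanosleep(&rem, NULL) != 0 && errno != EINTR)
            return -1;
    }
}
```

Each retry costs one extra gettimeofday call, and the loop follows any
adjustment made to the wall clock while it is waiting.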



-- 
 Andreas Jaeger
  SuSE Labs aj@suse.de
   private aj@arthur.rhein-neckar.de
