This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



[Bug runtime/10651] very rare BUG_ON kernel/timer.c:619 due to runtime/time.c


------- Additional Comments From jistone at redhat dot com  2009-09-18 20:06 -------
(In reply to comment #0)
> Something is calling mod_timer with a timer->function==NULL.

This seems to indicate either memory corruption, or the memory was freed and
cleared by the next owner.  It seems like particularly poor timing for the
function pointer to have been valid enough to enter the handler but invalid at
the end of the handler.

> It appears as if the _stp_kill_time function is needlessly racy
> (amongst the stp_timer_reregister flag, which should probably be
> an atomic_t), and the del_timer_sync()'s.  It wouldn't hurt to
> plop a synchronize_sched() in there too before the free_percpu
> goo.

Can you get a crash dump?  I'd like to confirm that _stp_kill_time was actually
attempted, possibly by looking at the backtraces on other cpus and checking if
stp_time==NULL.

When del_timer_sync returns, it promises that the handler is not running on
any cpu and that the timer is not queued.  I think this actually makes the
reregister flag superfluous.  It should then be perfectly safe to free the
memory, unless the for_each_online_cpu loop somehow missed one of the timers...
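
To make the "missed one of the timers" worry concrete, here is a small
userspace model, not the actual runtime/time.c code.  Every name in it
(model_state, model_cpu_offline, model_kill_online_only, model_kill_all) is
made up for illustration.  It also assumes that a timer armed on a cpu stays
queued after that cpu goes offline, which I have not confirmed against the
kernel in question.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4   /* hypothetical cpu count for the model */

/* Userspace stand-in for a kernel timer: tracks only whether it is queued. */
struct model_timer {
    bool pending;
};

struct model_state {
    bool online[MODEL_NR_CPUS];               /* models cpu_online(cpu) */
    struct model_timer timer[MODEL_NR_CPUS];  /* models the per-cpu timers */
};

/* Bring every cpu online and arm one timer per cpu. */
static void model_init(struct model_state *s)
{
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
        s->online[cpu] = true;
        s->timer[cpu].pending = true;
    }
}

/* Hot-unplug a cpu.  ASSUMPTION: its timer stays queued afterwards. */
static void model_cpu_offline(struct model_state *s, int cpu)
{
    assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
    s->online[cpu] = false;
}

/* Stand-in for del_timer_sync(): on return the timer is not queued. */
static void model_del_timer_sync(struct model_timer *t)
{
    t->pending = false;
}

/* Count timers that are still queued, i.e. that could fire after the free. */
static int model_count_pending(const struct model_state *s)
{
    int n = 0;
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        if (s->timer[cpu].pending)
            n++;
    return n;
}

/* Teardown that walks only the cpus online right now. */
static int model_kill_online_only(struct model_state *s)
{
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        if (s->online[cpu])
            model_del_timer_sync(&s->timer[cpu]);
    return model_count_pending(s);
}

/* Teardown that walks every cpu slot, online or not. */
static int model_kill_all(struct model_state *s)
{
    for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
        model_del_timer_sync(&s->timer[cpu]);
    return model_count_pending(s);
}
```

In this model, if one cpu goes offline between init and teardown, the
online-only walk leaves that cpu's timer queued while the walk over every slot
leaves none.  Whether that matches what happened here is exactly what the
crash dump would have to show.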


-- 


http://sourceware.org/bugzilla/show_bug.cgi?id=10651

------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

