This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.

Re: Unified tracing buffer

From: Steven Rostedt <rostedt at goodmis dot org>
To: Roland Dreier <rdreier at cisco dot com>
Cc: Linus Torvalds <torvalds at linux-foundation dot org>, Masami Hiramatsu <mhiramat at redhat dot com>, Martin Bligh <mbligh at google dot com>, Linux Kernel Mailing List <linux-kernel at vger dot kernel dot org>, Thomas Gleixner <tglx at linutronix dot de>, Mathieu Desnoyers <compudj at krystal dot dyndns dot org>, darren at dvhart dot com, "Frank Ch. Eigler" <fche at redhat dot com>, systemtap-ml <systemtap at sources dot redhat dot com>
Date: Mon, 22 Sep 2008 21:39:39 -0400 (EDT)
Subject: Re: Unified tracing buffer
References: <33307c790809191433w246c0283l55a57c196664ce77@mail.gmail.com> <48D7F5E8.3000705@redhat.com> <33307c790809221313s3532d851g7239c212bc72fe71@mail.gmail.com> <48D81B5F.2030702@redhat.com> <33307c790809221616h5e7410f5gc37c262d83722111@mail.gmail.com> <48D832B6.3010409@redhat.com> <alpine.LFD.1.10.0809221718100.3265@nehalem.linux-foundation.org> <adaod2f649o.fsf@cisco.com>

On Mon, 22 Sep 2008, Roland Dreier wrote:

>  > Because all it tells you is the ordering of the atomic increment, not of 
>  > the caller. The atomic increment is not related to all the other ops that 
>  > the code that you trace actually does in any shape or form, and so the 
>  > ordering of the trace doesn't actually imply anything for the ordering of 
>  > the operations you are tracing!
> 
> This reminds me of a naive question that occurred to me while we were
> discussing this at KS.  Namely, what does "ordering" mean for events?
> 
> An example I'm all too familiar with is the lack of ordering of MMIO on
> big SGI systems -- if you forget an mmiowb(), then two CPUs taking a
> spinlock and doing writel() inside the spinlock and then dropping the
> spinlock (which should be enough to "order" things) might see the
> writel() reach the final device "out of order" because the write has to
> travel through a routed system fabric.
> 
> Just like Einstein said, it really seems to me that the order of things
> depends on your frame of reference.

In my logdev tracer (see http://rostedt.homelinux.com/logdev) I used an 
atomic counter to keep "order". But what I tell people about this order is 
that it is an ordering among multiple traces across multiple CPUs. 
That is, if you have:

   CPU 1                                CPU 2
trace_point_a                        trace_point_c
trace_point_b                        trace_point_d

If you see in the trace:

trace_point_a
trace_point_c

You really do not know which happened first, simply because trace_point_c 
could have been hit first but, because of interrupts, NMIs and whatnot, 
trace_point_a could easily have been recorded first. But to me, 
trace_points are more like memory barriers.

If I see:

trace_point_c
trace_point_a
trace_point_b
trace_point_d

I can assume that everything before trace_point_c happened before 
everything after trace_point_a, and that all before trace_point_b happened 
before trace_point_d.

One cannot assume that the trace points themselves are in order. But you 
can assume that the things outside the trace points are, as with memory 
barriers. I have found lots of race conditions with my logdev, and it was 
this "memory barrier" likeness that made the races visible.
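
As a rough illustration of the counter scheme described above (a sketch 
under assumptions, not the actual logdev code): each trace point takes a 
ticket from one global atomic counter, and the per-CPU buffers are merged 
by ticket afterwards. The names trace_record, trace_merge and struct 
cpu_buf, and the buffer sizes, are hypothetical; the "CPU" is just an 
index passed in by the caller. C11's atomic_fetch_add() defaults to 
sequentially consistent ordering, which is what stands in for the 
barrier-like behaviour here.

```c
#include <stdatomic.h>
#include <stddef.h>

#define NR_CPUS 2   /* hypothetical: two CPUs, as in the example above */
#define BUF_LEN 16  /* hypothetical per-CPU buffer size */

struct trace_entry {
    unsigned long seq;  /* value of the global counter when recorded */
    int cpu;
    const char *name;
};

struct cpu_buf {
    struct trace_entry ent[BUF_LEN];
    size_t len;
};

static atomic_ulong trace_seq;        /* one counter shared by all CPUs */
static struct cpu_buf bufs[NR_CPUS];  /* each buffer written only by its owner */

/* Record an event in the given CPU's buffer.
 * Returns the ticket taken, or -1 if the buffer is full. */
static long trace_record(int cpu, const char *name)
{
    struct cpu_buf *b = &bufs[cpu];

    if (b->len == BUF_LEN)
        return -1;

    /* Sequentially consistent read-modify-write: work done before this
     * call is ordered before work done after any later ticket. The
     * ticket says nothing about when the trace point itself was hit. */
    unsigned long seq = atomic_fetch_add(&trace_seq, 1);

    b->ent[b->len++] = (struct trace_entry){ seq, cpu, name };
    return (long)seq;
}

/* Merge the per-CPU buffers into out[] by ascending ticket.
 * Returns the number of entries written. */
static size_t trace_merge(struct trace_entry *out, size_t max)
{
    size_t pos[NR_CPUS] = { 0 };
    size_t n = 0;

    while (n < max) {
        int best = -1;

        for (int c = 0; c < NR_CPUS; c++) {
            if (pos[c] == bufs[c].len)
                continue;  /* this CPU's buffer is drained */
            if (best < 0 ||
                bufs[c].ent[pos[c]].seq < bufs[best].ent[pos[best]].seq)
                best = c;
        }
        if (best < 0)
            break;  /* all buffers drained */
        out[n++] = bufs[best].ent[pos[best]++];
    }
    return n;
}
```

Recording trace_point_c on CPU 1, then trace_point_a and trace_point_b on 
CPU 0, then trace_point_d on CPU 1, and merging, gives back the c, a, b, d 
trace from the example above.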

Unfortunately, if you are using an out-of-sync TSC, you lose even the 
memory barrier characteristic of the trace.
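
To make the TSC point concrete, here is a small simulation (a sketch with 
made-up numbers, not a measurement of real hardware). Each event carries a 
hypothetical "true" time that no tracer could actually observe, and is 
stamped with its own CPU's clock, modelled as true time plus a per-CPU 
skew. The trace is then ordered by stamp, and the function counts how many 
neighbours come out in the wrong real order. struct ts_event and 
sort_and_count_inversions are names invented for this illustration.

```c
#include <stdint.h>
#include <stdlib.h>

struct ts_event {
    uint64_t true_time;  /* hypothetical real time, unknown to a tracer */
    uint64_t stamp;      /* what the event's own CPU clock would read */
    int cpu;
    const char *name;
};

/* qsort() comparator: ascending by recorded stamp. */
static int cmp_stamp(const void *a, const void *b)
{
    const struct ts_event *x = a;
    const struct ts_event *y = b;

    return (x->stamp > y->stamp) - (x->stamp < y->stamp);
}

/* Stamp every event with its CPU's skewed clock, order the trace by
 * stamp, and count adjacent pairs whose real order was reversed. */
static size_t sort_and_count_inversions(struct ts_event *ev, size_t n,
                                        const uint64_t *skew)
{
    size_t bad = 0;

    for (size_t i = 0; i < n; i++)
        ev[i].stamp = ev[i].true_time + skew[ev[i].cpu];

    qsort(ev, n, sizeof(*ev), cmp_stamp);

    for (size_t i = 1; i < n; i++)
        if (ev[i].true_time < ev[i - 1].true_time)
            bad++;

    return bad;
}
```

With both skews at zero, an event at true time 100 on CPU 0 sorts before 
one at 105 on CPU 1. Give CPU 0 a skew of 50 and the trace shows them the 
other way round, so nothing can be concluded about the code around them.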

-- Steve
