This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Re: [PATCH] Linux Kernel Markers 0.5 for Linux 2.6.17 (with probe management)
- From: Ingo Molnar <mingo at elte dot hu>
- To: Mathieu Desnoyers <compudj at krystal dot dyndns dot org>
- Cc: Martin Bligh <mbligh at google dot com>, "Frank Ch. Eigler" <fche at redhat dot com>, Masami Hiramatsu <masami dot hiramatsu dot pt at hitachi dot com>, prasanna at in dot ibm dot com, Andrew Morton <akpm at osdl dot org>, Paul Mundt <lethal at linux-sh dot org>, linux-kernel <linux-kernel at vger dot kernel dot org>, Jes Sorensen <jes at sgi dot com>, Tom Zanussi <zanussi at us dot ibm dot com>, Richard J Moore <richardj_moore at uk dot ibm dot com>, Michel Dagenais <michel dot dagenais at polymtl dot ca>, Christoph Hellwig <hch at infradead dot org>, Greg Kroah-Hartman <gregkh at suse dot de>, Thomas Gleixner <tglx at linutronix dot de>, William Cohen <wcohen at redhat dot com>, ltt-dev at shafik dot org, systemtap at sources dot redhat dot com, Alan Cox <alan at lxorguk dot ukuu dot org dot uk>
- Date: Fri, 22 Sep 2006 19:12:24 +0200
- Subject: Re: [PATCH] Linux Kernel Markers 0.5 for Linux 2.6.17 (with probe management)
- References: <20060921160009.GA30115@Krystal> <20060921160656.GA24774@elte.hu> <20060921214248.GA10097@Krystal> <20060922064955.GA4167@elte.hu> <20060922140329.GA20839@Krystal> <20060922165352.GA16476@elte.hu> <20060922171156.GA18363@Krystal>
* Mathieu Desnoyers <compudj@krystal.dyndns.org> wrote:
> * Ingo Molnar (mingo@elte.hu) wrote:
> >
> > * Mathieu Desnoyers <compudj@krystal.dyndns.org> wrote:
> >
> > > > > Then you lose the ability to trace in-kernel minor page faults.
> > > >
> > > > that's wrong, minor pagefaults go through __handle_mm_fault() just as
> > > > much.
> > > >
> > >
> > > Hi Ingo,
> > >
> > > On a 2.6.17 kernel tree :
> >
> > > It seems like a shortcut path that will never call __handle_mm_fault.
> > > This path is precisely used to handle vmalloc faults.
> >
> > yes, but you said "minor fault", not "vmalloc fault".
> >
> > minor faults are the things that happen when a task does read-after-COW
> > or read-mmap-ed-pagecache-page, and they very much go through
> > __handle_mm_fault().
> >
> > vmalloc faults are extremely rare, x86-specific and they are a pure
> > kernel-internal matter. (I'd never want to trace them, especially if it
> > pushes tracepoints into every architecture's page fault handler. I
> > implemented the initial version of them IIRC, but my memory fails
> > precisely why. I think it was 4:4 related, but i'm unsure.)
> >
> > (i now realize that above you said "in-kernel minor faults" - under that
> > you meant vmalloc faults?)
> >
>
> Yes, sorry, my mistake. This kind of fault is not as infrequent as you
> may think, as every newly allocated vmalloc region will cause vmalloc
> faults in every process on the system that tries to access it. I
> agree that it should not be a standard event people would be
> interested in.
most of the vmalloc area that is allocated on a typical system is used by
modules - and they get loaded on bootup and rarely unloaded. Even for
other vmalloc-ed areas like netfilter, the activation of them is during
bootup. So from that point on the number of vmalloc faults is quite low
(zero on most systems). If you still want to trace it i'd suggest a
separate type of event for it.
(meanwhile i remember why i implemented vmalloc faults to begin with:
during vmalloc() we used to have a for_each_process() over all
kernel-pagetables of tasks to fix up their pagetables. This caused both
high latencies and overhead back in the days when we still were frequent
vmalloc()ers.)
Ingo
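
The trade-off described in the last paragraph (fixing up every task's pagetables at vmalloc() time versus fixing them up lazily on a fault) can be illustrated with a toy model. This is only a sketch under stated assumptions, not kernel code: the arrays, sizes and function names (master_pgd, task_pgd, vmalloc_map_eager, vmalloc_map_lazy, vmalloc_fault, task_access) are all hypothetical, and a "pagetable" is reduced to a small array of entries where 0 means "not mapped".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_TASKS   4   /* hypothetical: number of tasks in the toy model */
#define NR_ENTRIES 8   /* hypothetical: top-level entries covering the vmalloc area */

/* Reference kernel pagetable; new vmalloc mappings are always recorded here. */
static unsigned long master_pgd[NR_ENTRIES];

/* Each task's private copy of the kernel part of its pagetable. */
static unsigned long task_pgd[NR_TASKS][NR_ENTRIES];

/*
 * Eager scheme: the allocation itself walks every task and updates its
 * pagetable. Returns the number of task pagetables touched, i.e. the
 * cost paid on every allocation.
 */
static size_t vmalloc_map_eager(size_t idx, unsigned long entry)
{
    size_t touched = 0;

    assert(idx < NR_ENTRIES && entry != 0);
    master_pgd[idx] = entry;
    for (size_t t = 0; t < NR_TASKS; t++) {
        task_pgd[t][idx] = entry;
        touched++;
    }
    return touched;
}

/* Lazy scheme: only the reference pagetable is updated at allocation time. */
static void vmalloc_map_lazy(size_t idx, unsigned long entry)
{
    assert(idx < NR_ENTRIES && entry != 0);
    master_pgd[idx] = entry;
}

/*
 * Fault path of the lazy scheme: copy the missing entry from the
 * reference pagetable. Returns false if the reference has no mapping
 * either, which would be a genuine bad access.
 */
static bool vmalloc_fault(size_t task, size_t idx)
{
    assert(task < NR_TASKS && idx < NR_ENTRIES);
    if (master_pgd[idx] == 0)
        return false;
    task_pgd[task][idx] = master_pgd[idx];
    return true;
}

/*
 * A task touches the vmalloc area. Returns the number of faults taken
 * (0 or 1), or -1 if the address is not mapped anywhere.
 */
static int task_access(size_t task, size_t idx)
{
    assert(task < NR_TASKS && idx < NR_ENTRIES);
    if (task_pgd[task][idx] != 0)
        return 0;
    return vmalloc_fault(task, idx) ? 1 : -1;
}
```

In this model the eager variant pays a cost proportional to the number of tasks on every allocation, while the lazy variant costs at most one fault per task per new entry, and only for tasks that actually touch the region. A second access by the same task takes no fault.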