This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Performance monitoring hardware access for SystemTap
- From: William Cohen <wcohen at redhat dot com>
- To: SystemTAP <systemtap at sources dot redhat dot com>
- Date: Fri, 15 Dec 2006 16:08:01 -0500
- Subject: Performance monitoring hardware access for SystemTap
There is a great desire to use the performance counters on processors
to gain a better understanding of what is going on in the code. There
are various mechanisms, such as Perfmon2 and Perfctr, that provide
infrastructure to access and manage the performance monitoring
hardware. However, these interfaces are not currently in the mainline
kernel. OProfile is a performance monitoring mechanism that is
currently in the mainline kernel.
The OProfile kernel code sets up the performance counters to trigger
interrupts on overflow, records the counter that overflowed and the
location of the interrupt, and transports this data to userspace. A
userspace daemon processes these samples and tracks where the
interrupts occur.
OProfile handles a variety of performance monitoring hardware, such as
x86-64, ppc64, and most i386 processors. The OProfile mechanism does
not accumulate event counts. Proposed modifications would allow code,
e.g. SystemTap, to look at these counts.
-have a data structure keep track of the number of interrupts per
counter/processor combination (need to make sure that all counters for
a processor are adjacent in the data structure)
-have an entry in the processor-specific struct that reads a counter,
like the pseudo code below, and export a call in the oprofile-like
driver for that function
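u64 read_pmd_counter(int counter)
{
    u64 total, high1, high2, low;
    /* assumption: the caller runs with preemption disabled */
    int processor = smp_processor_id();

    /* FIXME check counter reasonable value */
retry:
    high1 = read_int_count(counter, processor);
    low = read_low(counter);
    high2 = read_int_count(counter, processor);
    /* an overflow interrupt arrived between the reads; sample again */
    if (high1 != high2)
        goto retry;
    total = adjust_high(high2) + adjust_low(low);
    return total;
}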
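
The interrupt-count table and the read-and-retry loop above can be tried
out in user space. The following is only a sketch under assumptions, not
OProfile code: the array names, the sizes, simulate_events(), and the
power-of-two reset period (RESET_SHIFT) are all made up for illustration,
and the "hardware" counter here simply counts up from zero. With a
power-of-two period, the adjust_high() step reduces to a shift rather
than a 64-bit multiply.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sizes and reset period, chosen only for this sketch. */
#define NR_CPUS     2
#define NR_COUNTERS 4
#define RESET_SHIFT 16                     /* overflow every 2^16 events */
#define LOW_MASK    ((1ULL << RESET_SHIFT) - 1)

/* Overflow-interrupt counts; all counters of one CPU sit in one row. */
static volatile uint64_t int_count[NR_CPUS][NR_COUNTERS];
/* Stand-in for the hardware registers (events since the last overflow). */
static volatile uint64_t hw_low[NR_CPUS][NR_COUNTERS];

/* Simulate n events, including the overflow interrupts they would raise. */
static void simulate_events(int cpu, int counter, uint64_t n)
{
    uint64_t v = hw_low[cpu][counter] + n;

    int_count[cpu][counter] += v >> RESET_SHIFT;
    hw_low[cpu][counter] = v & LOW_MASK;
}

/* Combine the interrupt count and the live counter into one total. */
static uint64_t read_pmd_counter(int cpu, int counter)
{
    uint64_t high1, high2, low;

    assert(cpu >= 0 && cpu < NR_CPUS);
    assert(counter >= 0 && counter < NR_COUNTERS);

    do {
        high1 = int_count[cpu][counter];
        low = hw_low[cpu][counter];
        high2 = int_count[cpu][counter];
    } while (high1 != high2);   /* an overflow slipped in; sample again */

    /* Power-of-two period: a shift replaces the multiply. */
    return (high2 << RESET_SHIFT) + low;
}
```

In this single-threaded simulation the retry path never triggers; in a
kernel the interrupt handler would update int_count asynchronously, which
is what the double read guards against.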
-thinking about exporting the counter information to
/dev/oprofile/stats/cpu[0-9]+/[0-9]+ (but this would be an expensive
way to read the counters)
-systemtap translator
  -extract counter information
  -generate arguments for userspace setup code
-systemtap runtime
  -functions to read the counters
-userspace helper code:
  -pick out the appropriate values and configure the counters
   appropriately
  -pass which counters are counting which events to the module
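
One way the helper might track which counter is counting which event is
a small table that it fills in and then hands to the module. The sketch
below is hypothetical: struct counter_map, the function names, and the
event codes are placeholders, and it ignores the fact that real hardware
often restricts which events can run on which counter. It lets two
requests for the same event share a counter and reports failure when no
register is free.

```c
#include <assert.h>
#include <stddef.h>

#define NR_COUNTERS 4          /* hypothetical number of hardware counters */
#define EVENT_NONE  (-1)       /* marks a counter that is not in use */

/* event[i] holds the event code programmed into counter i. */
struct counter_map {
    int event[NR_COUNTERS];
};

static void counter_map_init(struct counter_map *m)
{
    assert(m != NULL);
    for (int i = 0; i < NR_COUNTERS; i++)
        m->event[i] = EVENT_NONE;
}

/*
 * Return the counter index that counts `event`, allocating a free
 * counter if needed. Returns -1 when every counter is already taken
 * by a different event.
 */
static int counter_map_assign(struct counter_map *m, int event)
{
    int free_slot = -1;

    assert(m != NULL);
    for (int i = 0; i < NR_COUNTERS; i++) {
        if (m->event[i] == event)
            return i;                       /* share the existing counter */
        if (m->event[i] == EVENT_NONE && free_slot < 0)
            free_slot = i;                  /* remember the first free one */
    }
    if (free_slot >= 0)
        m->event[free_slot] = event;
    return free_slot;
}
```

A table like this would also give the runtime a place to detect the case
where a second script asks for more events than there are registers.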
Issues:
What happens when more than one probe is using the counters?
  there could be enough registers to satisfy both scripts
Cost of reading a value? Computing the total needs to be cheap;
  doing an arbitrary 64-bit multiply might not be desirable
Portability and maintainability?
Counters are free running and not tied to threads/processes
What happens when a measurement is taken across processors?
  e.g. start on one processor, finish on another?
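
For the last question, one possible policy (an assumption, not something
decided here) is to remember the processor on which a measurement
started and to reject the result if it finishes elsewhere, since the
counters are per-processor and their values are not comparable. The
struct and function names below are hypothetical; the CPU number and the
counter value are passed in so the sketch can run in user space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* State saved at the start of a measurement. */
struct pmc_interval {
    int cpu;            /* processor the measurement started on */
    uint64_t start;     /* counter total read at the start */
};

static void pmc_interval_begin(struct pmc_interval *iv, int cpu,
                               uint64_t count)
{
    assert(iv != NULL);
    iv->cpu = cpu;
    iv->start = count;
}

/*
 * Finish a measurement. Returns 0 and stores the event delta if the
 * measurement ended on the processor it started on; returns -1 and
 * leaves *delta untouched if it migrated.
 */
static int pmc_interval_end(const struct pmc_interval *iv, int cpu,
                            uint64_t count, uint64_t *delta)
{
    assert(iv != NULL && delta != NULL);
    if (cpu != iv->cpu)
        return -1;      /* counts from two processors are not comparable */
    *delta = count - iv->start;
    return 0;
}
```

Discarding migrated measurements is the simplest choice; it loses data
but never reports a difference between two unrelated counters.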