This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
Systemtap benchmarking doc draft
- From: William Cohen <wcohen at redhat dot com>
- To: SystemTAP <systemtap at sources dot redhat dot com>
- Date: Mon, 27 Feb 2006 16:06:25 -0500
- Subject: Systemtap benchmarking doc draft
Here are some thoughts on benchmarking SystemTap. As Brad Chen would
say, "got any tomatoes?"; there are certainly things that need to be
refined in this document. It is definitely a work in progress. Probably
the most useful feedback would be on the metrics. Do they contain useful
information? Should some be dropped? Should others be added?
-Will
Systemtap Benchmarking
Feb 27, 2006
1. OVERVIEW
Instrumentation that collects measurements perturbs the system being
measured. The question is often how much perturbation is
introduced. Kernel developers are very interested in knowing the
cost of the instrumentation, so they can judge whether the
instrumentation will add unacceptable overhead to the system or
perturb the system so much that the measurements become inaccurate.
This document outlines a suggested set of metrics for measuring
SystemTap's performance, a framework for collecting the data, and a
way of presenting the results.
2. METRICS
The selected metrics should give insight into the cost and overhead of
common operations in SystemTap and supporting software. The SystemTap
users should be able to gauge how much the instrumentation will affect
the system. The metrics also allow the SystemTap developers to monitor
SystemTap for performance regressions.
Kernel micro-metrics:
kprobe, kretprobe, and djprobes costs:
insertion
removal
minimal probe firing
colocated probes
max number probes active
space requirement per probe
SystemTap Language micro metrics (costs of various operations):
cost of associative array operations
Kernel instrumentation limits:
Number of probes that can be registered at a time
Number of kretprobes active
SystemTap limits:
Maximum elements in associative array
Number of actions/steps in probe.
Instrumentation Process costs:
latency from the time stap is started to the time the instrumentation is running
latency to shut down the instrumentation
profile of where time is spent during the instrumentation process
Size of instrumentation kernel modules
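Several of the process-cost metrics above reduce to comparing a workload's elapsed time with and without instrumentation active. A minimal sketch of that calculation (the workload and the timing numbers below are hypothetical placeholders, not measurements from the draft):

```python
# Sketch: compute instrumentation overhead from baseline vs. instrumented
# wall-clock times. All numbers here are hypothetical placeholders.

def overhead_percent(baseline_s, instrumented_s):
    """Relative overhead introduced by instrumentation, in percent."""
    if baseline_s <= 0:
        raise ValueError("baseline must be positive")
    return (instrumented_s - baseline_s) / baseline_s * 100.0

# Example: a workload takes 600 s uninstrumented and 618 s with a
# probe-heavy script active.
print("overhead: %.1f%%" % overhead_percent(600.0, 618.0))  # overhead: 3.0%
```

Reporting overhead as a percentage of the uninstrumented run makes results comparable across machines of different speeds.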
3. BENCHMARK DATA COLLECTION
For some of the benchmarks, data could be collected from the existing
testsuite scripts. However, some additional benchmarks will be
required to exercise particular aspects of the system.
A variety of mechanisms will be used to collect the benchmark
information. In some cases additional options will need to be passed
to the systemtap translator to report the latency of the different
phases of the translator.
3.1 USING EXISTING TESTSUITE INFRASTRUCTURE
There are existing testsuites for systemtap and its kernel
support. Currently, these are functional tests that determine whether
some aspect of the system is working correctly. However, many of the
systemtap scripts are representative of what a user might write.
The testsuite scripts could provide code to pass through the
translator to measure things such as the amount of time required to
compile and install a script.
Additional performance tests will need to be written. However, it
would be useful to fold these into the existing testsuites to expand
the set of tests to run and provide stress testing.
3.2 LATENCY MEASUREMENTS
Latency is one of the most visible metrics for the user of
SystemTap. How long does it take for the various phases of SystemTap
to complete their work and get the instrumentation collecting data?
Recently the SystemTap translator was modified to produce timing
information with the "-v" option, as in the example below:
$ stap -v -p4 testsuite/buildok/thirteen.stp
Pass 1: parsed user script and 10 library script(s) in 180usr/10sys/289real ms.
Pass 2: analyzed script: 1 probe(s), 3 function(s), 0 global(s) in 570usr/30sys/631real ms.
Pass 3: translated to C into "/tmp/stapZrONwM/stap_2173.c" in 190usr/90sys/302real ms.
Pass 4: compiled C into "stap_2173.ko" in 5520usr/840sys/6125real ms.
It will be a simple task to parse this information to obtain the cost
of the various translation phases.
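As a sketch of what that parsing could look like, the snippet below extracts the per-pass user/sys/real times from "stap -v" output. The line format is taken from the sample transcript above; the parser itself is an illustration, not part of the draft:

```python
import re

# Matches lines of the form:
#   Pass 4: compiled C into "stap_2173.ko" in 5520usr/840sys/6125real ms.
PASS_RE = re.compile(r"Pass (\d+):.* in (\d+)usr/(\d+)sys/(\d+)real ms\.")

def parse_pass_times(output):
    """Return {pass_number: (user_ms, sys_ms, real_ms)} from stap -v output."""
    times = {}
    for line in output.splitlines():
        m = PASS_RE.search(line)
        if m:
            n, usr, sys_, real = map(int, m.groups())
            times[n] = (usr, sys_, real)
    return times

sample = """\
Pass 1: parsed user script and 10 library script(s) in 180usr/10sys/289real ms.
Pass 4: compiled C into "stap_2173.ko" in 5520usr/840sys/6125real ms.
"""
print(parse_pass_times(sample))  # {1: (180, 10, 289), 4: (5520, 840, 6125)}
```

Runs of such parsed numbers could then be stored to track translator latency over time.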
3.3 PROCESSOR UTILIZATION
The latency measurements provide a coarse-grained view of how long
each phase takes. Profiling with OProfile will provide some insight
into whether there are any hot spots in SystemTap or the associated
code on the system. On x86 and x86_64, OProfile can provide samples on
most code, including the kernel trap handlers.
3.4 MICROBENCHMARKS
The microbenchmarks are listed in section 2. Small-scale tests will
need to be added to the testsuite to measure the cost of specific
operations. Currently, the testsuite does not include tests that
measure these.
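A common shape for such a microbenchmark is to repeat the operation under test many times, discard some warm-up iterations, and report a robust statistic such as the median. A generic sketch of that harness (the operation being timed is a stand-in placeholder, not an actual SystemTap operation):

```python
import time
import statistics

def measure_ns(op, iterations=1000, warmup=100):
    """Median per-call latency of op() in nanoseconds."""
    for _ in range(warmup):          # discard cold-start effects
        op()
    samples = []
    for _ in range(iterations):
        t0 = time.perf_counter_ns()
        op()
        samples.append(time.perf_counter_ns() - t0)
    return statistics.median(samples)

# Placeholder operation standing in for, e.g., an associative-array update.
table = {}
def op():
    table[len(table) % 512] = 1

print("median latency: %d ns" % measure_ns(op))
```

Using the median rather than the mean keeps occasional scheduler or interrupt noise from skewing the reported cost.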
4. ARCHIVING THE MEASUREMENTS
There are always going to be cases of "your mileage may vary," where
the measurements may not apply exactly to a particular
situation. However, it would be useful to have this information
publicly available for reference, conceivably on a SystemTap webpage
on sourceware.org, so people can get a feel for the cost on various
systems. Something like the search for SPEC CPU2000 results [SPEC2006]
would be nice, but it would probably be more work than we can justify
at this time. Tracking performance as is done for GCC performance
[Novillo2006] or code size (CSiBE) [CSiBE2006] to identify regressions
might be more practical.
REFERENCES
[CSiBE2006] GCC Code-Size Benchmark Environment (CSiBE), Feb 2006.
http://www.inf.u-szeged.hu/csibe
[Novillo2006] Novillo, Diego, Performance Tracking for GCC, Feb
2006. http://people.redhat.com/dnovillo/spec2000/
[SPEC2006] SPEC CPU2000 Results, Feb 2006. http://www.spec.org/cpu2000/results/