This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



Re: Controlling probe overhead


David Smith <dsmith@redhat.com> writes:

> [...] BTW, I had to rework the STP_TIMING code a very small bit to make it
> work correctly with the STP_OVERLOAD code.  The STP_TIMING code was
> storing cycle counts as 32-bit values, where the STP_OVERLOAD code
> wanted 64-bit cycle counts.  The STP_TIMING code now truncates down to
> 32-bits a little later than it did originally.

Note that the current code doesn't (intend to) truncate cycle counts,
just individual samples of the get_cycles() values.
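To illustrate the distinction, here is a minimal sketch (not the actual
runtime code) of keeping individual samples at 32 bits while the running
total stays 64 bits wide.  The helper names are hypothetical, and I use
unsigned 32-bit samples so that the subtraction wraps modulo 2^32, where
the patch under review uses int32_t with an explicit wraparound check:

```c
#include <stdint.h>

/* Hypothetical helper, not taken from the SystemTap runtime.
   Returns the cycles elapsed between two truncated 32-bit samples of a
   free-running cycle counter.  Unsigned subtraction is performed modulo
   2^32, so a single counter wraparound between the two samples still
   yields the correct difference. */
static uint32_t elapsed_cycles32(uint32_t start, uint32_t end)
{
    return end - start;
}

/* Hypothetical helper: add one probe's elapsed cycles to a 64-bit
   running total, so only the samples are truncated, not the total. */
static void add_sample(uint64_t *total, uint32_t start, uint32_t end)
{
    *total += elapsed_cycles32(start, end);
}
```

This only holds if a single probe invocation runs for fewer than 2^32
cycles, which seems a safe assumption for a probe handler.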


> [...] I have one stress test (that Frank wrote) that will make a
> RHEL5 system non-responsive.  The system doesn't crash - just
> decides to no longer take any input.  The overload code kills the
> script in less than 3 minutes.

3 minutes is almost certainly too long for a default overload
detection interval.  I would expect something on the order of a few
seconds.


> Note that I haven't implemented the new error probes you and Frank
> discussed.  I'd like to get the current code in (since it is quite
> useful in its current state) before thinking about error probes.

Indeed, they are independent ideas.


> [...]
> +    << "   -O         turn off automatic probe overload handling" << endl

IMO, there is no need for this option.  Overload detection should
always be present, and tunable with the (documented?) -D parameters.
If this code depends on the STP_TIMING stuff in the probe
prologues/epilogues, then most of that code could be always on too
(with -t just controlling whether the final timing report is printed).


> -  o->newline(1) << "int32_t cycles_atend = (int32_t) get_cycles ();";
> -  // Handle 32-bit wraparound.
> [...]

Perhaps you could excerpt the actual generated overload/timing code
here.  It looks like there may be more being done here than necessary.


> +  o->newline() << "#ifndef STP_OVERLOAD_INTERVAL";
> +  o->newline() << "#define STP_OVERLOAD_INTERVAL 1000000000LL";
> +  o->newline() << "#endif";
> +  o->newline() << "#ifndef STP_OVERLOAD_THRESHOLD";
> +  o->newline() << "#define STP_OVERLOAD_THRESHOLD 500000000LL";
> +  o->newline() << "#endif";

These quantities should probably depend on the processor, so that
overload intervals are measured in units of time rather than cycles.
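A rough sketch of what I mean, assuming the clock rate is available in
kHz (the kernel exposes a cpu_khz variable on some architectures; here
it is just a parameter).  The function and struct names are hypothetical,
and the windowing logic is my guess at the intended semantics, not an
excerpt from the patch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: convert a duration in milliseconds to cycles.
   A clock rate in kHz is exactly "cycles per millisecond". */
static uint64_t ms_to_cycles(uint64_t ms, uint64_t cpu_khz)
{
    return ms * cpu_khz;
}

/* Hypothetical bookkeeping for one overload-detection window. */
struct overload_state {
    uint64_t probe_cycles;    /* cycles spent inside probe handlers */
    uint64_t elapsed_cycles;  /* cycles elapsed since the window began */
};

/* Assumed semantics: accumulate until a full interval has elapsed, then
   report overload if the probes consumed more than the threshold, and
   start a new window either way. */
static bool overload_account(struct overload_state *s,
                             uint64_t probe_cycles, uint64_t elapsed_cycles,
                             uint64_t interval, uint64_t threshold)
{
    s->probe_cycles += probe_cycles;
    s->elapsed_cycles += elapsed_cycles;
    if (s->elapsed_cycles < interval)
        return false;               /* window not complete yet */

    bool overloaded = s->probe_cycles > threshold;
    s->probe_cycles = 0;
    s->elapsed_cycles = 0;
    return overloaded;
}
```

On a 2 GHz machine (cpu_khz = 2000000), the cycle defaults in the patch
correspond to a 500 ms interval and a 250 ms threshold; on a slower
processor the same constants would mean a longer wall-clock window.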


- FChE

