This is the mail archive of the systemtap@sources.redhat.com mailing list for the systemtap project.


Re: What is a tapset?

From: Vara Prasad <prasadav at us dot ibm dot com>
To: "Frank Ch. Eigler" <fche at redhat dot com>
Cc: systemtap at sources dot redhat dot com
Date: Fri, 22 Jul 2005 00:16:39 -0400
Subject: Re: What is a tapset?
References: <42CC7851.6060007@us.ibm.com> <y0mvf3ltke7.fsf@toenail.toronto.redhat.com> <42CEF169.1020902@us.ibm.com> <20050712150038.GA29284@redhat.com>

Frank Ch. Eigler wrote:

[...] I would like to mention that this notion of tapset code applies not only to "C"-based tapsets but also to script-based tapsets.

Maybe, but a parallel mechanism for implementing your style of automagic variables within the script language alone has not been specified.

I am trying to come up with one form for tapsets, so that it becomes easy for tapset writers to understand what they need to provide to enhance the instrumentation capabilities.

If this is a fair characterization, then I believe that this is a bit over-complicated and under-powered for what we need.

Can you explain what the complication is

Jim already did. The business of registration (crossing user/kernel boundaries), declarative requests for target-side variables, copying values in/out via void* buffers.

One of my very first proposals had a generic API that we could eventually convince kernel developers to accept into the mainline kernel. The goal of that API was to provide a standard way for kernel instrumentation writers to export the values of interest. The problem with that design was that instrumentation writers had to write a pack function for writing the data to the buffers and also provide an unpack function for the translator to extract the data from the buffer. Another drawback folks pointed out was that this approach is not very efficient for getting just a few numbers and strings. All of this made me rethink the design and come up with these void* buffers for individual data values, so that kernel developers don't have to write pack and unpack functions.

and what powers we need for defining tapsets that this method doesn't give?
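The contrast between the two export schemes described above can be sketched in plain user-space C. This is only an illustration under assumed names: struct read_event, pack_read_event, unpack_read_event and export_value are hypothetical and are not part of any systemtap or kernel API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical record an instrumentation writer wants to export. */
struct read_event {
    int fd;
    size_t count;
};

/* Scheme 1: the writer supplies pack(), and the consumer needs a
 * matching unpack() that agrees on the byte layout. */
static size_t pack_read_event(const struct read_event *ev,
                              unsigned char *buf, size_t len)
{
    assert(len >= sizeof ev->fd + sizeof ev->count);
    memcpy(buf, &ev->fd, sizeof ev->fd);
    memcpy(buf + sizeof ev->fd, &ev->count, sizeof ev->count);
    return sizeof ev->fd + sizeof ev->count;
}

static void unpack_read_event(const unsigned char *buf,
                              struct read_event *ev)
{
    memcpy(&ev->fd, buf, sizeof ev->fd);
    memcpy(&ev->count, buf + sizeof ev->fd, sizeof ev->count);
}

/* Scheme 2: one void* buffer per value. Each value is copied on its
 * own, so there is no shared layout to keep in sync. */
static void export_value(void *dst, const void *src, size_t size)
{
    memcpy(dst, src, size);
}
```

In the first scheme every new field means touching both pack and unpack; in the second, the caller only has to know the size of each individual value.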


Being an implicit mechanism (not directly visible from the end-user
script), it's hard to invoke deliberately.  It may or may not compose
well with others.  As I mentioned before, there is particular
fragility in your scheme's passing of target-space variables to the C
functions, in that probe points must match between the C and script.

I understand your concern.

[...] I am also assuming that if there happens to be more than one target_var defined in the probed function, the translator is smart enough to find the active one and pass it to c_function.

The syntax and behavior of target value accesses has not been built yet, but of course the translator should perform tie-breaking or print warnings/errors.
But all this is independent of how those values are used - that is
whether or not they are passed to some C function.  Some ordinary
script code will want to extract values this way, so the "C tapset"
angle is not relevant.

As I mentioned above, I am trying to see if we can have one common method for both script and "C". I am leaning towards the proposal that defining a probe point is always done using the systemtap language, for any type of tapset. The second point is that systemtap always generates the handler code for all types of tapsets, and users don't write the handler directly themselves. The only difference between script-based and "C"-based tapsets is this: for a script-based tapset, the entire probe handler code is generated by the systemtap translator. For a "C" tapset, systemtap generates some initial probe handler code to get access to the local variables needed by the "C" function, then the call to the "C" function from the handler, and then the rest of the handler code to do any post-processing of the data. With this we still meet the need to call "C" kernel functions, and hopefully we can convince the kernel developers to write and maintain these "C" functions. However, we still have to define the API these "C" functions use to send the data back, and we have to address the issues of packing and unpacking the data that I mentioned above.

The following is the skeleton of the handler for a "SCRIPT" tapset.

handler_script()
{
    generated code to get the data requested by the end-user script;
    generated code to do any post-processing;
}

The following is the skeleton of the handler for a "C" tapset.

handler_c()
{
    generated code to get access to local variables local1, local2, ..;
    kernel_supplied_c_function (local1, local2);
    generated code to extract the data exported by kernel_supplied_c_function;
    generated code to do any post-processing;
}
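As a rough user-space illustration of the "C" tapset skeleton, the four steps might look as follows. Everything here is an assumption made for the sketch: the extra out/out_size parameters, the sum computed by the stand-in function, and the total_exported counter used as "post-processing" are invented, since the real API for sending data back is still undefined.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for post-processing state the generated code might keep. */
static long total_exported;

/* Stand-in for a function a kernel developer would supply. It writes
 * its single result into a caller-provided void* buffer. */
static void kernel_supplied_c_function(int local1, int local2,
                                       void *out, size_t out_size)
{
    assert(out != NULL && out_size >= sizeof(long));
    *(long *)out = (long)local1 + (long)local2;
}

/* Stand-in for the translator-generated handler. The parameters play
 * the role of locals already fetched from the probed context. */
static long handler_c(int local1, int local2)
{
    long exported = 0;

    /* Call into the kernel-supplied function. */
    kernel_supplied_c_function(local1, local2, &exported, sizeof exported);

    /* Extract the exported value and post-process it. */
    total_exported += exported;
    return exported;
}
```

The point of the split is that only kernel_supplied_c_function would be hand-written; the surrounding handler is what the translator would be expected to emit.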

[...] This still doesn't address one of the questions you have raised earlier which is "what is the guarantee that $target_var is still valid with the changing versions of the kernel?"


I don't recall raising this as a question.  The translator will have
to search for $target_var for every compilation run.  If the variable
exists for the selected kernel at the context of the given probe
point, then it will work.

I still don't see why this comes up as a distinguishing factor.  Even
in your scheme, you wanted the translator to be able to pass to your C
functions selected target-side variables.  So the code for finding
those variables is the same.

I didn't say this is a distinguishing factor between my scheme and your scheme; all I am saying is that we have to address that issue. I guess you have answered the above by saying that the translator will do the type checking between the data passed from the script and what the kernel debug information says.

Questions: If c_function wants to export more than one variable, how does it do so, and who allocates memory for those return values?

Jim also raised this. Depending on actual needs of instrumentation authors/users, it may be sufficient to support only a very limited variety of signatures and copying/allocation conventions.

I understand we may not know all the APIs we have to support, but we have to at least start with a few and expand from there depending on the needs.

For example, it might be possible to let these functions write to systemtap globals through the runtime API.

One of the goals I was trying to meet with the "C" tapsets is that kernel developers should be able to write the "C" functions and compile them independently of systemtap. To meet that goal, your runtime API approach above requires that the API be available in the kernel, which is fine, but it is something we have to be aware of, as it may not be easy to convince kernel maintainers to accept it. This again brings up the issue of packing on the "C" function side and unpacking on the systemtap side.

Let us say c_function is expecting target_var to be of type int, but due to kernel changes target_var has now become char. How does the translator know to raise an error instead of passing the wrong type? In my proposal, type information was part of the registration, hence we could verify.
But the "we" who does the verification in the latter case is the same
translator that figures out the type of the target variable!  If some
type combination becomes invalid, then the translator's type-checking
logic will start signalling errors in either case.
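The kind of check being discussed can be sketched as a small lookup-and-compare routine. This is a toy model, not the translator's actual logic: struct var_info stands in for whatever the kernel debug information provides, and all names (var_type, check_result, check_target_var) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum var_type { VAR_INT, VAR_CHAR, VAR_LONG };
enum check_result { CHECK_OK, CHECK_MISSING, CHECK_MISMATCH };

/* Toy stand-in for one variable described by debug information. */
struct var_info {
    const char *name;
    enum var_type type;
};

/* Look up `name` among the variables visible at the probe point and
 * compare its type with the type the consumer expects. */
static enum check_result check_target_var(const struct var_info *vars,
                                          size_t nvars, const char *name,
                                          enum var_type expected)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < nvars; i++) {
        if (strcmp(vars[i].name, name) == 0)
            return vars[i].type == expected ? CHECK_OK : CHECK_MISMATCH;
    }
    return CHECK_MISSING;
}
```

Whether the expected type comes from a registration record or from the signature of the C function, the comparison itself is the same, and it has to be rerun against the debug information of each kernel being targeted.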
In the above, the only difference between a probe point definition and its usage or reference is that one has an alias and the other doesn't. I am not very happy with using aliases to differentiate between the two [...]
If others are also unhappy with overloading the "probe" keyword to define
both probes proper as well as aliases, then we can pick a new keyword for
the latter.  Like "alias foo.bar = baz { ... }".

I think Jim also mentioned the same overloading problem in his reply. I would like to see something like the syntax export kernel.function("sys_read") for defining the probe point. Of course, we can always alias that to whatever we want users to refer to it as; in this example, another statement could say alias kernel.syscall("read") = kernel.function("sys_read"). If one wants to define an alias and the probe point at the same time, we can allow syntax like export kernel.syscall("read") = kernel.function("sys_read"), just like now.

[...] Based on your above response, you seem to agree with the basic concept that a tapset (both "C" and script) is a pair of a probe point and a tapset function; please let me know if this is not true.
I don't know what this means.  Rather than defining the grand term
"tapset", we can just focus on particular extension mechanisms.  There
are two or three already implemented or planned:
- automatic inclusion of scripts from a library, to satisfy undefined references in end-user scripts

I think these are kind of auxiliary or helper functions, in my view they don't really belong in a tapset. I would consider this as just a library of scripts.

- probe aliases

These, if I understood correctly, define a probe point and also associated code that provides some data valid in that context; I would consider this a tapset.

- access to target values

I am assuming you are referring to accessing local variables and arguments based on the generated code, with the help of the elf libraries. I wouldn't consider this by itself a tapset. It is just a mechanism the translator uses to generate the handler code.

I believe a lot can be accomplished by these alone.  But there is wide
suspicion that we will also need:
- ability to explicitly call into a C function

You are proposing another alternative or additive:

- ability to implicitly call into a C function, to satisfy references to variables in scripts
I don't see much point defining a "tapset" artifact as code that
necessarily uses all/any of (say) mechanisms #1, #3, and #5.  Let's
instead simply treat the extension mechanisms individually, and
consider issues with their natural composition.

I disagree with you that we don't need to define what a tapset is. I personally think we, the developers of systemtap, are not experts in all the kernel areas; hence we need to get the help of various subsystem experts to write effective instrumentation. If you agree with that, then we need to publish a document that describes what they have to do to write an instrumentation scheme. In the methods you stated above, I don't see any clear definition of what the instrumentation writers have to do.

I think we should restrict the term tapset to providing instrumentation only. What I mean by instrumentation is a probe point definition and an associated script/function that exports relevant data at that probe point. I think mixing all the above concepts, like internal features and auxiliary functions, into the tapset doesn't help in specifying what tapset authors have to do.

I would go one step further and say: if you are an expert in an area and you would like to export some data variables that describe the inner workings of your subsystem, here is what you need to do to write a tapset:

1) Define a probe point.
2) Write a piece of code; let us call it a Handler Statement Block (HSB). HSBs are used in generating the kprobe handler when needed, but an HSB itself is not the kprobe handler. HSBs can be written in the systemtap scripting language only, or they can also make calls to "C" functions.

Here is an example of a tapset function, written entirely in the systemtap scripting language, that exports the arguments of the read system call:

kernel.function("sys_read")
{
    file_descriptor = $fd;
    byte_count = $count;
}

Here is the same example, which exports one additional value using a C function:

kernel.function("sys_read")
{
    file_descriptor = $fd;
    byte_count = $count;
    filename = get_filename_from_fd($fd);
}

Of course, as I mentioned above, we still have to sort out the details of how we get multiple return values from the "C" function.
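One possible convention for the open question of multiple return values is sketched below: the caller (here, what would be generated handler code) allocates a fixed-size record and the "C" function fills it in, returning a status code. This is only one option under assumed names; struct read_info, export_read_info and the "fd:%d" placeholder string are invented for illustration and do not perform any real filename lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record holding every value exported at the probe. */
struct read_info {
    int fd;
    size_t count;
    char filename[8];   /* deliberately small to show truncation */
};

/* Fill a caller-allocated record. Returns 0 on success and -1 if the
 * filename placeholder does not fit, so nothing is allocated here. */
static int export_read_info(int fd, size_t count, struct read_info *out)
{
    int n;

    assert(out != NULL);
    out->fd = fd;
    out->count = count;
    /* Placeholder text only; a real function would resolve the name. */
    n = snprintf(out->filename, sizeof out->filename, "fd:%d", fd);
    if (n < 0 || (size_t)n >= sizeof out->filename)
        return -1;
    return 0;
}
```

Because the caller owns the storage, the function needs no allocator and its results have a bounded size, at the cost of fixing the maximum string length in the record definition.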

- FChE

Follow-Ups:
- Re: What is a tapset?
  - From: Frank Ch. Eigler

References:
- What is a tapset?
  - From: Vara Prasad
- Re: What is a tapset?
  - From: Frank Ch. Eigler
- Re: What is a tapset?
  - From: Vara Prasad
- Re: What is a tapset?
  - From: Frank Ch. Eigler
