This is the mail archive of the
systemtap@sourceware.org
mailing list for the systemtap project.
RE: sort a foreach on a stat value?
- From: "Stone, Joshua I" <joshua dot i dot stone at intel dot com>
- To: <fche at redhat dot com>
- Cc: <systemtap at sources dot redhat dot com>
- Date: Fri, 13 Jan 2006 00:14:33 -0800
- Subject: RE: sort a foreach on a stat value?
fche@redhat.com wrote:
>> Do we have a syntax to sort a foreach based on a stat value? [...]
>
> Not at this time.
One thing I've noticed is that our foreach syntax has different
semantics than other languages, and this might be hampering us a bit.
In at least perl and bash, the only iterator variable is the value, but
our iterator variables are keys. This prevents us from doing something
like:
foreach (cnt- in @count(stat)) {...}
But there is value in knowing the keys as well. Perhaps what we need is
to allow specifying a value iterator as well, so we can have:
foreach ([key1, key2, value+] in mymap) {...}
... which also avoids the cost of re-indexing the map. And perhaps
with stats, it could be something like:
foreach ([tid, c=@count-, a=@avg++, h=@hist_log] in mystats)
{...}
For more semantic clarity, perhaps a semicolon would divide the key
iterators from the value/stat iterators. And of course, sorting by a
histogram should not be allowed.
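As a rough model of what I mean (plain Python rather than SystemTap, since
this syntax doesn't exist; iterate_with_value and the sample data are made
up for illustration), the loop would hand back the keys and the value
together, already ordered by the value:

```python
# Hypothetical model of: foreach ([key1, key2, value+] in mymap) {...}
# Not SystemTap; just the intended semantics.

def iterate_with_value(mymap, descending=False):
    """Yield (key1, key2, value) tuples ordered by value.

    The value comes along with the keys, so the loop body never
    has to index the map again.
    """
    items = sorted(mymap.items(), key=lambda kv: kv[1], reverse=descending)
    for (key1, key2), value in items:
        yield key1, key2, value


# Made-up sample data: (pid, tid) -> count
mymap = {(1, 10): 7, (1, 11): 2, (2, 20): 5}

rows = list(iterate_with_value(mymap))  # "value+" means ascending
for key1, key2, value in rows:
    print(key1, key2, value)
```

The stat case would work the same way, with the sort key being whichever
extracted stat carries the +/- suffix.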
Just some ideas, comments encouraged...
> Since the reporting phase does not need to be as "scalable" (fast) as
> the accumulation phase, perhaps one could do this thusly today:
>
> global stat, stat_counts
> probe my.event { stat[tid()]<<<1 }
> probe end {
>   foreach (tid in stat) // sort by value -> ???
>     stat_counts[tid] = @count(stat[tid])
>   foreach (tid in stat_counts-)
>     printf("%d: %d\n", tid, stat_counts[tid]) # and/or @avg(stat[tid])) etc.
> }
This is a passable workaround, yes. The downside is that if stat were
very large, I would have to fudge the MAXACTION limit. If I were only
interested in, say, the top 20 entries, then with a single loop construct
it's easy to break out after 20 and never hit the MAXACTION boundary.
With two loops, I have to iterate over the entire map the first time.
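To put rough numbers on that, here is a toy Python model (not SystemTap).
The one-unit-per-iteration cost is my own simplification, the function
names are made up, and it ignores whatever the runtime spends on the sort
itself:

```python
# Rough model of the statement budget; not SystemTap, and the
# one-unit-per-iteration cost is an assumption for illustration.

def two_loop_cost(n_entries, top_n):
    """Copy every entry into a second map, then print the top_n."""
    copy_pass = n_entries                  # first loop touches everything
    report_pass = min(top_n, n_entries)    # second loop can break early
    return copy_pass + report_pass


def one_loop_cost(n_entries, top_n):
    """A single value-sorted loop that breaks after top_n entries."""
    return min(top_n, n_entries)


n_entries, top_n = 5000, 20
print(two_loop_cost(n_entries, top_n))  # grows with the size of the map
print(one_loop_cost(n_entries, top_n))  # bounded by top_n
```

The point is only that the first figure scales with the map size while
the second does not.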
>> Along the same lines, it would be extremely useful to be able to do
>> "cascading" sort - i.e. sort by more than one field.
>
> If the runtime provided such a facility, one might imagine the
> script syntax exposing it thusly:
> foreach ([x1+, x2--, y2+++] in array----) { ... }
> encoding the cascading order in the length of those +/- suffixes.
That's not a bad suggestion, though I think it's not obvious in which
order the cascading happens. Does + sort before ++? Or does a longer
suffix imply higher sorting precedence? I think you mean the former...
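The two readings really do give different orders. A small Python model
(not SystemTap; cascade_sort, its short_first flag, and the sample rows
are all invented for illustration) makes the difference visible:

```python
# Hypothetical model of: foreach ([x1+, x2--] in array) { ... }
# Each field carries a suffix; its length is the cascade rank.

def cascade_sort(rows, suffixes, short_first=True):
    """Sort rows (tuples) by several fields.

    suffixes: one string per field, e.g. ["+", "--"], or "" to skip.
    short_first=True  -> "+" is the primary sort key, "++" secondary.
    short_first=False -> the longest suffix is the primary sort key.
    """
    ranked = [(len(s), i, s[0] == "-") for i, s in enumerate(suffixes) if s]
    # Order fields from primary to least significant.
    ranked.sort(reverse=not short_first)
    result = list(rows)
    # Stable sort: apply the least significant field first.
    for _, index, descending in reversed(ranked):
        result.sort(key=lambda row: row[index], reverse=descending)
    return result


rows = [(1, 5), (2, 9), (1, 3), (2, 5)]
suffixes = ["+", "--"]   # x1 ascending, x2 descending

print(cascade_sort(rows, suffixes, short_first=True))
print(cascade_sort(rows, suffixes, short_first=False))
```

With the short suffix as the primary key the rows group by x1 first; with
the long suffix as primary they group by x2 first. Whichever reading is
chosen would need to be documented.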
Josh