This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.



RE: sort a foreach on a stat value?


fche@redhat.com wrote:
>> Do we have a syntax to sort a foreach based on a stat value?  [...]
>
> Not at this time.

One thing I've noticed is that our foreach syntax has different
semantics from that of other languages, and this might be hampering us
a bit.  In at least Perl and bash, the only iterator variable is the
value, but our iterator variables are keys.  This prevents us from doing
something like:

	foreach (cnt- in @count(stat)) {...}

But there is value in knowing the keys as well.  Perhaps what we need is
to allow specifying a value iterator as well, so we can have:

	foreach ([key1, key2, value+] in mymap) {...}

... which also avoids the cost of re-indexing the map.  And perhaps
with stats, it could be something like:

	foreach ([tid, c=@count-, a=@avg++, h=@hist_log] in mystats) {...}

For more semantic clarity, perhaps a semicolon would divide the key
iterators from the value/stat iterators.  And of course, sorting by a
histogram should not be allowed.

Just some ideas, comments encouraged...


> Since the reporting phase does not need to be as "scalable" (fast) as
> the accumulation phase, perhaps one could do this thusly today:
> 
>  global stat, stat_counts
>  probe my.event { stat[tid()]<<<1 }
>  probe end {
>    foreach (tid in stat) // sort by value -> ???
>      stat_counts[tid] = @count(stat[tid])
>    foreach (tid in stat_counts-)
>      printf("%d: %d\n", tid, stat_counts[tid]) # and/or @avg(stat[tid])) etc.
>  }

This is a passable workaround, yes.  The downside is that if stat were
very large, I would have to raise the MAXACTION limit.  If I were only
interested in, say, the top 20, then with a single sorted loop it's easy
to break out after 20 entries and never hit the MAXACTION boundary.
With two loops, I have to iterate over the entire map the first time.
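
To make the "top 20" point concrete, here is a rough Python model of a
single loop over a map sorted by value, breaking out early.  It only
illustrates how many times the loop body runs; it is not SystemTap code,
and the function name and sample data are made up for the example.

```python
def top_n_by_count(counts, n):
    """Return up to n (tid, count) pairs, highest count first.

    counts is a hypothetical stand-in for the stat array: it maps a
    thread id to the number of samples recorded for it.
    """
    result = []
    # Models iterating a map sorted by value, descending.
    by_count = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    for tid, cnt in by_count:
        if len(result) >= n:
            break  # the loop body does not run for the remaining entries
        result.append((tid, cnt))
    return result


sample_counts = {101: 7, 102: 42, 103: 3, 104: 19}  # made-up data
print(top_n_by_count(sample_counts, 2))
```

The loop body runs at most n times no matter how large the map is,
whereas the two-loop workaround runs a statement for every entry before
any sorting can happen.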


>> Along the same lines, it would be extremely useful to be able to do
>> "cascading" sort - i.e. sort by more than one field.
> 
> If the runtime provided such a facility, one might imagine the
> script syntax exposing it thusly:
>   foreach ([x1+, x2--, y2+++] in array----) { ... }
> encoding the cascading order in the length of those +/- suffixes.

That's not a bad suggestion, though I think it's not obvious in which
order the cascading happens.  Does + sort before ++?  Or does more
length imply higher sorting precedence?  I think you mean the former...
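
The ambiguity matters because the two readings give different orders.
A small Python sketch shows this, assuming each row is a tuple of fields
and that the sort spec lists fields from highest to lowest precedence;
the helper name and the data are hypothetical.

```python
def cascading_sort(rows, spec):
    """Sort rows by several fields.

    spec is a list of (field_index, descending) pairs, highest
    precedence first.  Python's sort is stable, so sorting by the
    lowest-precedence field first and the highest-precedence field last
    produces the cascading order.
    """
    ordered = list(rows)
    for field, descending in reversed(spec):
        ordered.sort(key=lambda row: row[field], reverse=descending)
    return ordered


rows = [(1, "b"), (2, "a"), (1, "a"), (2, "b")]  # made-up data

# Reading 1: field 0 ascending sorts first, field 1 descending breaks ties.
print(cascading_sort(rows, [(0, False), (1, True)]))
# Reading 2: field 1 descending sorts first, field 0 ascending breaks ties.
print(cascading_sort(rows, [(1, True), (0, False)]))
```

The same pair of per-field directions yields two different results, so
whatever syntax is chosen has to make the precedence explicit.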


Josh

