


Re: [RFA] sh-sim: free up some room in jump_table


> Would you be willing to specify a performance test that I can
> use, and a test criterion for me to meet?  It might save time,
> given that we seem to have a 24-hour email cycle.

P.S.: I think arith-rand.c should be compiled with optimization (-O2)
to avoid too much of a skew towards memory operations.

The simulator has an ACE_FAST compile-time setting to remove cycle-counting
overhead.  It is really only this stripped-down functionality
that we need for compiler correctness regression tests, while the
cycle counts could conceivably be used for future optimizer-quality
regression tests...
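
For illustration only, the stripped-down mode can be modelled with a
guard macro along these lines; apart from ACE_FAST itself, every name
here is made up and is not the sim's actual code:

  #include <stdio.h>

  static unsigned long total_cycles;

  #ifdef ACE_FAST
  # define CHARGE_CYCLES(n) ((void) 0)            /* accounting compiled out */
  #else
  # define CHARGE_CYCLES(n) (total_cycles += (n)) /* per-insn cycle count */
  #endif

  int
  main (void)
  {
    CHARGE_CYCLES (2);  /* e.g. charging a two-cycle instruction */
    printf ("cycles: %lu\n", total_cycles);
    return 0;
  }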

I expect the effect of code rearrangement to be different for SH2, SH3
and SH4 because of the way the availability of a barrel shifter / floating
point hardware affects the implementation of division.
I don't expect endianness to matter for this benchmark at the level
of instruction mix, although it will matter for what the host has
to do to implement byte accesses.  I don't expect the current implementation
to show appreciable differences depending on endianness, although it probably
makes sense to verify this assumption once.
And if you change the handling of bi-endianness, that might also
change how sensitive the timings are to endianness.
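
To make concrete what the host has to do for byte accesses, here is one
common technique, sketched under the assumption that target memory is
kept as host-order 32-bit words; it is not necessarily what the SH sim
actually does:

  #include <stdint.h>

  /* If target and host endianness differ, a target byte access has to
     compensate; XORing the low address bits does that without a branch.  */
  static inline uint8_t
  read_target_byte (const uint8_t *mem, uint32_t addr, int endian_mismatch)
  {
    return mem[addr ^ (endian_mismatch ? 3 : 0)];
  }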

The actual goal is to avoid a regression in testing time for
a fully-multilibbed toolchain, i.e. one that tests all optimization levels,
for eleven combinations of cpu type and endianness, for C and C++,
and possibly also for objc and fortran.  The trouble is that we have
a lot of context switching, so in addition to a long execution time
there will be a lot of noise, which means you'd have to run the test
several times on a quiet machine to get a reliable median.
This is why I've looked for something simpler to benchmark.
I've picked arith-rand because with the right iteration count setting
(which AFAIR was the default back then), it has an execution time long
enough that variations below one percent could still be accurately measured,
and it has a mix of loops, variable access and arithmetic, at a much more
reasonable scale than Dhrystone.  Besides, arith-rand actually accounted
for a few minutes of the c-torture execution time...
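
A tiny measurement harness along these lines would do for the
several-runs-and-take-the-median approach (a hypothetical sketch, not
taken from any tree; the run count is arbitrary):

  #include <stdio.h>
  #include <stdlib.h>
  #include <time.h>

  #define RUNS 9

  static int
  cmp_double (const void *a, const void *b)
  {
    double x = *(const double *) a, y = *(const double *) b;
    return (x > y) - (x < y);
  }

  int
  main (int argc, char **argv)
  {
    double t[RUNS];
    if (argc < 2)
      return 1;
    for (int i = 0; i < RUNS; i++)
      {
        struct timespec t0, t1;
        clock_gettime (CLOCK_MONOTONIC, &t0);
        if (system (argv[1]) != 0)      /* e.g. the simulator invocation */
          return 1;
        clock_gettime (CLOCK_MONOTONIC, &t1);
        t[i] = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
      }
    qsort (t, RUNS, sizeof t[0], cmp_double);
    printf ("median: %.3f s\n", t[RUNS / 2]);  /* middle of the sorted runs */
    return 0;
  }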

So, although the *real* benchmark is the gcc testsuite, I think
it's much more manageable to use a smaller model, like arith-rand.c at -O2.
If you have an idea for a better model (or enough CPU time to throw at
testing to do an exact testing-time regression test :-), that would
be welcome too, of course.
As to the weighting of the execution times, I think it makes sense to
base it on the number of multilibs in the test gauntlet.
SH1 and SH2 together account for three of the multilibs (there is no
little-endian SH1), SH3E accounts for two, and SH4 for six (three ABIs
times two endiannesses).
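
Spelled out as arithmetic (the timings are placeholders, purely
illustrative):

  #include <stdio.h>

  int
  main (void)
  {
    /* Placeholder per-configuration timings, in seconds.  */
    double t_sh1 = 1.0, t_sh2 = 1.0, t_sh3e = 1.0, t_sh4 = 1.0;

    /* Weights follow the multilib counts above: SH1 big endian only (1),
       SH2 in both endiannesses (2), SH3E in both (2), and SH4 with three
       ABIs times two endiannesses (6) -- eleven multilibs in all.  */
    double score = (1.0 * t_sh1 + 2.0 * t_sh2 + 2.0 * t_sh3e + 6.0 * t_sh4)
                   / 11.0;

    printf ("weighted mean: %.3f s\n", score);
    return 0;
  }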

