
towards zero FAIL for make installcheck


Hi,

Today I was happily surprised by the following test results:

Host: Linux toonder.wildebeest.org 3.1.2-1.fc16.x86_64 #1 SMP Tue Nov 22
09:00:57 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
Snapshot: version 1.7/0.152 commit release-1.6-522-g6ba4d29
GCC: 4.6.2 [gcc (GCC) 4.6.2 20111027 (Red Hat 4.6.2-1)]
Distro: Fedora release 16 (Verne)

# of expected passes            3124
# of unexpected failures        1
# of unexpected successes       8
# of expected failures          245
# of unknown successes          1
# of known failures             42
# of untested testcases         958
# of unsupported tests          2

So only one FAIL was reported for this run:
FAIL: semok/thirtynine.stp

and 8 unexpected successes were reported:
XPASS: semko/nodwf01.stp
XPASS: semko/nodwf02.stp
XPASS: semko/nodwf03.stp
XPASS: semko/nodwf04.stp
XPASS: semko/nodwf05.stp
XPASS: semko/nodwf06.stp
XPASS: semko/nodwf07.stp
XPASS: semko/nodwf09.stp

I haven't investigated these yet, but having only 9 results that could
be serious issues is a good thing (at least compared to a few weeks ago,
when we would have had tens of such issues).

In the unexpectedly good news there was also one unknown success:
KPASS: cast-scope-m32 (PRMS 13420)

I know this test does fail on some other architectures.

As you can see, the low number of FAILs (unexpected failures) is
offset by a high number of KFAILs (known failures) and UNTESTED
results (untested testcases). The idea behind this is that we only
want to see FAILs for things that used to PASS and, through some
regression, now start FAILing. So, if you hack a bit, run make
installcheck and see some FAILs, you know you should investigate them.

When you write new tests, or fix some old tests, please follow these
rough guidelines (a small example sketch follows the list):

- PASS (expected pass), use pass "<testname> <specifics>" to indicate
  that something worked as expected.
- FAIL (unexpected failure), use fail "<testname> <specifics>" to
  indicate that something unexpectedly did not work.
- XFAIL (expected failure), use xfail "<testname> <specifics>" or
  setup_xfail "<arch-triplet>" followed by a normal pass/fail, to
  indicate that something is expected to fail (so this is neither bad
  nor unexpected, but often it makes more sense to invert the test and
  just use PASS).
- XPASS (unexpected success), this is generated when you use setup_xfail
  "<arch-triplet>" and then the test results in a pass "<testname>
  <specifics>". This indicates a problem: the test was expected to
  XFAIL, but unexpectedly passed instead (so this is something bad and
  unexpected, and it should not happen).
- KFAIL (known failure), use kfail "<testname> <specifics>" or
  setup_kfail "<arch-triplet> <bug number>" followed by a normal
  pass/fail, to indicate that something is known to fail and has a
  corresponding bug number in the systemtap bugzilla on sourceware.org.
  (So this is something bad, but we know about it and should fix it;
   the bug report will contain more information about why it is
   currently failing.)
- KPASS (unknown success), this is generated when you use setup_kfail
  and then the test results in a pass. This indicates that a bug might
  have been fixed (or you were just lucky). Check the corresponding
  bug report to see if this test should really just pass now or whether
  it is just dumb luck this time.
- UNTESTED (untested testcase), use untested "<testname> <specifics>" to
  indicate that the test could have been run, but wasn't, because
  something needed to complete the test was missing (for example, a
  previous test on which this test depended failed).
- UNSUPPORTED (unsupported testcase), use unsupported "<testname>
  <specifics>" to indicate that the test just isn't supported on this
  setup (for example if it is for a syscall not available on this
  architecture).
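
To make the mapping from the result names to testsuite code a bit more
concrete, here is a small, made-up DejaGnu (.exp) sketch. The test name,
the $probe_compiled/$prereq_ok/$syscall_available variables, the
architecture triplet and the bug number are all hypothetical; only the
setup_kfail/pass/fail/untested/unsupported calls are the real DejaGnu
procs, used in the argument form described above:

  set test "mytest"   ;# hypothetical test name

  # Known failure on one architecture, tracked in bugzilla
  # (triplet and bug number made up):
  setup_kfail "x86_64-*-linux* 99999"
  if {$probe_compiled} {
      pass "$test compile"     ;# KPASS, because setup_kfail was armed
  } else {
      fail "$test compile"     ;# KFAIL, because setup_kfail was armed
  }

  # A test we could not run because a prerequisite was missing:
  if {!$prereq_ok} {
      untested "$test needs the previous compile to succeed"
      return
  }

  # A test that simply does not apply on this setup:
  if {!$syscall_available} {
      unsupported "$test syscall not available on this architecture"
  }

Without the setup_kfail line the first branch would be a plain PASS or
FAIL; setup_kfail only affects the next reported result.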

I hope these guidelines make sense and will help us spot regressions
earlier. If we can make sure that the number of unexpected failures is
zero (or very low), it will be much easier to be sure your patches are
correct (or at least don't introduce regressions).

Cheers,

Mark

