This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: Improving libm-test.inc structure and maintenance
- From: "Joseph S. Myers" <joseph at codesourcery dot com>
- To: Ondřej Bílka <neleai at seznam dot cz>
- Cc: <libc-alpha at sourceware dot org>
- Date: Thu, 9 May 2013 12:18:54 +0000
- Subject: Re: Improving libm-test.inc structure and maintenance
- References: <Pine dot LNX dot 4 dot 64 dot 1305022244550 dot 12072 at digraph dot polyomino dot org dot uk> <20130505130414 dot GA18328 at domone dot kolej dot mff dot cuni dot cz> <Pine dot LNX dot 4 dot 64 dot 1305051340490 dot 16386 at digraph dot polyomino dot org dot uk> <20130505165832 dot GA30896 at domone> <Pine dot LNX dot 4 dot 64 dot 1305052006330 dot 16386 at digraph dot polyomino dot org dot uk> <20130509095721 dot GA28753 at domone dot kolej dot mff dot cuni dot cz>
On Thu, 9 May 2013, Ondrej Bilka wrote:
> On Sun, May 05, 2013 at 08:23:12PM +0000, Joseph S. Myers wrote:
> > On Sun, 5 May 2013, Ondrej Bilka wrote:
> >
> > > You do not have to review if you do the following:
> >
> > Tools may be able to use various heuristics to reduce the number of cases
> > presented for human review. That human review is still needed to ensure
> > good, valid bug reports. (Note that Jakub found various bugs in MPFR in
> > his random fma testing. You need to decide what component the bug is in
> > before reporting it.)
>
> Depends on what is found. If it finds only 10 cases in a year then
> filtering is not necessary. My main concern is that when testing finds
> a new bug (which can be a needle in a haystack of existing bugs), everybody
> has forgotten that it took place and did not read the logs. Some notification
> system is necessary.
Frankly, we have more need right now - much more need - for people working
on fixing bugs than for systems detecting and filing new bugs that have
not affected any human enough for them to file the bugs. I'd urge working
on fixes for existing bugs in libm or any other part of glibc over new
bug-finding systems, until the number of open bugs is much smaller than at
present.
Few people have been interested in joining me in the patch-a-day goal,
with a reasonable proportion of those patches being bug fixes, for
improving glibc and dealing with the backlog of known issues. Recruit ten
more people who actively and accurately triage new bugs on a day-by-day
basis and work daily on fixing bugs, and your approach of more automatic
reporting to glibc Bugzilla may become more feasible. Without those
people, it's likely to be harmful rather than helpful to glibc development
- even if the new bugs are in fact valid and not duplicates.
Given the extremely limited resources presently spent on bug fixing and
triage, it's important to ensure new bugs reported are of high quality so
those resources are productively spent improving glibc rather than dealing
with poor-quality, incorrect or duplicative bug reports.
> Bugzilla is the best place for notification. The second alternative is to
> send mail, which has a higher probability of being ignored.
Any automatic tester should notify *the person running the tester*. That
person should then take responsibility for understanding the notifications
and filing reports by hand in glibc Bugzilla where there
are genuinely new bugs. It's the responsibility of the person running the
tester to deal with notifications or to find someone to do so, rather than
dumping them directly into Bugzilla without human review. If you don't
have the human resources to review the output of your system and produce
good human bug reports from it, then at most put information on an
external site and a link on the wiki to where people can find those
external reports if they wish to look for new glibc bugs among them - but
it will probably be largely ignored because there are too many *human* bug
reports for the present level of work on bug fixing, even without new
sources of potential bugs.
> > I'm thinking more on the lines of John Regehr's testing of compilers with
> > Csmith. Reporting one bug doesn't wait on other bugs being fixed if it
> > looks to a human that they are different. Failures appearing in different
> > functions may have the same underlying cause, while failures in the same
> > function may have different causes - that's something a human can judge.
> >
> In libm, functions are mostly standalone; the same underlying cause can
> occur only through a pattern that is repeated in the code. Then having a
> list of affected functions is handy.
>
> I do not quite follow how you use testing with Csmith. Generate random
> expressions and look at how functions behave?
See the bugs he's reported to GCC Bugzilla over the years - human bug
reports, with reduced testcases - and his blog, and the papers he's
published about finding bugs through random testing.
Before working on finding glibc bugs through such random testing, it would
be a very good idea to (a) study the existing literature in the area -
such work should be considered as much a piece of potentially publishable
research, as a direct contribution to glibc, and should be approached
accordingly - and (b) pay close attention to what the people who
actually fix the kind of bugs you hope to find say is useful when
reporting them, rather than starting from external
assumptions about how you would like to handle reporting bugs, just as
John Regehr has paid attention to reporting bugs in ways that are useful
to the projects to which he reports them (rather than just dumping the
original large, unreduced and unreviewed tests into Bugzilla, for
example).
> > I think automatic bug filing is always a bad idea - an automatic process
> > may produce a list of *candidate* issues, tracked however is convenient,
> > but the human should be in the loop before any such candidate issue
> > becomes an actual bug report in glibc Bugzilla, not just after.
> >
> What about adding a separate state, for example GENERATED, that will not
> show unless asked for?
In the absence of more bug triagers and fixers, a completely separate
tracking system should be used for automatically-generated candidate
issues like this; they should not go into glibc Bugzilla until a human has
reviewed them and decided they are genuine and new glibc bugs. Again, get
more people
working on bug fixing and triage, and the appropriate approaches may
change, but get the extra people contributing *first* before dumping
lower-quality bugs in Bugzilla.
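
As a minimal sketch of that separation (in Python; every class, field and
name below is hypothetical and not part of any existing glibc or Bugzilla
tooling), a tester could keep its findings in its own queue, where nothing
becomes eligible for filing until a named human has reviewed it:

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Status(Enum):
    CANDIDATE = "candidate"   # produced by the tester, not yet reviewed
    REJECTED = "rejected"     # reviewer judged it invalid or a duplicate
    CONFIRMED = "confirmed"   # reviewer judged it a genuine, new bug


@dataclass
class Candidate:
    function: str             # name of the function under test
    testcase: str             # reduced input that shows the failure
    status: Status = Status.CANDIDATE
    reviewer: Optional[str] = None


@dataclass
class CandidateQueue:
    """Hypothetical tracker kept outside the project's bug database."""
    items: List[Candidate] = field(default_factory=list)

    def add(self, function: str, testcase: str) -> Candidate:
        # The automatic tester may only ever create CANDIDATE entries.
        candidate = Candidate(function, testcase)
        self.items.append(candidate)
        return candidate

    def review(self, candidate: Candidate, reviewer: str, genuine: bool) -> None:
        # A status change always records which human made the decision.
        if not reviewer:
            raise ValueError("a named human reviewer is required")
        candidate.reviewer = reviewer
        candidate.status = Status.CONFIRMED if genuine else Status.REJECTED

    def ready_to_file(self) -> List[Candidate]:
        # Only human-confirmed entries are eligible to become bug reports.
        return [c for c in self.items if c.status is Status.CONFIRMED]
```

The design point is only that the automatic side has no path from
CANDIDATE to a filed report; the reviewer still writes the actual report.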
> > Automatic closing of bugs is also a bad idea; a human needs to judge
> > whether the whole issue is genuinely fixed or whether the commit only
> > fixes particular cases and other parts of the same issue remain to fix.
> >
> A test that tests only particular cases is an inadequate test. You cannot
> decide if an issue is fixed with tests that are green before and green
> after. You also do not reliably know if a regression happened. Closing
> the bug is a good way to fix it and make human add additional necessary data.
Automatic systems are there as the servant of humans, not their master.
"make human add" is fundamentally the wrong idea. If no-one is paying
attention on a particular day when a computer detects that an issue might
be fixed (given that the issue was reported / reviewed as valid by a human
in the first place), the issue should remain open until someone is looking
at it and can review the notification; it should not be quietly closed
without that review. With extra bug reviewers, waiting for human review
is not a burden here. Without extra bug reviewers to notice errors,
closing a bug when it may not be properly fixed is actively destructive
and harmful to glibc.
It's impossible in advance to write a test that covers all cases, because
until the issue has been analyzed and fixed you don't know how many
instances of the issue appear in different places in the code, but it is
possible to write one that covers at least one failing case, with the
understanding that a human will need to check when it starts to pass and
decide if the issue is really fully fixed.
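
A rough sketch of that idea (in Python; the names are hypothetical, and
this is not how the glibc testsuite itself is organized): a test tied to
an open bug is expected to fail, and when it starts to pass the result is
flagged for human review instead of the bug being closed.

```python
from enum import Enum
from typing import Callable, Optional


class Outcome(Enum):
    PASS = "pass"                    # no known bug, check passed
    FAIL = "fail"                    # no known bug, check failed
    XFAIL = "expected failure"       # known open bug, still failing
    NEEDS_REVIEW = "needs review"    # known open bug, check now passes


def classify(passed: bool, known_bug: Optional[str]) -> Outcome:
    """Map a raw result to an outcome, given an optional open bug ID."""
    if known_bug is None:
        return Outcome.PASS if passed else Outcome.FAIL
    # A passing check for a known bug covers only the cases tested,
    # so it asks for a human decision rather than closing anything.
    return Outcome.NEEDS_REVIEW if passed else Outcome.XFAIL


def run_case(check: Callable[[], bool],
             known_bug: Optional[str] = None) -> Outcome:
    """Run one check; an exception counts as a failure."""
    try:
        passed = bool(check())
    except Exception:
        passed = False
    return classify(passed, known_bug)


if __name__ == "__main__":
    # "hypothetical-bug-1" is a placeholder, not a real Bugzilla ID.
    outcome = run_case(lambda: 1 + 1 == 2, known_bug="hypothetical-bug-1")
    print(outcome.value)
```

Nothing here changes the state of the bug; NEEDS_REVIEW is just a
notification for whoever is running the tests.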
> I plan to write something like this but currently do not have that much time.
> I added it to my TODO list and will probably look at it during the freeze.
>
> Everybody would be welcome to join. What are the options for where to host it?
I suggest Savannah for GNU-related free software projects. But as above,
I advise (a) fixing existing bugs as higher priority than systems to find
new ones; (b) understanding what people who have gone through and fixed
hundreds of bugs in Bugzilla actually find useful and working based on
that experience to optimize things for the people who fix bugs rather than
optimizing for the person running an automatic system to find them; (c)
understanding the existing literature and experiences with random testing,
with a view to possibly making a publishable contribution to that
literature.
If you do not have that much time, any one bug fix is a valuable
contribution to glibc and likely to be much more practical than starting a
substantial research project on random testing. So is triage of existing
bugs to identify if they are valid, non-duplicative and still applicable
to current glibc.
--
Joseph S. Myers
joseph@codesourcery.com