This is the mail archive of the
libc-alpha@sources.redhat.com
mailing list for the glibc project.
Re: Retry mechanism w/ DNS Format Error?
On Wed, 5 Jan 2005, Ulrich Drepper wrote:
Pekka Savola wrote:
Now, apparently the RHL9 resolver code cannot cope with FormErr code
quickly, but has to resort to timeouts. (I've been unable to verify this
on a newer glibc because it no longer creates the bitstring queries,
Sure it does, if you tell it to. Set the ip6-bytestring option in resolv.conf.
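For reference, a resolv.conf along these lines would turn the option on. This is only a sketch: the three nameserver addresses are read off the packet trace further down in this message, and their order in the file is an assumption based on the order in which they were queried.

```
# /etc/resolv.conf (illustrative; addresses taken from the trace in this message)
options ip6-bytestring
nameserver 2001:708:10:41::1
nameserver 193.166.4.206
nameserver 193.166.4.177
```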
OK, thanks. I ran a test on a newer glibc, glibc-2.3.4-2.fc3, on much
the same setup with the 'ip6-bytestring' option on. There are three
nameservers in resolv.conf; the first two return FormErr and
the last one returns ServFail:
14:58:11.604156 2001:708:10:40:201:2ff:fe10:85f8.32770 > 2001:708:10:41::1.domain: 21734+ PTR? \[x20010708001000400207e9fffe7b0259/128].ip6.arpa. (44)
14:58:11.604926 2001:708:10:41::1.domain > 2001:708:10:40:201:2ff:fe10:85f8.32770: 21734 FormErr- [0q] 0/0/0 (12)
14:58:16.603648 IP 193.166.4.143.32771 > 193.166.4.206.domain: 21734+ PTR? \[x20010708001000400207e9fffe7b0259/128].ip6.arpa. (44)
14:58:16.604178 IP 193.166.4.206.domain > 193.166.4.143.32771: 21734 FormErr- [0q] 0/0/0 (12)
14:58:21.604036 IP 193.166.4.143.32772 > 193.166.4.177.domain: 21734+ PTR? \[x20010708001000400207e9fffe7b0259/128].ip6.arpa. (44)
14:58:21.604891 IP 193.166.4.177.domain > 193.166.4.143.32772: 21734 ServFail 0/0/0 (44)
14:58:21.605090 2001:708:10:40:201:2ff:fe10:85f8.32772 > 2001:708:10:41::1.domain: 21734+ PTR? \[x20010708001000400207e9fffe7b0259/128].ip6.arpa. (44)
14:58:21.605668 2001:708:10:41::1.domain > 2001:708:10:40:201:2ff:fe10:85f8.32772: 21734 FormErr- [0q] 0/0/0 (12)
14:58:26.604272 IP 193.166.4.143.32773 > 193.166.4.206.domain: 21734+ PTR? \[x20010708001000400207e9fffe7b0259/128].ip6.arpa. (44)
14:58:26.604754 IP 193.166.4.206.domain > 193.166.4.143.32773: 21734 FormErr- [0q] 0/0/0 (12)
14:58:31.603590 IP 193.166.4.143.32774 > 193.166.4.177.domain: 21734+ PTR? \[x20010708001000400207e9fffe7b0259/128].ip6.arpa. (44)
14:58:31.604285 IP 193.166.4.177.domain > 193.166.4.143.32774: 21734 ServFail 0/0/0 (44)
14:58:31.604440 2001:708:10:40:201:2ff:fe10:85f8.32774 > 2001:708:10:41::1.domain: 12166+ PTR? 9.5.2.0.b.7.e.f.f.f.9.e.7.0.2.0.0.4.0.0.0.1.0.0.8.0.7.0.1.0.0.2.ip6.arpa. (90)
14:58:31.608130 2001:708:10:41::1.domain > 2001:708:10:40:201:2ff:fe10:85f8.32774: 12166 1/2/2 PTR haukka.ipv6.csc.fi. (204)
[...]
The resolver's behavior seems to be:
1) go to the next server (after a 5-second wait following each FormErr)
until receiving a reply or an error like ServFail
2) upon receiving a reply or error that glibc can process, start all
over again -- ask the same servers for the same QNAME again.
3) upon the second reception of ServFail, give up, and try the next QNAME
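The three steps above can be modelled with a small simulation. This is a rough sketch of the behavior inferred from the trace, not glibc's actual code: the function name `observed_lookup`, the 5-second timeout (read off the timestamps) and the two passes are all assumptions.

```python
FORMERR, SERVFAIL, OK = "FormErr", "ServFail", "OK"

def observed_lookup(replies, timeout=5, passes=2):
    """Model one QNAME lookup; return (seconds waited, queries sent).

    `replies` lists the rcode each configured nameserver returns, in
    resolv.conf order. Hypothetical model, inferred from the trace.
    """
    elapsed = 0
    queries = 0
    for _ in range(passes):
        for rcode in replies:
            queries += 1
            if rcode == OK:
                return elapsed, queries   # usable answer: done
            if rcode == SERVFAIL:
                break                     # processed error: end this pass
            # FormErr is not acted upon, so the full timeout is waited
            # out before the next server is tried.
            elapsed += timeout
    return elapsed, queries               # give up, move on to next QNAME

# Two FormErr servers followed by one ServFail server, as in the trace:
print(observed_lookup([FORMERR, FORMERR, SERVFAIL]))  # (20, 6)
```

This matches the trace: six bitstring queries between 14:58:11 and 14:58:31, i.e. 20 seconds spent before the nibble-format QNAME is tried.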
I'm not sure what the appropriate replacement procedure would be, but
one way to fix this case would be to skip to the next server without
delay upon receiving FormErr, and not to cycle through the list of
nameservers twice. [It could be argued that doing it twice might make
sense if you only have one name server configured and the format error
was due to packet corruption on the wire, but IMHO we shouldn't be
doing this because it's commonly agreed that having multiple name
servers is a good idea.]
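A minimal sketch of that suggestion, for comparison with the trace: an error rcode moves on to the next server immediately, and the server list is walked only once. The function name `proposed_lookup` and the use of `None` for "no reply at all" are made up for illustration, and treating ServFail the same way as FormErr is an assumption; this is not a patch against the real resolver.

```python
FORMERR, SERVFAIL, OK = "FormErr", "ServFail", "OK"

def proposed_lookup(replies, timeout=5):
    """Model one QNAME lookup; return (seconds waited, queries sent).

    `replies` lists what each configured nameserver returns, in order;
    None stands for a server that does not answer at all.
    """
    elapsed = 0
    queries = 0
    for rcode in replies:          # single pass over the server list
        queries += 1
        if rcode == OK:
            break                  # usable answer: done
        if rcode is None:
            elapsed += timeout     # only a silent server costs a timeout
        # FormErr / ServFail: skip to the next server without delay
    return elapsed, queries

# Same three servers as in the trace above:
print(proposed_lookup([FORMERR, FORMERR, SERVFAIL]))  # (0, 3)
```

Under these assumptions the bitstring lookup fails after three queries and no waiting, instead of six queries and 20 seconds, before the nibble-format QNAME is tried.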
And please file any such reply in bugzilla, else it might get lost.
Sorry, I didn't quite understand what you expected me to file in
bugzilla.
--
Pekka Savola "You each name yourselves king, yet the
Netcore Oy kingdom bleeds."
Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings