This is the mail archive of the libc-alpha@sources.redhat.com mailing list for the glibc project.



Re: Retry mechanism w/ DNS Format Error?


On Wed, 5 Jan 2005, Ulrich Drepper wrote:
Pekka Savola wrote:
Now, apparently the RHL9 resolver code cannot cope with FormErr code quickly, but has to resort to timeouts. (I've been unable to verify this on a newer glibc because it no longer creates the bitstring queries,

Sure it does, if you tell it to. Set the ip6-bytestring option in resolv.conf.

OK, thanks. I ran a test on a newer glibc, glibc-2.3.4-2.fc3, on a very similar setup with the 'ip6-bytestring' option on. There are three nameservers in resolv.conf; the first two return FormErr and the last one returns ServFail:


14:58:11.604156 2001:708:10:40:201:2ff:fe10:85f8.32770 > 2001:708:10:41::1.domain:  21734+ PTR? \[x20010708001000400207e9fffe7b0259/128].ip6.arpa. (44)
14:58:11.604926 2001:708:10:41::1.domain > 2001:708:10:40:201:2ff:fe10:85f8.32770:  21734 FormErr- [0q] 0/0/0 (12)
14:58:16.603648 IP 193.166.4.143.32771 > 193.166.4.206.domain:  21734+ PTR? \[x20010708001000400207e9fffe7b0259/128].ip6.arpa. (44)
14:58:16.604178 IP 193.166.4.206.domain > 193.166.4.143.32771:  21734 FormErr- [0q] 0/0/0 (12)
14:58:21.604036 IP 193.166.4.143.32772 > 193.166.4.177.domain:  21734+ PTR? \[x20010708001000400207e9fffe7b0259/128].ip6.arpa. (44)
14:58:21.604891 IP 193.166.4.177.domain > 193.166.4.143.32772:  21734 ServFail 0/0/0 (44)
14:58:21.605090 2001:708:10:40:201:2ff:fe10:85f8.32772 > 2001:708:10:41::1.domain:  21734+ PTR? \[x20010708001000400207e9fffe7b0259/128].ip6.arpa. (44)
14:58:21.605668 2001:708:10:41::1.domain > 2001:708:10:40:201:2ff:fe10:85f8.32772:  21734 FormErr- [0q] 0/0/0 (12)
14:58:26.604272 IP 193.166.4.143.32773 > 193.166.4.206.domain:  21734+ PTR? \[x20010708001000400207e9fffe7b0259/128].ip6.arpa. (44)
14:58:26.604754 IP 193.166.4.206.domain > 193.166.4.143.32773:  21734 FormErr- [0q] 0/0/0 (12)
14:58:31.603590 IP 193.166.4.143.32774 > 193.166.4.177.domain:  21734+ PTR? \[x20010708001000400207e9fffe7b0259/128].ip6.arpa. (44)
14:58:31.604285 IP 193.166.4.177.domain > 193.166.4.143.32774:  21734 ServFail 0/0/0 (44)
14:58:31.604440 2001:708:10:40:201:2ff:fe10:85f8.32774 > 2001:708:10:41::1.domain:  12166+ PTR? 9.5.2.0.b.7.e.f.f.f.9.e.7.0.2.0.0.4.0.0.0.1.0.0.8.0.7.0.1.0.0.2.ip6.arpa. (90)
14:58:31.608130 2001:708:10:41::1.domain > 2001:708:10:40:201:2ff:fe10:85f8.32774:  12166 1/2/2 PTR haukka.ipv6.csc.fi. (204)
[...]
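
For reference, a resolv.conf along the following lines would match the setup described above. This is a reconstruction from the addresses in the trace, not a copy of the actual file; the 5-second gaps and the two passes in the trace are consistent with glibc's default timeout and attempts values, so no explicit settings for those are assumed.

```
# Reconstructed from the trace above; the real file was not posted.
nameserver 2001:708:10:41::1
nameserver 193.166.4.206
nameserver 193.166.4.177
options ip6-bytestring
```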

The resolver's behavior seems to be:
 1) go to the next server until receiving a reply or an error like ServFail
 2) when receiving a reply or error that glibc can process, start all
    over again -- ask the same servers for the same QNAME again.
 3) Upon the second reception of ServFail, give up, and try the next QNAME
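
The steps above can be modelled with a small C sketch that adds up the time spent waiting. It is a simplified model of what the trace shows, not glibc's actual res_send code: the names `enum reply` and `observed_delay` are hypothetical, and it assumes that a FormErr reply costs a full timeout (as if no reply had arrived) while a ServFail moves on to the next server at once.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reply classes for this sketch; not glibc's own types. */
enum reply { REPLY_ANSWER, REPLY_SERVFAIL, REPLY_FORMERR, REPLY_NONE };

/*
 * Model of the behaviour seen in the trace. Returns the number of
 * seconds spent waiting before an answer arrives or the resolver
 * gives up on this QNAME.
 * Assumption: a FormErr reply is not acted on, so it costs a full
 * timeout just like a missing reply; ServFail costs nothing.
 */
int observed_delay(const enum reply *servers, size_t nservers,
                   int attempts, int timeout_secs)
{
    int waited = 0;

    assert(servers != NULL && nservers > 0);

    for (int pass = 0; pass < attempts; pass++) {
        for (size_t i = 0; i < nservers; i++) {
            switch (servers[i]) {
            case REPLY_ANSWER:
                return waited;
            case REPLY_SERVFAIL:
                break;                  /* try the next server at once */
            case REPLY_FORMERR:
            case REPLY_NONE:
                waited += timeout_secs; /* wait out the timeout */
                break;
            }
        }
    }
    return waited;                      /* gave up on this QNAME */
}
```

Called with the three servers from the trace (FormErr, FormErr, ServFail), two attempts and a 5-second timeout, the model yields 20 seconds, which matches the gap between 14:58:11 and 14:58:31 before the nibble-format query is sent.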

I'm not sure what the appropriate replacement procedure would be, but one way to fix this case would be to skip to the next server without delay upon receiving FormErr, and not to cycle through the list of nameservers twice. [It could be argued that doing it twice might make sense if you only have one name server configured and the format error was due to packet corruption on the wire, but IMHO we shouldn't be doing this, because it's commonly agreed that configuring multiple name servers is a good idea.]
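
One possible reading of that suggestion, as a self-contained C sketch: an error reply (FormErr or ServFail) moves on to the next server immediately and that server is not asked again for the same QNAME, while only a server that stays silent costs a timeout and is retried. The names `proposed_delay` and `MAX_SERVERS` are hypothetical, and keeping retries for silent servers is an assumption of this sketch rather than something stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reply classes for this sketch; not glibc's own types. */
enum reply { REPLY_ANSWER, REPLY_SERVFAIL, REPLY_FORMERR, REPLY_NONE };

#define MAX_SERVERS 3   /* same limit as glibc's MAXNS */

/*
 * Model of the suggested behaviour. Returns the number of seconds
 * spent waiting before an answer arrives or every option is exhausted.
 * A server that returned FormErr or ServFail is skipped at once and
 * not queried again; only a server that never replies is retried.
 */
int proposed_delay(const enum reply *servers, size_t nservers,
                   int attempts, int timeout_secs)
{
    bool refused[MAX_SERVERS] = { false };
    int waited = 0;

    assert(servers != NULL && nservers > 0 && nservers <= MAX_SERVERS);

    for (int pass = 0; pass < attempts; pass++) {
        for (size_t i = 0; i < nservers; i++) {
            if (refused[i])
                continue;               /* already gave an error reply */
            switch (servers[i]) {
            case REPLY_ANSWER:
                return waited;
            case REPLY_SERVFAIL:
            case REPLY_FORMERR:
                refused[i] = true;      /* skip without any delay */
                break;
            case REPLY_NONE:
                waited += timeout_secs; /* genuine timeout, retry later */
                break;
            }
        }
    }
    return waited;
}
```

Under this model the three servers from the trace cost no waiting time at all before the resolver moves on to the next QNAME, while a server that never answers still gets the usual timeout and retry.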

And please file any such reply in bugzilla, else it might get lost.

Sorry, I didn't quite understand what you expected me to file in bugzilla.


--
Pekka Savola                 "You each name yourselves king, yet the
Netcore Oy                    kingdom bleeds."
Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings

