This is the mail archive of the
libc-alpha@sourceware.org
mailing list for the glibc project.
Re: On glibc's resolver
Hi Carlos, I appreciate your thorough reply!
On Tue, 25 Dec 2012, Carlos O'Donell wrote:
What we need is a test case with expected and observed behaviour.
Given a test case we can justify or refute the expected or observed
behaviour against relevant standards or prior art.
I'm having trouble reproducing it without moving between networks. It
could probably be done with some iptables rules (block access to the original
servers after launching alpine, then change resolv.conf to the new servers),
but I'm not an expert on iptables. I'll try to figure this out later.
In practice, an alpine process started while connected to the University's
network can't reach any of the Uni's 3 DNS servers, since they are
inaccessible from my home network. Strace shows many failed retries.
Instead of rereading /etc/resolv.conf, it repeatedly times out.
Apparently this is a known issue, and a web search reveals discussions from
as early as 2003. I'd appreciate your opinions. I was thinking of writing a
patch, but I can't figure out where it should go: alpine or glibc, code or
documentation! Here are the replies I gathered from a web search:
Could you please provide references to the prior discussions so we can
review them also?
Sure, here are some pointers:
Ulrich Drepper proposing the res_init() solution:
http://sourceware.org/bugzilla/show_bug.cgi?id=3675
Pushing the Debian stat() patch to eglibc:
http://www.eglibc.org/archives/patches/msg00778.html
Firefox bug from 2003:
https://bugzilla.mozilla.org/show_bug.cgi?id=214538
Plus various stackoverflow answers proposing a dedicated resolver library
like c-ares or libunbound.
3) Patch glibc to stat() /etc/resolv.conf, checking for changes. Debian
and Ubuntu are patched.
This sounds like the worst possible solution, imposing a penalty on
all applications for a change that is well defined in a higher level.
The penalty should be negligible if stat() happens after the first
timeout, right?
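
To make the idea concrete, here is a minimal sketch of "stat() after a
failure, then reinitialise". It is only an illustration under assumptions:
reload_resolver_if_changed() is a hypothetical helper, not part of glibc or
of the Debian patch, and whether res_init() is safe to call repeatedly is
still the open question in this thread. It compares st_mtime with one-second
granularity and is not thread-safe.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <netinet/in.h>
#include <arpa/nameser.h>
#include <resolv.h>
#include <time.h>

/* Hypothetical helper (an assumption for illustration, not a glibc API):
   call res_init() again when the file at PATH has a modification time
   different from the one remembered in *LAST_MTIME.
   Returns 1 if the resolver was reinitialised, 0 if the file is unchanged,
   and -1 if stat() or res_init() failed.  */
static int
reload_resolver_if_changed (const char *path, time_t *last_mtime)
{
  struct stat st;

  if (stat (path, &st) != 0)
    return -1;                  /* File missing or unreadable.  */
  if (st.st_mtime == *last_mtime)
    return 0;                   /* Nothing changed since the last check.  */
  if (res_init () != 0)         /* Reparses the system resolver config.  */
    return -1;
  *last_mtime = st.st_mtime;    /* Remember the version we just loaded.  */
  return 1;
}
```

An application would call this with "/etc/resolv.conf" only after a lookup
has timed out (for example when getaddrinfo() returns EAI_AGAIN) and then
retry the lookup once, so the extra stat() never sits on the fast path.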
On (2), res_init() is a non-standard BSD function, and its man page doesn't
mention such a purpose. In fact, I can't be sure it's safe to call it
multiple times, and I see no guarantee that it will re-initialise the
resolver more than once. If it's the proposed way, shouldn't it be mentioned
in both the res_init() and getaddrinfo() man pages, or otherwise shouldn't
there be a big warning that resolv.conf is never reparsed?
This seems like a sensible solution, e.g. an API call that guarantees
that the resolver can operate correctly after a network configuration
change.
I haven't reviewed the code in question so I don't actually know if
res_init() is safe to be used this way. Part of your work would be to
look into this and propose the documentation patch and provide
sufficient background to justify the changes.
I'll look into this. I have doubts about whether it's safe to call res_init()
repeatedly on all UNIX systems. Maybe a glibc-specific init function would
be better, one that could also change (per process) all the resolv.conf
parameters, e.g. timeout and retries?
And a related question: is there a way to set up resolver behaviour (timeout,
retries) for a process programmatically, instead of changing the system-wide
resolv.conf?
There is no interface for this.
Thanks, I was not sure.
Dimitris