using getaddrinfo() only checks nscd cache first time if DNS times out

Tags:

c

linux

nss

getaddrinfo

If I get an initial "Name or service not known" (EAI_NONAME), the next call to getaddrinfo() seems to go straight to DNS instead of checking the cache first (nscd logs show no lookup attempts, tcpdump shows traffic to the DNS server). If the first call succeeds in getting an address, then from then on all getaddrinfo() calls go to nscd first, as expected.

I'm compiling against glibc-2.13 for ARM Linux. In my rc.d, nscd is started before my daemon. nscd is set to disallow shared caches and to maintain a hosts cache. I am using the nscd from busybox (0.47). nsswitch.conf is set so that hosts checks cache/files/dns. host.conf is set to check files/bind.

My daemon is calling getaddrinfo().

I have debug logging enabled for nscd, and the logs show that the client, after starting to read the DNS response, closes the connection with a "Broken pipe" error.

After that, the logs show GAI attempts from other daemons using the cache (so I know nscd is not locked up or anything), but the daemon that got EAI_NONAME never again contacts nscd to do a cache lookup.

If I restart the daemon, I get the same behaviour if the first DNS query times out again.

Is there something in glibc that is invalidating my daemon's link to the cache? Is there a way to reconnect my daemon to the cache without restarting it (similar to forcing a resolv.conf re-load via res_init())?

asked May 24 '13 18:05

colin.mc

1 Answer

As alk mentions in his comment, retrying getaddrinfo() more than 100 times should force an nscd query.

To understand why, let us take a quick peek into the flow of execution inside getaddrinfo().

1. getaddrinfo() calls gaih_inet().
2. gaih_inet() performs the following operations on __nss_not_use_nscd_hosts:
   - Checks whether it is positive.
   - If so, increments it.
   - Checks whether it now exceeds the retry count NSS_NSCD_RETRY.
     - If both conditions hold, the counter is reset to zero. nscd is queried
       only while the counter is zero.
     - If that query cannot use nscd, the counter goes back to 1, so nscd is
       ignored for roughly the next NSS_NSCD_RETRY calls to getaddrinfo().
3. __nss_not_use_nscd_hosts is also modified by glibc's nscd client code in the following places:
   - nscd/nscd_gethst_r.c lines 178, 189
     -- set to 1.
   - nscd/nscd_getai.c lines 89, 164
     -- set to 1.
   - nss/nsswitch.c, line 709
     -- set to -1, i.e. nscd is disabled.

Based on the above, it can be concluded that getaddrinfo() does NOT query nscd every single time.

The per-process state kept by glibc's nscd client code (the value of __nss_not_use_nscd_hosts) decides whether getaddrinfo() ends up calling nscd or not.
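
The counter behaviour described above can be modelled in a few lines of C. This is only a sketch under assumptions, not glibc code: `not_use_nscd_hosts`, `MODEL_NSCD_RETRY`, `lookup_would_try_nscd()` and `count_skipped_after_failure()` are hypothetical names made up for illustration, and the limit of 100 is taken from the retry count mentioned in this answer.

```c
/* Toy model of the per-process "skip nscd" counter. All names here are
   hypothetical stand-ins; none of this is copied from glibc. */
#define MODEL_NSCD_RETRY 100   /* assumed value, per the "100 times" above */

int not_use_nscd_hosts = 0;    /* models __nss_not_use_nscd_hosts */

/* Returns 1 if this lookup would try nscd, 0 if it would bypass it.
   nscd_reachable says whether the simulated nscd attempt succeeds. */
int lookup_would_try_nscd(int nscd_reachable)
{
    /* While nscd is marked unusable, count this call; once the count
       passes the retry limit, allow nscd again. */
    if (not_use_nscd_hosts > 0 && ++not_use_nscd_hosts > MODEL_NSCD_RETRY)
        not_use_nscd_hosts = 0;

    if (not_use_nscd_hosts != 0)
        return 0;              /* still in the "skip nscd" window */

    /* nscd is tried; a failed attempt marks it unusable again. */
    if (!nscd_reachable)
        not_use_nscd_hosts = 1;
    return 1;
}

/* Counts how many lookups bypass nscd after a single failed attempt. */
int count_skipped_after_failure(void)
{
    int skipped = 0;

    not_use_nscd_hosts = 0;
    lookup_would_try_nscd(0);  /* first attempt fails, counter becomes 1 */
    while (!lookup_would_try_nscd(1))
        skipped++;
    return skipped;
}
```

In this model, one failed attempt makes the following 99 lookups bypass nscd, and the lookup after that tries nscd again. That matches the behaviour reported in the question, where a daemon whose first query fails keeps going straight to DNS.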

To really get around the 100-retry limit, one could modify NSS_NSCD_RETRY and rebuild libc, deviating from the standard behaviour. However, I am not sure whether this would cause unintended regressions elsewhere.
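
Without rebuilding libc, the retry approach mentioned at the top of this answer could look roughly like the sketch below. `resolve_with_retries()` and its `max_attempts` parameter are hypothetical names, not part of any library; the only real API calls used are getaddrinfo() and, on the caller's side, freeaddrinfo(). Whether repeated calls actually bring nscd back depends on the libc build in use, so treat this as an experiment rather than a fix.

```c
#include <netdb.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: call getaddrinfo() up to max_attempts times and
   stop at the first success. On success the caller owns *res and must
   release it with freeaddrinfo(). Returns 0 or the last EAI_* error. */
int resolve_with_retries(const char *host, const char *service,
                         int max_attempts, struct addrinfo **res)
{
    struct addrinfo hints = {0};
    int rc = EAI_FAIL;         /* reported if max_attempts <= 0 */
    int i;

    hints.ai_family = AF_UNSPEC;       /* IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;

    *res = NULL;
    for (i = 0; i < max_attempts; i++) {
        rc = getaddrinfo(host, service, &hints, res);
        if (rc == 0)
            return 0;
        *res = NULL;           /* do not hand back a stale pointer */
    }
    return rc;
}
```

A caller would pass something like 101 for `max_attempts`. Note that if every attempt waits for a DNS timeout, this loop can block for a long time, so a real daemon would probably spread the retries out over time instead.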

Reference: Patch that introduced the __nss_not_use_nscd_hosts logic in getaddrinfo().

answered Oct 05 '22 14:10

TheCodeArtist
