
getaddrinfo() only checks the nscd cache the first time if DNS times out

If I get an initial "Name or service not known" (EAI_NONAME), the next call to getaddrinfo() seems to go straight to DNS instead of checking the cache first (the nscd logs show no lookup attempts, and tcpdump shows traffic to the DNS server). If the first call succeeds in getting an address, then from that point on all getaddrinfo() calls go to nscd first, as expected.

I'm compiling against glibc-2.13 for ARM Linux. In my rc.d, nscd is started before my daemon. nscd is set to disallow shared caches and to maintain a hosts cache. I am using the nscd from busybox (0.47). nsswitch.conf is set so that host lookups check cache/files/dns. host.conf is set to check files/bind.

My daemon is calling getaddrinfo().

I have debug logging enabled for nscd, and the logs show that the client connection that had started to read the DNS response closes with a "Broken Pipe" error.

After that it will show GAI attempts from other daemons attempting to use the cache (so I know it's not nscd locked up or anything), but the daemon that got EAI_NONAME never again contacts nscd to do a cache lookup.

If I restart the daemon, I get the same behaviour whenever the first DNS query times out again.

Is there something in glibc that is invalidating my daemon's link to the cache? Is there a way to reconnect my daemon to the cache without restarting it (similar to forcing a resolv.conf re-load via res_init())?

colin.mc asked May 24 '13 18:05


1 Answer

As alk mentions in his comment, retrying getaddrinfo() more than 100 times should force an nscd query.
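A minimal sketch of such a retry loop is shown here. The helper name resolve_with_retries and its parameters are hypothetical and not part of glibc; whether a later attempt actually reaches nscd depends on the glibc-internal counter described in this answer, so treat it as an illustration rather than a guaranteed fix.

```c
#include <netdb.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: call getaddrinfo() until it succeeds or until
   max_attempts calls have been made, sleeping delay_seconds between
   attempts. Returns 0 on success, otherwise the last EAI_* error code.
   On success the caller owns *res and must release it with freeaddrinfo(). */
int resolve_with_retries(const char *host, const struct addrinfo *hints,
                         int max_attempts, unsigned delay_seconds,
                         struct addrinfo **res)
{
    int rc = EAI_FAIL;  /* returned unchanged if max_attempts <= 0 */

    for (int attempt = 0; attempt < max_attempts; attempt++) {
        rc = getaddrinfo(host, NULL, hints, res);
        if (rc == 0)
            return 0;
        /* Each failed call still counts towards the retry limit
           that the answer describes (an assumption about glibc internals). */
        if (delay_seconds > 0 && attempt + 1 < max_attempts)
            sleep(delay_seconds);
    }
    return rc;
}
```

Calling it with max_attempts set to something above 100 (for example 101) matches the "more than 100 times" suggestion; a non-zero delay keeps the loop from flooding the DNS server while it waits out the limit.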


To understand why, let us take a quick peek into the flow of execution inside getaddrinfo().

  1. getaddrinfo() calls gaih_inet().

  2. gaih_inet() performs the following operations on __nss_not_use_nscd_hosts :

    • Checks whether it is a positive integer (positive means nscd is currently being skipped).
    • If so, increments it.
    • Checks whether the incremented value exceeds the retry count NSS_NSCD_RETRY.

      • If both of the above conditions are satisfied, the counter is reset to zero.

      • It attempts to query nscd ONLY while the counter is zero. A failed nscd query
        sets the counter back to 1, so nscd is skipped until enough further
        getaddrinfo() calls push the counter past NSS_NSCD_RETRY again.

  3. __nss_not_use_nscd_hosts is also modified inside glibc in the following places

    • nscd/nscd_gethst_r.c lines 178, 189
      -- set to 1.

    • nscd/nscd_getai.c lines 89, 164
      -- set to 1.

    • nss/nsswitch.c, line 709
      -- set to -1 i.e. Disable nscd.
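The counter behaviour in the steps above can be modelled with a small standalone sketch. This is not glibc code: not_use_nscd_hosts, model_lookup and count_nscd_queries are hypothetical stand-ins, and the value 100 for NSS_NSCD_RETRY is taken from the retry limit this answer mentions.

```c
#include <stdbool.h>

/* Assumed value, based on the 100-call retry limit mentioned in the answer. */
#define NSS_NSCD_RETRY 100

/* Hypothetical stand-in for glibc's __nss_not_use_nscd_hosts:
   0 means nscd may be queried, > 0 means nscd is being skipped,
   < 0 means nscd is disabled. */
int not_use_nscd_hosts = 0;

/* Models one getaddrinfo() call. nscd_reachable says whether a query to
   nscd would succeed. Returns true if nscd was contacted at all. */
bool model_lookup(bool nscd_reachable)
{
    /* While skipping, count this call; re-enable nscd past the limit. */
    if (not_use_nscd_hosts > 0 && ++not_use_nscd_hosts > NSS_NSCD_RETRY)
        not_use_nscd_hosts = 0;

    if (not_use_nscd_hosts != 0)
        return false;            /* still skipping: go to files/DNS directly */

    if (!nscd_reachable)
        not_use_nscd_hosts = 1;  /* failed query: start skipping again */
    return true;
}

/* Counts how many of `calls` consecutive lookups contact nscd. */
int count_nscd_queries(int calls, bool nscd_reachable)
{
    int contacted = 0;
    for (int i = 0; i < calls; i++)
        if (model_lookup(nscd_reachable))
            contacted++;
    return contacted;
}
```

In this model, a failure on the first call means that only that one call out of the first 100 contacts nscd, while a process whose nscd queries keep succeeding contacts it on every call, which is consistent with the behaviour reported in the question.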

Based on the above, it can be concluded that
getaddrinfo() does NOT query nscd every single time.

The per-process state kept inside glibc (tracked by __nss_not_use_nscd_hosts)
decides whether getaddrinfo() ends up calling nscd or not.

To really force one's way around the 100 retry limitation, one could modify NSS_NSCD_RETRY and rebuild libc to deviate from the standard behaviour. However, I am not sure whether this would cause unintended regressions elsewhere.

Reference : Patch that introduced the __nss_not_use_nscd_hosts logic in getaddrinfo().

TheCodeArtist answered Oct 05 '22 14:10