
Dns.BeginGetHost... methods blocking

So I want to make a lot of DNS queries.

I create thousands of Tasks from the Begin/EndGetHostEntry async pair:

// Wrap the APM Begin/End pair in a Task; the cast picks the IPHostEntry-returning overload.
var lookupTask = Task.Factory.FromAsync
   ( Dns.BeginGetHostEntry,
     (Func<IAsyncResult, IPHostEntry>) Dns.EndGetHostEntry,
     "google.com",
     null
   );

then Task.WaitAll for everything to complete. I'm seeing the number of ThreadPool threads increase drastically in response to my requests. If I force the ThreadPool minThreads to 500, the workload is consumed considerably faster. All of this points to blocking in the Dns asynchronous implementation.
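Forcing the minimum thread count as described above can be sketched as follows. The class and helper names (`ThreadPoolTuning`, `RaiseMinWorkerThreads`) are assumptions made for illustration; `ThreadPool.GetMinThreads` and `ThreadPool.SetMinThreads` are the standard calls. Note that this only pre-creates threads for the blocked lookups to occupy; it does not make the lookups themselves asynchronous.

```csharp
using System;
using System.Threading;

public static class ThreadPoolTuning
{
    // Hypothetical helper: raises the minimum worker-thread count so the pool
    // creates threads on demand up to that number instead of injecting them slowly.
    public static bool RaiseMinWorkerThreads(int minWorkers)
    {
        int workers, ioThreads;
        ThreadPool.GetMinThreads(out workers, out ioThreads);

        // Never lower the current minimum; leave the I/O completion setting untouched.
        return ThreadPool.SetMinThreads(Math.Max(workers, minWorkers), ioThreads);
    }

    public static void Main()
    {
        Console.WriteLine(RaiseMinWorkerThreads(500)
            ? "Minimum worker threads raised."
            : "The runtime rejected the requested minimum.");
    }
}
```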

If I replace Dns with a managed DNS client, I can consume the same workload with only 1 or 2 threads in the ThreadPool and the CPU virtually idle.

The thing is, the Dns implementation is absolutely core to many networking APIs (HttpWebRequest, WebClient, HttpClient), and they all seem to be affected by this issue. If I resolve DNS with a third-party library, make HTTP requests using the IP address as the host in the URI, and then alter the Host header to fix the request, I get blistering performance in comparison to anything involving System.Net.Dns.
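A minimal sketch of that workaround, assuming the IP address has already been obtained from some other resolver. The class name `PreResolvedRequest`, the `Create` helper, and the sample address are hypothetical; `HttpWebRequest.Host` is the real property (available from .NET 4) that overrides the Host header. This covers plain HTTP only; with HTTPS, certificate validation against an IP-based URI would need separate handling.

```csharp
using System;
using System.Net;

public static class PreResolvedRequest
{
    // Builds a request addressed to an already-resolved IP while still
    // presenting the original host name to the server.
    public static HttpWebRequest Create(IPAddress address, string hostName, string path)
    {
        var builder = new UriBuilder("http", address.ToString()) { Path = path };
        var request = (HttpWebRequest)WebRequest.Create(builder.Uri);
        request.Host = hostName; // overrides the Host header derived from the URI
        return request;
    }

    public static void Main()
    {
        // 192.0.2.10 is a documentation-only address; substitute a real lookup result.
        var request = Create(IPAddress.Parse("192.0.2.10"), "example.com", "/");
        Console.WriteLine("{0} (Host: {1})", request.RequestUri, request.Host);
    }
}
```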

What's going on here? Have I missed something or is the System.Net.Dns implementation really that bad?

asked Jul 14 '12 03:07 by spender



3 Answers

System.Net.Dns uses the Windows gethostbyname function for DNS queries and doesn't really have asynchronous functions at all. The BeginGetHostEntry function is basically just a wrapper around a synchronous GetHostEntry call running on the thread pool.

The last time I had this same problem with slow, synchronous DNS lookups, I eventually just used a large ThreadPool to get the job done, since not a single built-in Windows or .NET DNS function supports proper (parallel) asynchronous execution.
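To illustrate the point, the behaviour described here is roughly equivalent to the sketch below. This is not the framework's actual source; the names `BlockingLookup` and `ResolveOnPool` are assumptions. Each lookup ties up one pool thread until the synchronous call returns, which is why thousands of concurrent lookups need a large pool.

```csharp
using System;
using System.Net;
using System.Threading.Tasks;

public static class BlockingLookup
{
    // A synchronous lookup queued to the thread pool: the Task looks
    // asynchronous to the caller, but a pool thread blocks for its whole duration.
    public static Task<IPHostEntry> ResolveOnPool(string host)
    {
        return Task.Factory.StartNew(() => Dns.GetHostEntry(host));
    }

    public static void Main()
    {
        IPHostEntry entry = ResolveOnPool("localhost").Result;
        Console.WriteLine("{0}: {1} address(es)", entry.HostName, entry.AddressList.Length);
    }
}
```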

answered Oct 19 '22 22:10 by Pent Ploompuu


This may not be a whole answer but:

DNS resolution within .NET opens a connection to the DNS server, asks one question, and closes the connection. The examples for the managed DNS client you linked clearly show that the library opens a connection and, while it remains open, lets you ask many questions, just like running

nslookup -

>hostname1
>hostname2
...

under DOS/Unix.

Opening the connection can often take a while. By making multiple calls over an already open connection, you avoid the reverse lookup on yourself and on the server, and all the other overhead that comes with first connecting to the DNS server. For example, if the first DNS server on my list is busy, my machine often takes time to fall back to a different server that is available. If you hit that on every single lookup through the .NET library, you would see long waits, so many threads would be needed, and of course the CPU load would bulk up while really not doing a lot.

The implementation isn't "bad"; it's just not designed for large batch jobs. Unless there are calls I missed, too.

answered Oct 19 '22 22:10 by BugFinder


I don't have a dataset of 1000 URLs to test your code with, and requesting the same URL repeatedly should result in hitting the cache (not the DNS server for my network). So please comment as to the success/failure once you test this.

My recommendation for testing this (or any other hypothesis) would be to create a test dataset of 1000 URLs you want to resolve and number them. Then set up some logging (e.g. log4net or similar) and write out a statement when each DNS resolution task finishes, including the index of the completed task. I believe you will see these 1000 tasks complete more or less synchronously, or at least in groups of 2-8 asynchronous results at a time, with the groups themselves completing one after another.
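The experiment above could be sketched like this. It collects lines in memory instead of using log4net, and the names `LookupTimeline` and `Run` are assumptions; the host list in `Main` is a placeholder for a real numbered dataset. Each line records the task index, host, elapsed milliseconds, and outcome, in completion order, so any grouping should be visible in the timestamps.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Net;
using System.Threading.Tasks;

public static class LookupTimeline
{
    // Starts one lookup per host and returns "index,host,elapsedMs,status"
    // lines in the order in which the lookups completed.
    public static List<string> Run(IList<string> hosts)
    {
        var log = new List<string>();
        var clock = Stopwatch.StartNew();

        Task[] tasks = hosts.Select((host, index) =>
            Task.Factory.FromAsync<string, IPHostEntry>(
                    Dns.BeginGetHostEntry, Dns.EndGetHostEntry, host, null)
                .ContinueWith(t =>
                {
                    // Reading t.Exception marks a failed lookup as observed.
                    string status = t.Exception == null ? "ok" : "failed";
                    string line = string.Format("{0},{1},{2},{3}",
                        index, host, clock.ElapsedMilliseconds, status);
                    lock (log) { log.Add(line); }
                })).ToArray();

        Task.WaitAll(tasks);
        return log;
    }

    public static void Main()
    {
        // Placeholder input; replace with the numbered list of distinct host names.
        foreach (string line in Run(new[] { "localhost", "example.com" }))
            Console.WriteLine(line);
    }
}
```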

The reason for that is connection management. Internally, .NET will only allow so many concurrent connections to the same endpoint. If you open 1000 connections to your DNS server, only a few will succeed at a time. The rest need to wait until some earlier connections are closed before they can establish another connection to that same endpoint (your DNS server).

There are normally good reasons for this limitation. But for something like DNS, which involves relatively small amounts of data and a relatively low cost to service each request, I'd be OK with raising the limit to, say, 100-200 simultaneous DNS requests.

You can open up this limitation with this configuration:

<configuration>
  <system.net>
    <connectionManagement>
      <add address="*" maxconnection="100"/>
    </connectionManagement>
  </system.net>
</configuration>

MSDN for System.Net.ConnectionManagement

You can specify a specific endpoint address (URL or IP) and the maximum connections to that address. Some load testing applications will just use the wildcard * and 65535 to open it right up for everything.
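For completeness, the same wildcard limit can also be set in code rather than in the config file. This is a sketch, with the class name `ConnectionLimit` and the helper `Raise` being assumptions; `ServicePointManager.DefaultConnectionLimit` is the real property and governs ServicePoint (HTTP) connections. Whether it has any effect on Dns lookups is exactly the untested hypothesis of this answer.

```csharp
using System;
using System.Net;

public static class ConnectionLimit
{
    // Programmatic counterpart of <add address="*" maxconnection="..."/>.
    // Call it before the first request is made so new ServicePoints pick it up.
    public static int Raise(int maxConnections)
    {
        ServicePointManager.DefaultConnectionLimit = maxConnections;
        return ServicePointManager.DefaultConnectionLimit;
    }

    public static void Main()
    {
        Console.WriteLine("Default connection limit is now {0}", Raise(100));
    }
}
```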

I suspect that the managed DNS implementation either reuses the same connection to the DNS server or has some internal configuration like the above.

One more detail you might include in your question is whether you are querying a local DNS server on the same physical network, a DNS server from your local ISP, or a public DNS server like OpenDNS. The configuration of those specific DNS servers may impose their own limitations (ISPs may rate limit, I don't know).

answered Oct 19 '22 23:10 by BenSwayne