Why is the dig command returning only one IP address for google.com?

As we know, Google has more than one IP address. If we use the website https://toolbox.googleapps.com/apps/dig/#A/[email protected], it returns more than one IP address for Google.

If I run the following command, the result is different:

gyan@localhost:~/codes/java/net$ dig google.com

; <<>> DiG 9.10.3-P4-Ubuntu <<>> google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 11777
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4000
;; QUESTION SECTION:
;google.com.            IN  A

;; ANSWER SECTION:
google.com.     269 IN  A   216.58.220.46

;; Query time: 0 msec
;; SERVER: 10.100.171.1#53(10.100.171.1)
;; WHEN: Fri Nov 04 16:18:07 IST 2016
;; MSG SIZE  rcvd: 55

gyan@localhost:~/codes/java/net$ 

Only one IP address is returned, and it is not the same as what the above website returns. This IP address also changes from time to time.

But if I run the dig command for amazon.com:

gyan@localhost:~/codes/java/net$ dig amazon.com

; <<>> DiG 9.10.3-P4-Ubuntu <<>> amazon.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 55090
;; flags: qr rd ra; QUERY: 1, ANSWER: 6, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4000
;; QUESTION SECTION:
;amazon.com.            IN  A

;; ANSWER SECTION:
amazon.com.     34  IN  A   54.239.26.128
amazon.com.     34  IN  A   54.239.17.7
amazon.com.     34  IN  A   54.239.25.192
amazon.com.     34  IN  A   54.239.25.208
amazon.com.     34  IN  A   54.239.25.200
amazon.com.     34  IN  A   54.239.17.6

;; Query time: 74 msec
;; SERVER: 127.0.1.1#53(127.0.1.1)
;; WHEN: Fri Nov 04 16:23:17 IST 2016
;; MSG SIZE  rcvd: 135

gyan@localhost:~/codes/java/net$ 

These 6 IP addresses never change for amazon.com, and the website https://toolbox.googleapps.com/apps/dig/#A/[email protected] also returns the same 6 IP addresses.

My question is: how is the DNS lookup for google.com different from the one for amazon.com? Why does google.com return just one record, rather than several like amazon.com?

my name is GYAN asked Nov 04 '16 10:11



1 Answer

"As we know, Google has more than one IP address."

True. But that doesn't mean a client needs to know more than one of them.

In the past it was quite common for services to respond to a DNS query with multiple IP addresses in order to perform load balancing. The replies would often be randomized, meaning that a client would get a random subset of a few addresses out of a large pool of servers that all behaved identically.
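To make the "random subset out of a large pool" idea concrete, here is a minimal sketch. It does not show how any real DNS server is implemented: the pool, the subset size of 4, and the function name `answer_query` are all made up for illustration, and the addresses come from the reserved documentation range 192.0.2.0/24.

```python
import random

# Hypothetical pool of identical servers (documentation-only addresses).
POOL = [f"192.0.2.{i}" for i in range(1, 21)]


def answer_query(pool, count=4, rng=random):
    """Return `count` distinct addresses picked at random from `pool`."""
    return rng.sample(pool, count)


if __name__ == "__main__":
    # Each simulated client gets its own random subset of the pool.
    for client in range(3):
        print(f"client {client}:", answer_query(POOL))
```

Each client ends up with a different handful of addresses, so connections spread across the pool without any client knowing the whole list.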

DNS-based load balancing has always been a hack. It has problems due to caching: if an ISP's DNS resolver caches the reply, a large number of users all connect to the same few IP addresses, which reduces the effectiveness of the load balancing. The workaround is to lower the TTL of the records, so that entries stay in the cache only for a short time, after which a new query is performed. For example, the TTL is 34 seconds for the amazon.com A records in the dig output you posted.
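The effect of the TTL on a shared cache can be illustrated with a toy simulation. This is only a sketch under simple assumptions (one client lookup per second, a 20-address pool, 4 addresses per answer); `CachingResolver` and `simulate` are hypothetical names and do not correspond to any real resolver's API.

```python
import random


class CachingResolver:
    """Toy caching resolver: reuses an answer until its TTL expires."""

    def __init__(self, upstream, ttl):
        self.upstream = upstream          # callable returning a list of addresses
        self.ttl = ttl                    # seconds an answer stays in the cache
        self.cached = None
        self.expires_at = float("-inf")
        self.upstream_queries = 0

    def resolve(self, now):
        # Ask upstream only when nothing is cached or the entry has expired.
        if self.cached is None or now >= self.expires_at:
            self.cached = self.upstream()
            self.expires_at = now + self.ttl
            self.upstream_queries += 1
        return self.cached


def simulate(ttl, duration=300, seed=0):
    """Return (upstream queries, distinct addresses handed to clients)."""
    rng = random.Random(seed)
    pool = [f"192.0.2.{i}" for i in range(1, 21)]   # documentation-only addresses
    resolver = CachingResolver(lambda: rng.sample(pool, 4), ttl)
    seen = set()
    for second in range(duration):                   # one client lookup per second
        seen.update(resolver.resolve(second))
    return resolver.upstream_queries, len(seen)


if __name__ == "__main__":
    for ttl in (300, 34, 1):
        queries, distinct = simulate(ttl)
        print(f"ttl={ttl}: {queries} upstream queries, {distinct} distinct addresses")
```

With a long TTL every client behind the resolver is sent to the same 4 addresses for the whole run; with a short TTL the clients see more of the pool, at the cost of many more upstream queries.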

The method doesn't work very well. Reducing the TTL further increases latency for small HTTP requests, because more of them have to wait for a fresh DNS query first. The outcome of DNS load balancing is also somewhat unpredictable, in the sense that it doesn't guarantee that the servers will share the load uniformly.

However if you have a load balancer that works at the network level (think of it as a reverse NAT box: one IP is facing the internet, multiplexing traffic to a large number of servers behind it) that can handle a lot of connections, and also has a good uptime, there is no need to do load balancing at the DNS level.

So it is likely that the Google datacenters you connect to do not use DNS load balancing, while the Amazon ones do.

The other question is why you get multiple addresses when you query 8.8.8.8 from the toolbox, but only one when you query from your machine.

Firstly, it's important to understand that when querying from the toolbox, it's the web server that sends the DNS query, not your computer.

DNS servers do not have to return identical replies to different clients. It is actually common to return different replies based on the geographical location of the client: for example, a user in Europe who queries google.com would get an IP address for a datacenter in Europe, not in the US.
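As a rough sketch of location-dependent replies, the lookup below returns a different address depending on the client's region. Everything in it is an assumption for illustration: the region labels, the `DATACENTERS` table, and `geo_answer` are hypothetical, the addresses are from reserved documentation ranges rather than real Google addresses, and how a real server works out the client's region is left out entirely.

```python
# Hypothetical region-to-datacenter table (documentation-only addresses).
DATACENTERS = {
    "EU": "198.51.100.10",
    "US": "203.0.113.10",
}
DEFAULT_REGION = "US"


def geo_answer(client_region):
    """Return the address for the client's region, or the default one."""
    return DATACENTERS.get(client_region, DATACENTERS[DEFAULT_REGION])


if __name__ == "__main__":
    for region in ("EU", "US", "ASIA"):
        print(region, "->", geo_answer(region))
```

Two clients asking for the same name can therefore get different, equally valid answers.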

In this case I think DNS geolocation is used for some clients and not for others. It might have something to do with the size of the network from which the query is sent and the capacity of the network load balancer. For example, if the load balancer can handle 1,000,000 simultaneous connections and the network from which you send the query has 100,000 IPs, there is no need to do DNS load balancing. But if the network is large (in your example, the size of the datacenter running the toolbox), the network load balancer might not be able to handle it, so DNS load balancing is enabled and you get multiple random IP addresses from a pool.

Note: by "network" I mean the set of machines that all use the same DNS resolver.

Another reason to return multiple IP addresses is DNS-based failover: when one of the machines stops working, the client tries to connect to another one. But that's not a great way of doing failover, since some applications do not store all the IP addresses (although I think most browsers do) and, again, DNS caches get in the way.
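On the client side, failover of this kind amounts to trying each returned address in turn. The sketch below assumes the application has kept the whole address list; `connect_with_failover` is a hypothetical helper, and its `connect` parameter exists only so the logic can be exercised without a network.

```python
import socket


def connect_with_failover(addresses, port, connect=socket.create_connection, timeout=3.0):
    """Try each address in order and return the first connection that succeeds."""
    last_error = None
    for address in addresses:
        try:
            return connect((address, port), timeout=timeout)
        except OSError as error:
            last_error = error            # remember the failure, try the next address
    raise OSError(f"all {len(addresses)} addresses failed") from last_error


if __name__ == "__main__":
    # Resolve every IPv4 address the resolver returns for the name.
    infos = socket.getaddrinfo("amazon.com", 80, socket.AF_INET, socket.SOCK_STREAM)
    addresses = [info[4][0] for info in infos]
    with connect_with_failover(addresses, 80) as conn:
        print("connected to", conn.getpeername()[0])
```

An application that keeps only the first address has nothing to fall back on, which is the weakness described above. Python's own `socket.create_connection` already tries all resolved addresses in turn when it is given a hostname instead of an IP address.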

o9000 answered Oct 14 '22 05:10