
docker: containers in stacks within EC2 instance do not inherit dns nameserver

I have set up an EC2 instance on AWS.

I have configured the security groups so that the instance can reach the Internet, e.g.

ubuntu@ip-10-17-0-78:/data$ ping www.google.com
PING www.google.com (216.58.211.164) 56(84) bytes of data.
64 bytes from dub08s01-in-f4.1e100.net (216.58.211.164): icmp_seq=1 ttl=46 time=1.02 ms
64 bytes from dub08s01-in-f4.1e100.net (216.58.211.164): icmp_seq=2 ttl=46 time=1.00 ms

However, when I exec into a container, this is not possible:

root@d1ca5ce50d3b:/app# ping www.google.com
ping: www.google.com: Temporary failure in name resolution

update_1: The connectivity issue is tied to containers started with docker stack deploy, i.e. containers running in stacks.

When I just start a stand-alone container, connectivity to the Internet is there:

ubuntu@ip-10-17-0-78:/data$ docker run -it alpine:latest /bin/ash
/ # ping www.google.gr
PING www.google.gr (209.85.203.94): 56 data bytes
64 bytes from 209.85.203.94: seq=0 ttl=38 time=1.148 ms
64 bytes from 209.85.203.94: seq=1 ttl=38 time=1.071 ms

update_2: After some investigation, it turns out that:

  • the stand-alone container does inherit the EC2 instance's DNS nameserver;
  • the containers started via docker stack deploy do not.

For example, this is from a container started by Docker swarm:

ubuntu@ip-10-17-0-78:~$ docker exec -it d1ca5ce50d3b bash
root@d1ca5ce50d3b:/app# cat /etc/resolv.conf 
search eu-west-1.compute.internal
nameserver 127.0.0.11
options ndots:0

update_3: The same problem occurs when I start the stack with docker-compose instead of docker stack deploy, so it does not seem to be a swarm-specific issue.

update_4: I have explicitly added the file /etc/docker/daemon.json with the following contents:

{
    "dns": ["10.0.0.2", "8.8.8.8"]
}

but lookup still fails:

ubuntu@ip-10-17-0-78:/data$ docker run busybox nslookup google.com
Server: 8.8.8.8
Address: 8.8.8.8:53

Non-authoritative answer:
Name: google.com
Address: 216.58.211.174

*** Can't find google.com: No answer

Any suggestions why this might be happening?

pkaramol asked Dec 18 '22 22:12


1 Answer

I just ran into a similar issue. I realize this is 11 months old, but it's somewhat difficult to find information on this topic, so I will post what I found here.

My issue turned out to be that the default subnet for the Docker swarm overlay network overlapped with my VPC's subnet. The default Amazon EC2 DNS server (10.0.0.2 in my case) was therefore confusing the Docker daemon's IP routing into treating it as a swarm-overlay-local service (I think). Anyway, I resolved my issue by changing the default overlay subnet via the networks: section of my stack file, and my Docker daemon began resolving through the 10.0.0.2 VPC DNS server again.
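As a rough illustration of choosing a replacement overlay subnet, here is a minimal Python sketch that picks the first candidate range that does not overlap the VPC. The function name first_free_subnet, the VPC CIDR 10.0.0.0/16 and the candidate ranges are all hypothetical; substitute the values from your own setup.

```python
import ipaddress


def first_free_subnet(vpc_cidr, candidates):
    """Return the first candidate subnet that does not overlap vpc_cidr, or None."""
    vpc = ipaddress.ip_network(vpc_cidr)
    for candidate in candidates:
        if not ipaddress.ip_network(candidate).overlaps(vpc):
            return candidate
    return None


# Hypothetical values: a VPC on 10.0.0.0/16 and a few candidate overlay ranges.
print(first_free_subnet("10.0.0.0/16", ["10.0.0.0/24", "10.0.9.0/24", "192.168.100.0/24"]))
```

The range it returns is the kind of value that would go under the overlay network's ipam, config, subnet keys in the stack file.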

If you put your node's Docker daemon in debug mode (on Linux, add "debug": true to the JSON in /etc/docker/daemon.json), you can monitor the debug output by tailing the daemon log on your system. If the daemon is running via systemd, journalctl -u docker will give you the logs; adding -f follows them.
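If daemon.json already holds other settings (such as a "dns" list), the debug flag has to be merged in rather than overwriting the file. A small sketch of that merge, assuming the file content is passed in as text; enable_debug is a hypothetical helper, not part of Docker:

```python
import json


def enable_debug(daemon_json_text):
    """Return daemon.json text with "debug": true set, keeping existing keys."""
    # Treat an empty or whitespace-only file as an empty configuration.
    config = json.loads(daemon_json_text) if daemon_json_text.strip() else {}
    config["debug"] = True
    return json.dumps(config, indent=4)


# Example input: a daemon.json that only configures DNS servers.
existing = '{"dns": ["10.0.0.2", "8.8.8.8"]}'
print(enable_debug(existing))
```

The result would then be written back to /etc/docker/daemon.json and the daemon restarted (for example with systemctl restart docker on a systemd host) before following the logs.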

There I found information about the connectivity issues: the Docker daemon was failing to reach the DNS server on 10.0.0.2:53 (the UDP DNS port). However, nslookup was working fine on the host OS, and /etc/resolv.conf looked appropriate. The problem was obvious once I used docker exec to get an interactive /bin/sh in one of the running services: nslookup failed for any external domain, and the Docker daemon debug logs spat out more "connection refused" type messages regarding 10.0.0.2.

After looking through Docker support issues on DNS resolution for an hour or two, I found a comment stating that Docker swarm virtual networks are assigned addresses based on some defaults, and that sometimes those defaults overlap with how you've set up your local subnets. I reasoned that if they overlapped with the address of the DNS server on my VPC, Docker might be trying to route the DNS packets intra-swarm instead of through the VPC subnet routing.
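To check for this kind of overlap, one option is to compare the resolver address against the subnets Docker has actually assigned. The sketch below assumes JSON in the shape printed by docker network inspect (a list of networks, each with IPAM.Config entries that carry a Subnet). The helper name networks_containing and the SAMPLE data are made up for illustration.

```python
import ipaddress
import json


def networks_containing(resolver_ip, inspect_json):
    """Return names of Docker networks whose IPAM subnets contain resolver_ip."""
    resolver = ipaddress.ip_address(resolver_ip)
    hits = []
    for network in json.loads(inspect_json):
        # IPAM.Config may be null or empty for networks without address management.
        for entry in network.get("IPAM", {}).get("Config") or []:
            subnet = entry.get("Subnet")
            if subnet and resolver in ipaddress.ip_network(subnet):
                hits.append(network["Name"])
                break
    return hits


# Made-up sample in the shape of `docker network inspect` output.
SAMPLE = """
[
  {"Name": "mystack_default",
   "IPAM": {"Config": [{"Subnet": "10.0.0.0/24", "Gateway": "10.0.0.1"}]}},
  {"Name": "ingress",
   "IPAM": {"Config": [{"Subnet": "10.255.0.0/16", "Gateway": "10.255.0.1"}]}},
  {"Name": "host", "IPAM": {"Config": []}}
]
"""

print(networks_containing("10.0.0.2", SAMPLE))
```

Any network name it reports is a candidate for the conflict described above, since traffic to the resolver address would fall inside that network's range.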

Josh answered May 07 '23 00:05