
Kubernetes DNS lookup not working from worker node - connection timed out; no servers could be reached

I have built a new Kubernetes cluster, v1.20.1, with a single master and a single worker node, using the Calico CNI.

I deployed the busybox pod in the default namespace.

# kubectl get pods busybox -o wide
NAME      READY   STATUS    RESTARTS   AGE   IP             NODE        NOMINATED NODE   READINESS GATES
busybox   1/1     Running   0          12m   10.203.0.129   node02   <none>           <none>

 

nslookup is not working:

kubectl exec -ti busybox -- nslookup kubernetes.default
Server:    10.96.0.10
Address 1: 10.96.0.10

nslookup: can't resolve 'kubernetes.default'

The cluster is running RHEL 8 with the latest updates.

I followed the steps here: https://kubernetes.io/docs/tasks/administer-cluster/dns-debugging-resolution/
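
For reference, the dnsutils pod used in the commands below is the one from that debugging guide; it can be created with the manifest linked there, roughly:

kubectl apply -f https://k8s.io/examples/admin/dns/dnsutils.yaml   # dnsutils manifest from the guide
kubectl get pods dnsutils                                          # should show 1/1 Running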

The nslookup command is not able to reach the nameserver:

# kubectl exec -i -t dnsutils -- nslookup kubernetes.default
;; connection timed out; no servers could be reached

command terminated with exit code 1
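
To separate UDP from TCP problems, the same query can also be sent directly to the service IP over both protocols (assuming dig is available in the dnsutils image, which it normally is):

kubectl exec -i -t dnsutils -- dig @10.96.0.10 kubernetes.default.svc.cluster.local +short        # UDP query
kubectl exec -i -t dnsutils -- dig @10.96.0.10 kubernetes.default.svc.cluster.local +tcp +short   # same query over TCP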

resolv.conf file:

# kubectl exec -ti dnsutils -- cat /etc/resolv.conf
search default.svc.cluster.local svc.cluster.local cluster.local 
nameserver 10.96.0.10
options ndots:5

The DNS pods are running:

# kubectl get pods --namespace=kube-system -l k8s-app=kube-dns
NAME                      READY   STATUS    RESTARTS   AGE
coredns-74ff55c5b-472vx   1/1     Running   1          85m
coredns-74ff55c5b-c75bq   1/1     Running   1          85m

DNS pod logs

 kubectl logs --namespace=kube-system -l k8s-app=kube-dns
.:53
[INFO] plugin/reload: Running configuration MD5 = db32ca3650231d74073ff4cf814959a7
CoreDNS-1.7.0
linux/amd64, go1.14.4, f59c03d
.:53
[INFO] plugin/reload: Running configuration MD5 = db32ca3650231d74073ff4cf814959a7
CoreDNS-1.7.0
linux/amd64, go1.14.4, f59c03d

The kube-dns service is defined:

# kubectl get svc --namespace=kube-system
NAME       TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                  AGE
kube-dns   ClusterIP   10.96.0.10   <none>        53/UDP,53/TCP,9153/TCP   86m

I can see the endpoints of the DNS pods:

# kubectl get endpoints kube-dns --namespace=kube-system
NAME       ENDPOINTS                                               AGE
kube-dns   10.203.0.5:53,10.203.0.6:53,10.203.0.5:53 + 3 more...   86m
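
The endpoint IPs can be cross-checked against the CoreDNS pod IPs, for example:

kubectl get pods --namespace=kube-system -l k8s-app=kube-dns -o wide   # pod IPs should match the endpoints above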

I enabled query logging, but didn't see any traffic coming to the DNS pods:

# kubectl logs --namespace=kube-system -l k8s-app=kube-dns
.:53
[INFO] plugin/reload: Running configuration MD5 = db32ca3650231d74073ff4cf814959a7
CoreDNS-1.7.0
linux/amd64, go1.14.4, f59c03d
.:53
[INFO] plugin/reload: Running configuration MD5 = db32ca3650231d74073ff4cf814959a7
CoreDNS-1.7.0
linux/amd64, go1.14.4, f59c03d
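
(For reference, the debugging guide enables query logging by adding the log plugin to the CoreDNS Corefile, roughly:)

kubectl -n kube-system edit configmap coredns   # add "log" at the top of the .:53 { ... } block, then wait for CoreDNS to reload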

I can ping the DNS pod:

# kubectl exec -i -t dnsutils -- ping 10.203.0.5
PING 10.203.0.5 (10.203.0.5): 56 data bytes
64 bytes from 10.203.0.5: seq=0 ttl=62 time=6.024 ms
64 bytes from 10.203.0.5: seq=1 ttl=62 time=6.052 ms
64 bytes from 10.203.0.5: seq=2 ttl=62 time=6.175 ms
64 bytes from 10.203.0.5: seq=3 ttl=62 time=6.000 ms
^C
--- 10.203.0.5 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 6.000/6.062/6.175 ms

nmap shows the ports as filtered:

# ke netshoot-6f677d4fdf-5t5cb -- nmap 10.203.0.5
Starting Nmap 7.80 ( https://nmap.org ) at 2021-01-15 22:29 UTC
Nmap scan report for 10.203.0.5
Host is up (0.0060s latency).
Not shown: 997 closed ports
PORT     STATE    SERVICE
53/tcp   filtered domain
8080/tcp filtered http-proxy
8181/tcp filtered intermapper

Nmap done: 1 IP address (1 host up) scanned in 14.33 seconds
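
Since "filtered" usually means packets are being dropped somewhere in transit, a host-firewall check on the worker node would look roughly like this (assuming firewalld, the RHEL 8 default, is in use):

firewall-cmd --state                                 # is firewalld running?
firewall-cmd --list-all                              # which ports/services are allowed in the active zone?
grep FirewallBackend /etc/firewalld/firewalld.conf   # nftables (RHEL 8 default) or iptables backend?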

If I schedule the pod on the master node, nslookup works and nmap shows the port as open:

# ke netshoot -- bash
bash-5.0# nslookup kubernetes.default
Server:         10.96.0.10
Address:        10.96.0.10#53

Name:   kubernetes.default.svc.cluster.local
Address: 10.96.0.1

 nmap -p 53 10.96.0.10
Starting Nmap 7.80 ( https://nmap.org ) at 2021-01-15 22:46 UTC
Nmap scan report for kube-dns.kube-system.svc.cluster.local (10.96.0.10)
Host is up (0.000098s latency).

PORT   STATE SERVICE
53/tcp open  domain

Nmap done: 1 IP address (1 host up) scanned in 0.14 seconds
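
Since DNS only fails for pods on the worker node, the pod network between the nodes is suspect; the Calico dataplane can be checked with something like:

kubectl get pods -n kube-system -l k8s-app=calico-node -o wide   # calico-node should be Running/Ready on both nodes
calicoctl node status                                            # BGP peering status, if calicoctl is installed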

Why is nslookup not working from a pod running on the worker node? How can I troubleshoot this issue?

I have rebuilt the cluster twice and still see the same issue.

Thanks

SR

Update: adding the kubeadm config file.

# cat kubeadm-config.yaml
---
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
nodeRegistration:
  criSocket: unix:///run/containerd/containerd.sock
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
  kubeletExtraArgs:
    cgroup-driver: "systemd"
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
kubernetesVersion: stable
controlPlaneEndpoint: "master01:6443"
networking:
  dnsDomain: cluster.local
  podSubnet: 10.0.0.0/14
  serviceSubnet: 10.96.0.0/12
---
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: "ipvs

"


1 Answer

First of all, please note that according to the docs, both Calico and kubeadm only support CentOS/RHEL 7+.

By default, RHEL 8 uses nftables instead of iptables (we can still use the iptables tooling, but "iptables" on RHEL 8 actually uses the kernel's nft framework in the background - see "Running Iptables on RHEL 8").
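
On RHEL 8 you can see which backend the iptables tooling is using from its version string, for example:

iptables --version   # "(nf_tables)" means the nft backend, "(legacy)" means classic iptables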

9.2.1. nftables replaces iptables as the default network packet filtering framework

I believe that nftables may be causing these network issues because, as we can find on the nftables adoption page:

Kubernetes does not support nftables yet.

Note: for now I highly recommend using RHEL7 instead of RHEL8.


With that in mind, I'll present some information that may help you with RHEL8.
I have reproduced your issue and found a solution that works for me.

  • First I opened the ports required by Calico - these can be found here under "Network requirements".
  • Next, as a workaround, I reverted to the old iptables backend on all cluster nodes; you can easily do so by setting FirewallBackend in /etc/firewalld/firewalld.conf to iptables, as described here.
  • Finally I restarted firewalld to make the new rules active (a sketch of these commands follows this list).
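
A rough sketch of these steps on each node (the port list is abbreviated; check Calico's "Network requirements" page for the full set that applies to your dataplane):

firewall-cmd --permanent --add-port=179/tcp     # BGP
firewall-cmd --permanent --add-port=4789/udp    # VXLAN overlay, if used
firewall-cmd --permanent --add-port=5473/tcp    # Typha, if used
firewall-cmd --permanent --add-port=10250/tcp   # kubelet

sed -i 's/^FirewallBackend=.*/FirewallBackend=iptables/' /etc/firewalld/firewalld.conf   # revert to the iptables backend

systemctl restart firewalld   # make the new backend and rules active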

I've tried nslookup from a Pod running on the worker node (kworker) and it seems to work correctly.

root@kmaster:~# kubectl get pod,svc -o wide
NAME      READY   STATUS    RESTARTS   AGE    IP           NODE      NOMINATED NODE   READINESS GATES
pod/web   1/1     Running   0          112s   10.99.32.1   kworker   <none>           <none>

NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE     SELECTOR
service/kubernetes   ClusterIP   10.99.0.1    <none>        443/TCP   5m51s   <none>
root@kmaster:~# kubectl exec -it web -- bash
root@web:/# nslookup kubernetes.default
Server:         10.99.0.10
Address:        10.99.0.10#53

Name:   kubernetes.default.svc.cluster.local
Address: 10.99.0.1

root@web:/#