I am trying to make my Kubernetes cluster pull from a container registry service running inside itself. I have kube-dns set up, and I have a registry Deployment and Service running. I can resolve the Service's internal name via the host command on the node. I have added the --dns flag to the Docker daemon with the address of the kube-dns Service, and kubelet is running with the --cluster-dns flag set to the same address as well. Yet somehow this is what I get when I try to create a pod using this registry:
Failed to pull image "kube-registry.kube-system.svc.cluster.local/myuser/myimage": rpc error: code = Unknown desc = Error response from daemon: Get https://kube-registry.kube-system.svc.cluster.local/v1/_ping: dial tcp: lookup kube-registry.kube-system.svc.cluster.local: no such host
Somehow, even with the kube-dns address explicitly given to both dockerd and kubelet, pulling images from the registry service fails because of name resolution. What am I missing?
Question: "I am trying to make my kube-cluster pull from a registry running inside itself." (Note I plan to going to edit the title of your question to clarify slightly / make it easier to search)
Short Answer: You can't*
*Nuanced Answer: Technically it is possible, with hacks and a solid understanding of Kubernetes fundamentals. You'll probably want to avoid doing this unless you have a really good reason and fully understand my explanation of the fundamental issue and the workaround, as this is an advanced use case that will require debugging to force it to work. It's complicated and nuanced enough to make step-by-step directions difficult, but I can give you a solid idea of the fundamental issue you ran into that makes this challenging, and high-level guidance on how to pull off what you're trying to do anyway.
Why you can't / the fundamental issue you ran into:
In Kubernetes land, 3 networks tend to exist: the Internet, the LAN, and the inner cluster network.
(Resource that goes into more depth: https://oteemo.com/kubernetes-networking-and-services-101/)
AND these 3 networks each have their own DNS / there are 3 layers of DNS.
Here's the gotcha you're running into: image pulls don't happen from inside the inner cluster network. They're done by the container runtime (dockerd in your case) running directly on the node, and the Docker daemon resolves registry names through the node's own resolver (/etc/resolv.conf, i.e. LAN DNS), not the inner-cluster DNS. dockerd's --dns flag only sets the DNS servers handed to the containers it runs, and kubelet's --cluster-dns flag only sets the DNS handed to pods; neither affects the daemon's own lookups. Since names like kube-registry.kube-system.svc.cluster.local only exist in the inner-cluster layer of DNS, the pull fails with "no such host".
Workaround options / hacks you can use to force what you're trying to do to work:
Option 1.) (I don't suggest this; it has extra-difficult chicken-and-egg issues. Sharing for information purposes only.)
Host an additional instance of CoreDNS as a LAN-facing DNS server on Kubernetes. Expose the registry and the 2nd instance of CoreDNS to the LAN via explicit NodePorts (using static service manifests so they'll come up with predictable/static NodePorts, vs. random NodePorts in the range 30000-32767), so they're routable from the LAN. (I suggest NodePorts over LBs here as one less dependency / thing that can go wrong.) Have the 2nd instance of CoreDNS use your LAN router / LAN DNS as its upstream DNS server. Reconfigure the OS to use the LAN-facing CoreDNS as its DNS server.
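A minimal sketch of what that LAN-facing DNS Service could look like (the coredns-lan name and label are hypothetical, and the pinned 30053 port is illustrative):

apiVersion: v1
kind: Service
metadata:
  name: coredns-lan              # hypothetical name for the 2nd coredns instance
  namespace: kube-system
spec:
  type: NodePort
  selector:
    app: coredns-lan             # assumes the 2nd coredns deployment carries this label
  ports:
  - name: dns-udp
    protocol: UDP
    port: 53
    targetPort: 53
    nodePort: 30053              # explicitly pinned so it survives recreation
  - name: dns-tcp
    protocol: TCP
    port: 53
    targetPort: 53
    nodePort: 30053              # reusing the number is fine; allocations are per protocol

Note the extra pain point: plain /etc/resolv.conf has no syntax for a non-53 port, so you'd also need something like an iptables redirect or a small forwarder on every machine that should use it, which is part of why I don't suggest this option.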
Option 2.) More reasonable, and what Trow does:
What is trow: https://thenewstack.io/trow-a-container-registry-to-run-inside-a-kubernetes-cluster/
Proof they use the /etc/hosts method: https://github.com/ContainerSolutions/trow/blob/main/QUICK-INSTALL.md
Pay ~$12 for a domain, some-dns-name.tld.
Use the cert-manager Kubernetes operator, or a standalone Certbot Docker container, plus proof you own the domain, to get an HTTPS cert for registry.some-dns-name.tld from Let's Encrypt for free. Then configure your inner-cluster-hosted registry to use this HTTPS cert.
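With cert-manager, the request could look roughly like this (a sketch, assuming you've already created a Let's Encrypt ClusterIssuer named letsencrypt-prod; the names and namespace are placeholders):

apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: registry-tls             # placeholder name
  namespace: registry            # wherever your registry runs
spec:
  secretName: registry-tls       # the Secret your registry pod mounts for its cert/key
  dnsNames:
  - registry.some-dns-name.tld
  issuerRef:
    name: letsencrypt-prod       # assumes this ClusterIssuer exists
    kind: ClusterIssuer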
Expose the registry hosted in the cluster to the LAN using a NodePort service with an explicitly pinned, convention-based port number, like 32443 (a manifest sketch follows the list below).
Why NodePort and not an LB? There are 3 reasons NodePort is better than LB for this scenario:
1.) The implementation of a Service of type LoadBalancer differs between deployment environments and Kubernetes distributions, while type NodePort is universal.
2.) If the LB changes, you have to update every node's /etc/hosts file to point to "LB_IP registry.some-dns-name.tld", AND you have to know the LB IP, which isn't always known in advance, meaning you'd have to follow some order of operations. If you use a service of type NodePort, you can add the localhost IP entry to every node's /etc/hosts, so it looks like "127.0.0.1 registry.some-dns-name.tld"; it's well known, reusable, and simplifies the order of operations.
3.) If you ever need to change where your cluster is hosted, you can arrange it so you make the change in 1 centralized location, even in scenarios where you have no access to or control over LAN DNS. You can craft a Service that points to a statically defined IP or an external name (which could exist outside the cluster), and have the NodePort service point to that statically defined service.
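Here's a sketch of the pinned NodePort Service from the step above (the app: registry label and the registry's 5000 listen port are assumptions about your deployment):

apiVersion: v1
kind: Service
metadata:
  name: registry
  namespace: registry            # placeholder namespace
spec:
  type: NodePort
  selector:
    app: registry                # assumes your registry deployment carries this label
  ports:
  - name: https
    port: 443
    targetPort: 5000             # assumes the registry container listens on 5000
    nodePort: 32443              # explicitly pinned convention-based port

For reason 3, you could later drop the selector and manage an Endpoints object by hand that points at a static IP outside the cluster; the NodePort stays 32443 and nothing on the nodes has to change.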
Add "127.0.0.1 registry.some-dns-name.tld" to /etc/hosts of every node in the cluster.
Set your YAML manifests to pull from registry.some-dns-name.tld:32443, or configure containerd/CRI-O's registry-mirroring feature to map registry.some-dns-name.tld:32443 to whatever entries are being mirrored on your local registry.
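A pod manifest pulling through this setup would reference the pinned port explicitly (image name and tag are illustrative):

apiVersion: v1
kind: Pod
metadata:
  name: myapp                    # illustrative
spec:
  containers:
  - name: myapp
    image: registry.some-dns-name.tld:32443/myuser/myimage:latest

The runtime connects to 127.0.0.1:32443 thanks to the /etc/hosts entry, but validates the TLS cert against the hostname, which is why the Let's Encrypt cert keeps the pull trusted without any insecure-registry flags.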
There are 2 more solvable chicken-and-egg problems to deal with. The 1st chicken-and-egg problem is that Kubernetes and the registry will both likely need access to container images to even get this far; the 2nd is that the registry's own image obviously can't come from the registry itself, so the bootstrap images have to be pre-loaded onto the nodes or pulled from an external registry the first time around.
Another solution would be to add the kube-dns IP to resolv.conf:
echo "nameserver $(kubectl -n kube-system get svc kube-dns -o jsonpath='{.spec.clusterIP}')" >> /etc/resolv.conf
The CoreDNS service is exposed with a static IP, so there's no need to keep it updated.
I can confirm it works on Ubuntu 18.04, despite the fact that resolv.conf is generated by systemd-resolved. No additional DNS configuration was required. The services are available by FQDN only:
root@dev:~# nslookup harbor.default.svc.cluster.local
;; Got SERVFAIL reply from 127.0.0.53, trying next server
Server: 10.96.0.10
Address: 10.96.0.10#53
Name: harbor.default.svc.cluster.local
Address: 10.109.118.191
;; Got SERVFAIL reply from 127.0.0.53, trying next server