I have a K8s cluster created with kubeadm that consists of a master node and two workers.
I am following this documentation article regarding the etcd backup: https://kubernetes.io/docs/tasks/administer-cluster/configure-upgrade-etcd/#backing-up-an-etcd-cluster
I have to use etcdctl to back up the etcd database, so I exec into the etcd pod running on the master node to do it from there: kubectl exec -it -n kube-system etcd-ip-x-x-x-x -- sh
NOTE: The master node hosts the etcd database at /var/lib/etcd, which is mounted into the pod as a volumeMount at the same path (/var/lib/etcd).
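For reference, this mount can be checked from the pod spec itself; a quick sketch (the pod name is the placeholder from above, and the volume name may differ depending on the kubeadm version):
# Show the volume mounts of the etcd pod; the data volume should map to /var/lib/etcd
kubectl -n kube-system describe pod etcd-ip-x-x-x-x | grep -A 5 "Mounts:"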
Following the doc I run: ETCDCTL_API=3 etcdctl --endpoints 127.0.0.1:2379 snapshot save snapshotdb
and it returns the following error:
Error: rpc error: code = 13 desc = transport: write tcp 127.0.0.1:44464->127.0.0.1:2379: write: connection reset by peer
What is the problem here?
If the majority of etcd members have permanently failed, the etcd cluster is considered failed. In this scenario, Kubernetes cannot make any changes to its current state. Although the scheduled pods might continue to run, no new pods can be scheduled.
By default, kubeadm runs a local etcd instance on each control plane node. It is also possible to treat the etcd cluster as external and provision etcd instances on separate hosts.
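A quick way to confirm both points on a kubeadm cluster: the local etcd instance shows up as a static pod in kube-system, and its manifest (kubeadm's default path is assumed below) should show an https client URL with client certificate authentication enabled, which is directly relevant to the error above:
# The local etcd instance kubeadm created runs as a static pod in kube-system
kubectl -n kube-system get pods | grep etcd
# The static pod manifest should show an https listen-client-urls value and --client-cert-auth=true
grep -E "listen-client-urls|client-cert-auth" /etc/kubernetes/manifests/etcd.yaml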
A Kubernetes cluster stores all its data in etcd. Any change you make via kubectl create will cause an entry in etcd to be updated. Any node crashing or process dying causes values in etcd to be changed. The set of processes that make up Kubernetes uses etcd to store data and to notify each other of changes.
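As a quick illustration, once the TLS flags from the working command below are supplied, you can read one of the keys the API server writes; the key path here assumes the default /registry prefix used by kube-apiserver:
# Read the default namespace key straight from etcd (assumes the default /registry prefix)
ETCDCTL_API=3 etcdctl --endpoints https://127.0.0.1:2379 \
  --cacert=/etc/kubernetes/pki/etcd/ca.crt \
  --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt \
  --key=/etc/kubernetes/pki/etcd/healthcheck-client.key \
  get /registry/namespaces/default --keys-only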
I managed to make it work by adding the endpoint scheme and the certificate information to the command. kubeadm configures etcd to serve its client API over HTTPS with client certificate authentication, which is why the plain 127.0.0.1:2379 connection was reset:
ETCDCTL_API=3 etcdctl --endpoints https://127.0.0.1:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt --key=/etc/kubernetes/pki/etcd/healthcheck-client.key snapshot save ./snapshot.db
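To sanity-check the result, etcdctl can report on the snapshot file locally; this reads the file directly, so no endpoint or certificate flags are needed. If you save the snapshot under /var/lib/etcd instead of the pod's working directory, it also lands on the master node's filesystem through the hostPath mount mentioned above.
# Verify the saved snapshot (local file read, no TLS flags required)
ETCDCTL_API=3 etcdctl --write-out=table snapshot status ./snapshot.db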