I'm a noob with Kubernetes. I'm trying to follow some recipes to get a small cluster up and running, but I'm having troubles ...
I have a master and (4) nodes, all running Ubuntu 16.04
installed docker on all nodes:
$ sudo apt-get update
$ sudo apt-get install -y \
apt-transport-https \
ca-certificates \
curl \
software-properties-common
$ sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add -
$ sudo add-apt-repository \
"deb https://download.docker.com/linux/$(. /etc/os-release; echo "$ID") \
$(lsb_release -cs) \
stable"
$ sudo apt-get update && apt-get install -y docker-ce=$(apt-cache madison docker-ce | grep 17.03 | head -1 | awk '{print $3}')
$ sudo docker version
Client:
Version: 17.12.1-ce
API version: 1.35
Go version: go1.9.4
Git commit: 7390fc6
Built: Tue Feb 27 22:17:40 2018
OS/Arch: linux/amd64
Server:
Engine:
Version: 17.12.1-ce
API version: 1.35 (minimum version 1.12)
Go version: go1.9.4
Git commit: 7390fc6
Built: Tue Feb 27 22:16:13 2018
OS/Arch: linux/amd64
Experimental: false
turned off swap on all nodes
$ sudo swapoff -a
commented out the swap mounts in /etc/fstab
$ sudo vi /etc/fstab
$ mount -a
installed kubeadm & kubectl on all nodes:
$ sudo curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
$ sudo cat <<EOF >/etc/apt/sources.list.d/kubernetes.list
deb http://apt.kubernetes.io/ kubernetes-xenial main
EOF
$ sudo apt-get update
$ sudo apt-get install -y kubeadm kubectl
$ kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.4",
GitCommit:"bee2d1505c4fe820744d26d41ecd3fdd4a3d6546", GitTreeState:"clean",
BuildDate:"2018-03-12T16:21:35Z", GoVersion:"go1.9.3", Compiler:"gc",
Platform:"linux/amd64"}
downloaded and unpacked this into /usr/local/bin on master and all nodes: https://github.com/kubernetes-incubator/cri-tools/releases
installed etcd 3.3.0 on all nodes:
$ sudo groupadd --system etcd
$ sudo useradd --home-dir "/var/lib/etcd" \
--system \
--shell /bin/false \
-g etcd \
etcd
$ sudo mkdir -p /etc/etcd
$ sudo chown etcd:etcd /etc/etcd
$ sudo mkdir -p /var/lib/etcd
$ sudo chown etcd:etcd /var/lib/etcd
$ sudo rm -rf /tmp/etcd && mkdir -p /tmp/etcd
$ sudo curl -L https://github.com/coreos/etcd/releases/download/v3.3.0/etcd- v3.3.0-linux-amd64.tar.gz -o /tmp/etcd-3.3.0-linux-amd64.tar.gz
$ sudo tar xzvf /tmp/etcd-3.3.0-linux-amd64.tar.gz -C /tmp/etcd --strip-components=1
$ sudo cp /tmp/etcd/etcd /usr/bin/etcd
$ sudo cp /tmp/etcd/etcdctl /usr/bin/etcdctl
noted the IP of the master:
$ sudo ifconfig -a eth0
eth0 Link encap:Ethernet HWaddr 1e:00:51:00:00:28
inet addr:172.20.43.30 Bcast:172.20.43.255 Mask:255.255.254.0
inet6 addr: fe80::27b5:3d06:94c9:9d0/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:3194023 errors:0 dropped:0 overruns:0 frame:0
TX packets:3306456 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:338523846 (338.5 MB) TX bytes:3682444019 (3.6 GB)
initialized kubernetes on the master:
$ sudo kubeadm init --pod-network-cidr=172.20.43.0/16 \
--apiserver-advertise-address=172.20.43.30 \
--ignore-preflight-errors=cri \
--kubernetes-version stable-1.9
[init] Using Kubernetes version: v1.9.4
[init] Using Authorization modes: [Node RBAC]
[preflight] Running pre-flight checks.
[WARNING CRI]: unable to check if the container runtime at "/var/run/dockershim.sock" is running: exit status 1
[certificates] Generated ca certificate and key.
[certificates] Generated apiserver certificate and key.
[certificates] apiserver serving cert is signed for DNS names [jenkins-kube- master kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local] and IPs [10.96.0.1 172.20.43.30]
[certificates] Generated apiserver-kubelet-client certificate and key.
[certificates] Generated sa key and public key.
[certificates] Generated front-proxy-ca certificate and key.
[certificates] Generated front-proxy-client certificate and key.
[certificates] Valid certificates and keys now exist in "/etc/kubernetes/pki"
[kubeconfig] Wrote KubeConfig file to disk: "admin.conf"
[kubeconfig] Wrote KubeConfig file to disk: "kubelet.conf"
[kubeconfig] Wrote KubeConfig file to disk: "controller-manager.conf"
[kubeconfig] Wrote KubeConfig file to disk: "scheduler.conf"
[controlplane] Wrote Static Pod manifest for component kube-apiserver to "/etc/kubernetes/manifests/kube-apiserver.yaml"
[controlplane] Wrote Static Pod manifest for component kube-controller-manager to "/etc/kubernetes/manifests/kube-controller-manager.yaml"
[controlplane] Wrote Static Pod manifest for component kube-scheduler to "/etc/kubernetes/manifests/kube-scheduler.yaml"
[etcd] Wrote Static Pod manifest for a local etcd instance to "/etc/kubernetes/manifests/etcd.yaml"
[init] Waiting for the kubelet to boot up the control plane as Static Pods from directory "/etc/kubernetes/manifests".
[init] This might take a minute or longer if the control plane images have to be pulled.
[apiclient] All control plane components are healthy after 37.502640 seconds
[uploadconfig] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[markmaster] Will mark node jenkins-kube-master as master by adding a label and a taint
[markmaster] Master jenkins-kube-master tainted and labelled with key/value: node-role.kubernetes.io/master=""
[bootstraptoken] Using token: 6be4b1.9a8dacf89f71e53c
[bootstraptoken] Configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstraptoken] Configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstraptoken] Configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstraptoken] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[addons] Applied essential addon: kube-dns
[addons] Applied essential addon: kube-proxy
Your Kubernetes master has initialized successfully!
To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
https://kubernetes.io/docs/concepts/cluster-administration/addons/
You can now join any number of machines by running the following on each node
as root:
kubeadm join --token 6be4b1.9a8dacf89f71e53c 172.20.43.30:6443 --discovery-token-ca-cert-hash sha256:524d29b032d7bfd319b147ab03a936bd429805258425bccca749de71bcb1efaf
on the master node:
$ sudo cp /etc/kubernetes/admin.conf $HOME/.kube/config
$ sudo chown $(id -u):$(id -g) $HOME/.kube/config
$ export KUBECONFIG=$HOME/.kube/config
$ echo "export KUBECONFIG=$HOME/.kube/config" | tee -a ~/.bashrc
setup flannel for networking on master:
$ sudo kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
clusterrole "flannel" created
clusterrolebinding "flannel" created
serviceaccount "flannel" created
configmap "kube-flannel-cfg" created
daemonset "kube-flannel-ds" created
$ sudo kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/master/Documentation/k8s-manifests/kube-flannel-rbac.yml
clusterrole "flannel" configured
clusterrolebinding "flannel" configured
join the nodes to the cluster running this on each:
$ sudo kubeadm join --token 6be4b1.9a8dacf89f71e53c 172.20.43.30:6443 \
--discovery-token-ca-cert-hash sha256:524d29b032d7bfd319b147ab03a936bd429805258425bccca749de71bcb1efaf \
--ignore-preflight-errors=cri
installed the dashboard on the master:
$ kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/master/src/deploy/recommended/kubernetes-dashboard.yaml
secret "kubernetes-dashboard-certs" created
serviceaccount "kubernetes-dashboard" created
role "kubernetes-dashboard-minimal" created
rolebinding "kubernetes-dashboard-minimal" created
deployment "kubernetes-dashboard" created
service "kubernetes-dashboard" created
started the proxy:
$ kubectl proxy
Starting to serve on 127.0.0.1:8001
opened another ssh to master with -L 8001:127.0.0.1:8001 and opened a local browser window for http://localhost:8001/ui
it redirects to http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/ and says:
{
"kind": "Status",
"apiVersion": "v1",
"metadata": {
},
"status": "Failure",
"message": "no endpoints available for service \"https:kubernetes- dashboard:\"",
"reason": "ServiceUnavailable",
"code": 503
}
checking the pods ...
$ sudo kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
default guids-74487d79cf-zsj8q 1/1 Running 0 4h
kube-system etcd-jenkins-kube-master 1/1 Running 1 21h
kube-system kube-apiserver-jenkins-kube-master 1/1 Running 1 21h
kube-system kube-controller-manager-jenkins-kube-master 1/1 Running 2 21h
kube-system kube-dns-6f4fd4bdf-7pr9q 3/3 Running 0 1d
kube-system kube-flannel-ds-pvk8m 1/1 Running 0 4h
kube-system kube-flannel-ds-q4fsl 1/1 Running 0 4h
kube-system kube-flannel-ds-qhxn6 1/1 Running 0 21h
kube-system kube-flannel-ds-tkspz 1/1 Running 0 4h
kube-system kube-flannel-ds-vgqsb 1/1 Running 0 4h
kube-system kube-proxy-7np9b 1/1 Running 0 4h
kube-system kube-proxy-9lx8h 1/1 Running 1 1d
kube-system kube-proxy-f46d8 1/1 Running 0 4h
kube-system kube-proxy-fdtx9 1/1 Running 0 4h
kube-system kube-proxy-kmnjf 1/1 Running 0 4h
kube-system kube-scheduler-jenkins-kube-master 1/1 Running 1 21h
kube-system kubernetes-dashboard-5bd6f767c7-xf42n 0/1 CrashLoopBackOff 53 4h
checking the log ...
$ sudo kubectl logs kubernetes-dashboard-5bd6f767c7-xf42n --namespace=kube-system
2018/03/20 17:56:25 Starting overwatch
2018/03/20 17:56:25 Using in-cluster config to connect to apiserver
2018/03/20 17:56:25 Using service account token for csrf signing
2018/03/20 17:56:25 No request provided. Skipping authorization
2018/03/20 17:56:55 Error while initializing connection to Kubernetes apiserver.
This most likely means that the cluster is misconfigured (e.g., it has invalid
apiserver certificates or service accounts configuration) or the
--apiserver-host param points to a server that does not exist.
Reason: Get https://10.96.0.1:443/version: dial tcp 10.96.0.1:443: i/o timeout
Refer to our FAQ and wiki pages for more information: https://github.com/kubernetes/dashboard/wiki/FAQ
I find this reference to 10.96.0.1 rather odd. I don't have that on my network anywhere that I'm aware of.
I put the output of sudo kubectl describe pod --namespace=kube-system
on pastebin:
https://pastebin.com/cPppPkRw
Thanks in advance for any pointers.
-Steve Maring
Orlando, FL
Conclusion. CrashLoopBackOff is a Kubernetes state representing a restart loop that is happening in a Pod: a container in the Pod is started, but crashes and is then restarted, over and over again. Kubernetes will wait an increasing back-off time between restarts to give you a chance to fix the error.
CrashLoopBackOff is a status message that indicates one of your pods is in a constant state of flux—one or more containers are failing and restarting repeatedly. This typically happens because each pod inherits a default restartPolicy of Always upon creation. Always-on implies each container that fails has to restart.
--service-cluster-ip-range=10.96.0.0/12
Line 76 of your pastebin shows the Service CIDR to be that, which squares with how kubernetes thinks of the world: .1
in the Service CIDR is always kubernetes (IIRC kube-dns
gets a pretty low IP assignment, too, but I can't recall if it is always fixed like the kubernetes one is)
You'll want to either change both the Service and Pod CIDRs to fit within the 10.244.0.0/16 subnet that flannel created as a side-effect of deploying that yaml, or change its ConfigMap
(err, at your peril now that the network has already been pushed into etcd
) to align with the Service and Pod CIDR specified to your apiserver
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With