I have three master nodes, each with an 80 GB disk. Recently I ran into this problem:
Normal Pulling 52s (x2 over 6m17s) kubelet, 192.168.10.37 pulling image "gcr.io/kubeflow-images-public/tensorflow-serving-1.8gpu:latest"
Warning Evicted 8s (x5 over 4m19s) kubelet, 192.168.10.37 The node was low on resource: ephemeral-storage.
-> "The node was low on resource: ephemeral-storage."
The disk usage on the affected node looks like this:
Filesystem Size Used Available Use% Mounted on
overlay 7.4G 5.2G 1.8G 74% /
tmpfs 3.9G 0 3.9G 0% /dev
tmpfs 3.9G 0 3.9G 0% /sys/fs/cgroup
/dev/vda1 7.4G 5.2G 1.8G 74% /opt
/dev/vda1 7.4G 5.2G 1.8G 74% /mnt
/dev/vda1 7.4G 5.2G 1.8G 74% /media
/dev/vda1 7.4G 5.2G 1.8G 74% /home
none 3.9G 1.5M 3.9G 0% /run
/dev/vda1 7.4G 5.2G 1.8G 74% /etc/resolv.conf
/dev/vda1 7.4G 5.2G 1.8G 74% /etc/selinux
/dev/vda1 7.4G 5.2G 1.8G 74% /etc/logrotate.d
/dev/vda1 7.4G 5.2G 1.8G 74% /usr/lib/modules
devtmpfs 3.9G 0 3.9G 0% /host/dev
shm 64.0M 0 64.0M 0% /host/dev/shm
/dev/vda1 7.4G 5.2G 1.8G 74% /usr/lib/firmware
none 3.9G 1.5M 3.9G 0% /var/run
/dev/vda1 7.4G 5.2G 1.8G 74% /etc/docker
/dev/vda1 7.4G 5.2G 1.8G 74% /usr/sbin/xtables-multi
/dev/vda1 7.4G 5.2G 1.8G 74% /var/log
/dev/vda1 7.4G 5.2G 1.8G 74% /etc/hosts
/dev/vda1 7.4G 5.2G 1.8G 74% /etc/hostname
shm 64.0M 0 64.0M 0% /dev/shm
/dev/vda1 7.4G 5.2G 1.8G 74% /usr/bin/system-docker-runc
/dev/vda1 7.4G 5.2G 1.8G 74% /var/lib/boot2docker
/dev/vda1 7.4G 5.2G 1.8G 74% /var/lib/docker
/dev/vda1 7.4G 5.2G 1.8G 74% /var/lib/kubelet
/dev/vda1 7.4G 5.2G 1.8G 74% /usr/bin/ros
/dev/vda1 7.4G 5.2G 1.8G 74% /var/lib/rancher
/dev/vda1 7.4G 5.2G 1.8G 74% /usr/bin/system-docker
/dev/vda1 7.4G 5.2G 1.8G 74% /usr/share/ros
/dev/vda1 7.4G 5.2G 1.8G 74% /etc/ssl/certs/ca-certificates.crt.rancher
/dev/vda1 7.4G 5.2G 1.8G 74% /var/lib/rancher/conf
/dev/vda1 7.4G 5.2G 1.8G 74% /var/lib/rancher/cache
devtmpfs 3.9G 0 3.9G 0% /dev
shm 64.0M 0 64.0M 0% /dev/shm
/dev/vda1 7.4G 5.2G 1.8G 74% /var/lib/docker/overlay2
overlay 7.4G 5.2G 1.8G 74% /var/lib/docker/overlay2/0181228584d6531d794879db05bf1b0c0184ed7a4818cf6403084c19d77ea7a0/merged
overlay 7.4G 5.2G 1.8G 74% /var/lib/docker/overlay2/655a92612d5b43207cb50607577a808065818aa4d6442441d05b6dd55cab3229/merged
overlay 7.4G 5.2G 1.8G 74% /var/lib/docker/overlay2/b0d8200c48b07df410d9f476dc60571ab855e90f4ab1eb7de1082115781b48bb/merged
overlay 7.4G 5.2G 1.8G 74% /var/lib/docker/overlay2/f36e7d814dcb59c5a9a5d15179543f1a370f196dc88269d21a68fb56555a86e4/merged
overlay 7.4G 5.2G 1.8G 74% /var/lib/docker/overlay2/842157b72f9155a86d2e4ee2547807c4a70c06320f5eb6b2ffdb00d2756a2662/merged
overlay 7.4G 5.2G 1.8G 74% /var/lib/docker/overlay2/cee5e99308a13a32ce64fdb853d2853c5805ce1eb71d0c050793ffaf8a000db9/merged
shm 64.0M 0 64.0M 0% /var/lib/docker/containers/6ee5a7ad205bf24f1795fd9374cd4a707887ca2edd6f7e1b4a7698f51361966c/shm
shm 64.0M 0 64.0M 0% /var/lib/docker/containers/79decf02c3a0eb6dd681c8f072f9717c15ba17fcb47d693fcfa1c392b3aef002/shm
shm 64.0M 0 64.0M 0% /var/lib/docker/containers/acc7d374f838256762e03aea4378b73de7a38c97b07af77d62ee01135cc1377b/shm
shm 64.0M 0 64.0M 0% /var/lib/docker/containers/46cb89b550bb1d5394fcbd66d2746f34064fb792a4a6b14d524d4f76a1710f7e/shm
shm 64.0M 0 64.0M 0% /var/lib/docker/containers/0db3a0057c9194329fbacc4d5d94ab40eb2babe06dbb180f72ad96c8ff721632/shm
shm 64.0M 0 64.0M 0% /var/lib/docker/containers/6c17379244983233c7516062979684589c24b661bc203e6e1d53904dd7de167f/shm
tmpfs 3.9G 12.0K 3.9G 0% /opt/rke/var/lib/kubelet/pods/ea5b0e7d-18d6-11e9-86c9-fa163ebea4e5/volumes/kubernetes.io~secret/canal-token-gcxzd
tmpfs 3.9G 12.0K 3.9G 0% /opt/rke/var/lib/kubelet/pods/eab6dac4-18d6-11e9-86c9-fa163ebea4e5/volumes/kubernetes.io~secret/cattle-token-lbpxh
tmpfs 3.9G 8.0K 3.9G 0% /opt/rke/var/lib/kubelet/pods/eab6dac4-18d6-11e9-86c9-fa163ebea4e5/volumes/kubernetes.io~secret/cattle-credentials
tmpfs 3.9G 12.0K 3.9G 0% /opt/rke/var/lib/kubelet/pods/5c672b02-18df-11e9-a246-fa163ebea4e5/volumes/kubernetes.io~secret/nginx-ingress-serviceaccount-token-vc522
overlay 7.4G 5.2G 1.8G 74% /var/lib/docker/overlay2/c29dc914ee801d2b36d4d2b688e5b060be6297665187f1001f9190fc9ace009d/merged
overlay 7.4G 5.2G 1.8G 74% /var/lib/docker/overlay2/0591531eb89d598a8ef9bf49c6c21ea8250ad08489372d3ea5dbf561d44c9340/merged
shm 64.0M 0 64.0M 0% /var/lib/docker/containers/c89f839b36e0f7317c78d806a1ffb24d43a21c472a2e8a734785528c22cce85b/shm
shm 64.0M 0 64.0M 0% /var/lib/docker/containers/33050b02fc38091003e6a18385446f48989c8f64f9a02c64e41a8072beea817c/shm
overlay 7.4G 5.2G 1.8G 74% /var/lib/docker/overlay2/a81da21f41c5c9eb2fb54ccdc187a26d5899f35933b4b701139d30f1af3860a4/merged
overlay 7.4G 5.2G 1.8G 74% /var/lib/docker/overlay2/f6d546b54d59a29526e4a9187fb75c22c194d28926fca5c9839412933c53ee9d/merged
shm 64.0M 0 64.0M 0% /var/lib/docker/containers/7b0f9471bc66513589e79cc733ed6d69d897270902ffba5c9747b668d0f43472/shm
overlay 7.4G 5.2G 1.8G 74% /var/lib/docker/overlay2/cae4765e9eb9004e1372b4b202e03a2a8d3880c918dbc27c676203eef7336080/merged
overlay 7.4G 5.2G 1.8G 74% /var/lib/docker/overlay2/81ee00944f4eb367d4dd06664a7435634916be55c1aa0329509f7a277a522909/merged
overlay 7.4G 5.2G 1.8G 74% /var/lib/docker/overlay2/7888843c2e76b5c3c342a765517ec06edd92b9eab25d26655b0f5812742aa790/merged
tmpfs 3.9G 12.0K 3.9G 0% /opt/rke/var/lib/kubelet/pods/c19a2ca3-18df-11e9-a246-fa163ebea4e5/volumes/kubernetes.io~secret/default-token-nzc2d
overlay 7.4G 5.2G 1.8G 74% /var/lib/docker/overlay2/4d1c7efa3af94c1bea63021b594704db4504d4d97f5c858bdb6fe697bdefff9b/merged
shm 64.0M 0 64.0M 0% /var/lib/docker/containers/e10b7da6d372d241bebcf838e2cf9e6d86ce29801a297a4e7278c7b7329e895d/shm
overlay 7.4G 5.2G 1.8G 74% /var/lib/docker/overlay2/50df5234e85a2854b27aa8c7a8e483ca755803bc8bf61c25060a6c14b50a932c/merged
I have already run docker system prune on all nodes, and reconfigured and restarted everything.
Could this be related to the fact that all the volumes show a size of 7.4 GB?
How can I increase the ephemeral storage?
Thanks a lot
You can set ephemeral-storage requests and limits per container in the Pod spec. In the usual example, the Pod requests a total of 10GiB (8GiB + 2GiB) of local ephemeral storage and enforces a limit of 12GiB; it also sets an emptyDir sizeLimit of 5GiB. These settings affect both how the scheduler places Pods and how the kubelet evicts them.
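A minimal sketch of a Pod spec along those lines (the numbers match the example above; the Pod name and images are placeholders I picked, not from the question):
$ cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: Pod
metadata:
  name: ephemeral-storage-demo        # hypothetical name
spec:
  containers:
  - name: app
    image: registry.k8s.io/pause:3.9  # placeholder image
    resources:
      requests:
        ephemeral-storage: "8Gi"
      limits:
        ephemeral-storage: "8Gi"
    volumeMounts:
    - name: scratch
      mountPath: /scratch
  - name: sidecar
    image: registry.k8s.io/pause:3.9  # placeholder image
    resources:
      requests:
        ephemeral-storage: "2Gi"
      limits:
        ephemeral-storage: "4Gi"
    volumeMounts:
    - name: scratch
      mountPath: /scratch
  volumes:
  - name: scratch
    emptyDir:
      sizeLimit: 5Gi
EOF
If a container exceeds its ephemeral-storage limit, or the emptyDir volume grows past its sizeLimit, the kubelet evicts the Pod.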
Kubernetes nodes have local ephemeral storage, backed by locally attached writable devices or, sometimes, by RAM. Pods typically use ephemeral storage for caching, scratch space and logs.
You can use /bin/df to monitor ephemeral storage usage on the volume(s) where ephemeral container data is located, typically /var/lib/kubelet and /var/lib/containers (or /var/lib/docker with the Docker runtime).
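For example (a sketch; on this RKE/RancherOS node the kubelet pod data also sits under /opt/rke/var/lib/kubelet, as the df output above shows):
$ df -h /var/lib/kubelet /var/lib/docker
# what the kubelet reports to the API server for this node:
$ kubectl describe node 192.168.10.37 | grep -i ephemeral-storage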
Could this be related to the fact that all the volumes show a size of 7.4 GB?
You really have a single volume, /dev/vda1, mounted at multiple mount points, not several separate 7.4 GB volumes.
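You can confirm that on the node itself, for example (a sketch; tool availability may vary on RancherOS):
$ lsblk /dev/vda
# or list every mount point backed by that one partition:
$ findmnt -n -o TARGET -S /dev/vda1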
Not sure where you are running Kubernetes, but that looks like a virtual disk (in a VM). You can increase its size in the VM configuration or at your cloud provider, and then run one of these to grow the filesystem:
ext4:
$ resize2fs /dev/vda1
xfs (takes the mount point of the mounted filesystem):
$ xfs_growfs /
Other filesystems will have their own commands too.
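As a rough end-to-end sketch for an ext4 root filesystem on /dev/vda1 (growpart comes from the cloud-utils/cloud-guest-utils package and may not be preinstalled; device names are taken from the df output above):
# after enlarging the virtual disk in the VM / cloud configuration:
$ growpart /dev/vda 1      # grow partition 1 to fill the disk
$ resize2fs /dev/vda1      # grow the ext4 filesystem to fill the partition
$ df -h /var/lib/docker    # verify the new size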
The most common cause of running out of disk space on the master(s) is log files, so if that's the case you can set up a cleanup job for them or change the log rotation/size configuration.
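For Docker container logs specifically, one option is capping the json-file log driver, a sketch (the values are arbitrary, and the way to apply daemon configuration may differ on RancherOS):
$ cat /etc/docker/daemon.json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  }
}
# restart the Docker daemon afterwards; existing containers keep their
# old log settings until they are recreated.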