Apparently the image garbage collector (GC) in my Kubernetes cluster is failing to delete any images, and the server's disk is filling up.
Can you please point me to where I can find the ImageGC logs showing the error raised when it tries to delete images, or to the reason why this is happening? These are the events I see:
3m 5d 1591 ip-xxx.internal Node Warning FreeDiskSpaceFailed {kubelet ip-xxx.internal} failed to garbage collect required amount of images. Wanted to free 6312950988, but freed 0
3m 5d 1591 ip-xxx.internal Node Warning ImageGCFailed {kubelet ip-xxx.internal} failed to garbage collect required amount of images. Wanted to free 6312950988, but freed 0
Thanks!
To troubleshoot node disk pressure, first find out which files are taking up the most space. Kubernetes nodes usually run Linux, so this is easily done with the du command. You can either SSH into each Kubernetes node manually or run the check on every node with a DaemonSet.
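If du is not convenient (for example inside a minimal debug container), the same idea can be sketched in Python. This is only a rough equivalent: it sums apparent file sizes rather than allocated disk blocks, so the numbers can differ from du, and the function names are my own. Which directory to scan depends on your container runtime, so the path is left as a parameter.

```python
import os


def tree_size(path):
    """Total apparent size in bytes of all files below path (symlinks not followed)."""
    total = 0
    # onerror swallows unreadable directories instead of aborting the walk
    for dirpath, _dirnames, filenames in os.walk(path, onerror=lambda err: None):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                continue  # file vanished or is unreadable
    return total


def largest_children(root, top=10):
    """Return up to `top` (size, path) pairs for the direct children of root, biggest first."""
    entries = []
    with os.scandir(root) as it:
        for entry in it:
            try:
                if entry.is_dir(follow_symlinks=False):
                    size = tree_size(entry.path)
                else:
                    size = entry.stat(follow_symlinks=False).st_size
            except OSError:
                continue
            entries.append((size, entry.path))
    entries.sort(reverse=True)
    return entries[:top]
```

Calling largest_children on the runtime's data directory and then on its biggest child narrows down where the space went, much like repeated du runs would.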
Kubernetes checks for and deletes objects that no longer have owner references, like the pods left behind when you delete a ReplicaSet. When you delete an object, you can control whether Kubernetes deletes the object's dependents automatically, in a process called cascading deletion.
kubectl cordon marks a node as unschedulable, and kubectl uncordon marks it as schedulable again. kubectl drain first marks the given node unschedulable to prevent new pods from arriving, then evicts or deletes all pods on it except mirror pods (which cannot be deleted through the API server).
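A minimal sketch of that cycle, driven from Python, might look like the following. It assumes kubectl is on the PATH and already configured for the cluster; the helper names are made up for illustration, and by default the commands are only printed, not executed.

```python
import subprocess


def maintenance_commands(node):
    """Return the kubectl invocations for a cordon -> drain -> uncordon cycle."""
    return [
        # drain cordons the node itself; the explicit cordon is redundant but harmless
        ["kubectl", "cordon", node],
        # DaemonSet-managed pods are skipped rather than making the drain fail
        ["kubectl", "drain", node, "--ignore-daemonsets"],
        # run this last, after the disk has been cleaned up
        ["kubectl", "uncordon", node],
    ]


def run_step(argv, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    print(" ".join(argv))
    if not dry_run:
        subprocess.run(argv, check=True)
```

For example, looping over maintenance_commands("ip-xxx.internal") and passing each entry to run_step prints the three commands so you can review them before running anything for real.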
A valid owner reference consists of the object name and a UID within the same namespace as the dependent object. Kubernetes sets the value of this field automatically for objects that are dependents of other objects like ReplicaSets, DaemonSets, Deployments, Jobs and CronJobs, and ReplicationControllers.
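To make the owner reference idea concrete, here is a small sketch that inspects an object's metadata.ownerReferences field. The pod below is a hypothetical, hand-written example (names and UID are invented); in practice you would load the object from the JSON output of kubectl get.

```python
def owner_summary(obj):
    """Return (kind, name, uid) for each owner reference on a Kubernetes object dict."""
    refs = obj.get("metadata", {}).get("ownerReferences") or []
    return [(ref.get("kind"), ref.get("name"), ref.get("uid")) for ref in refs]


def is_orphan(obj):
    """True when the object lists no owners at all."""
    return not owner_summary(obj)


# Hypothetical pod owned by a ReplicaSet; all values are invented for illustration.
example_pod = {
    "kind": "Pod",
    "metadata": {
        "name": "web-abc12",
        "namespace": "default",
        "ownerReferences": [
            {
                "apiVersion": "apps/v1",
                "kind": "ReplicaSet",
                "name": "web-5d4f8c7b9",
                "uid": "00000000-0000-0000-0000-000000000000",
                "controller": True,
                "blockOwnerDeletion": True,
            }
        ],
    },
}
```

Note that this is the object garbage collector, which is separate from the kubelet's image garbage collection that the question is about.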
There may not be much in the way of logs (see this issue) but there may be Kubernetes event data. Look for events with the reason ImageGCFailed (their type is Warning).
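Once you have the event lines (for instance from kubectl get events --field-selector reason=ImageGCFailed), a short script can pull out how much the kubelet wanted to free versus what it actually freed. This is a sketch that assumes the message wording shown in the question; the function name is my own.

```python
import re

# Matches the message format seen in the FreeDiskSpaceFailed / ImageGCFailed events above
_GC_MESSAGE = re.compile(r"Wanted to free (\d+), but freed (\d+)")


def parse_gc_event(line):
    """Return (wanted_bytes, freed_bytes) from an event line, or None if it does not match."""
    match = _GC_MESSAGE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


event = (
    "3m 5d 1591 ip-xxx.internal Node Warning ImageGCFailed {kubelet ip-xxx.internal} "
    "failed to garbage collect required amount of images. "
    "Wanted to free 6312950988, but freed 0"
)
wanted, freed = parse_gc_event(event)
print(f"wanted {wanted / 2**30:.2f} GiB, freed {freed} bytes")
```

A freed value of 0 means the kubelet found no image it was allowed to remove, which usually points to every image still being in use by a container or to the space being consumed by something other than images.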
Alternatively, you could check the cAdvisor Prometheus metrics to see whether they expose any information about container garbage collection.
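I am not sure which metrics, if any, cover garbage collection directly, so a simple way to look is to dump the metrics endpoint and filter the metric names by keyword. The sketch below works on text in the Prometheus exposition format; the sample values are invented, and how you fetch the text from your kubelet is left out because it depends on your cluster setup.

```python
import re


def metric_names_matching(exposition_text, keywords):
    """Return the sorted metric names that contain any of the given keywords."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # the metric name ends at '{' (start of labels) or at whitespace (before the value)
        name = re.split(r"[{\s]", line, maxsplit=1)[0]
        if any(keyword in name for keyword in keywords):
            names.add(name)
    return sorted(names)


# Invented sample in Prometheus text format, for illustration only
sample = """\
# TYPE container_fs_usage_bytes gauge
container_fs_usage_bytes{device="/dev/xvda1",id="/"} 1.2e+10
container_cpu_usage_seconds_total{id="/"} 42
"""
print(metric_names_matching(sample, ["fs", "gc"]))
```

Filesystem usage metrics will not tell you why image GC freed nothing, but they can show which containers are consuming the disk.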
Docs on the GC feature in general: https://kubernetes.io/docs/concepts/cluster-administration/kubelet-garbage-collection/