
Kubernetes pod OOMKilled because kernel memory grows too much

I am working on a Java service that basically creates files in a network file system to store data. It runs in a Kubernetes cluster on Ubuntu 18.04 LTS. When we began to limit memory in Kubernetes (limits: memory: 3Gi), the pods started getting OOMKilled by Kubernetes.

At first we thought it was a memory leak in the Java process, but after deeper analysis we noticed that the problem is kernel memory. We verified that by watching the file /sys/fs/cgroup/memory/memory.kmem.usage_in_bytes
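To watch this yourself, you can read the cgroup accounting files from inside the container. This is a minimal sketch assuming cgroup v1 (the default on Ubuntu 18.04); the exact paths may differ on other setups:

# Total memory charged to the container's cgroup (user + kernel)
cat /sys/fs/cgroup/memory/memory.usage_in_bytes

# Kernel memory only (slab objects such as dentries and inodes, etc.)
cat /sys/fs/cgroup/memory/memory.kmem.usage_in_bytes

# Rough breakdown of page cache vs. RSS
cat /sys/fs/cgroup/memory/memory.stat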

We isolated the case to just creating files (no Java involved) with the dd command, like this:

for i in {1..50000}; do dd if=/dev/urandom bs=4096 count=1 of=file$i; done

With the dd command we saw the same thing happen (the kernel memory grew until OOM). After Kubernetes restarted the pod, a kubectl describe pod (command sketched below the list) reported:

  • Last State: Terminated
  • Reason: OOMKilled
  • Exit Code: 143
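(For reference, the state above was read with something like the following; the pod name is a placeholder.)

kubectl describe pod <pod-name>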

Creating files causes the kernel memory to grow, and deleting those files causes it to decrease. But our service stores data, so it creates a lot of files continuously, until the pod is killed and restarted because it was OOMKilled.

We tested limiting the kernel memory using a standalone Docker container with the --kernel-memory parameter, and it worked as expected: the kernel memory grew up to the limit and did not rise any further. But we did not find any way to do that in a Kubernetes cluster. Is there a way to limit the kernel memory in a Kubernetes environment? Why does creating files cause the kernel memory to grow, and why is it not released?
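For comparison, here is a minimal sketch of the standalone Docker test described above; the image name and limit values are only illustrative:

# Cap kernel memory for a standalone container (illustrative values)
docker run --rm -it --memory=3g --kernel-memory=512m ubuntu:18.04 bash

# Inside the container, reproduce the file creation and watch kmem usage level off at the limit
for i in {1..50000}; do dd if=/dev/urandom bs=4096 count=1 of=file$i; done
cat /sys/fs/cgroup/memory/memory.kmem.usage_in_bytes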

Asked Dec 12 '18 by Pablo Hadziatanasiu

People also ask

What is oom killed in Kubernetes?

OOMKilled is not actually native to Kubernetes; it is a feature of the Linux kernel, known as the OOM Killer, which Kubernetes uses to manage container lifecycles. The OOM Killer mechanism monitors node memory and selects processes that are taking up too much memory and should be killed.

How much RAM do I need for Kubernetes?

Each node in your cluster must have at least 300 MiB of memory.

What happens when pod hits CPU limit?

If a container attempts to exceed the specified CPU limit, the system will throttle the container.

Why is Kubernetes pod restarting?

A restarting container can indicate problems with memory (see the Out of Memory section), CPU usage, or just an application exiting prematurely. If a container is being restarted because of CPU usage, try increasing the requested and limit amounts for CPU in the pod spec.


1 Answer

Thanks for all this info, it was very useful!

In my app, I solved this by creating a sidecar container that runs a cron job every 5 minutes with the following command:

echo 3 > /proc/sys/vm/drop_caches

(note that the sidecar container needs to run in privileged mode)

It works nicely and has the advantage of being predictable: every 5 minutes, your memory cache will be cleared.
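A minimal sketch of what that sidecar's loop could look like, assuming a privileged container with a POSIX shell (the 5-minute interval mirrors the cron schedule above):

#!/bin/sh
# Drop the page cache plus dentries and inodes every 5 minutes.
# The container must run privileged so /proc/sys/vm/drop_caches is writable.
while true; do
    echo 3 > /proc/sys/vm/drop_caches
    sleep 300
done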

Answered Sep 24 '22 by Cyrille99