Cannot create a deployment that requests more than 2Gi memory

My deployment pod was evicted due to memory consumption:

  Type     Reason   Age   From                                             Message
  ----     ------   ----  ----                                             -------
  Warning  Evicted  1h    kubelet, gke-XXX-default-pool-XXX  The node was low on resource: memory. Container my-container was using 1700040Ki, which exceeds its request of 0.
  Normal   Killing  1h    kubelet, gke-XXX-default-pool-XXX  Killing container with id docker://my-container:Need to kill Pod

I tried to grant it more memory by adding the following to my deployment yaml:

apiVersion: apps/v1
kind: Deployment
...
spec:
  ...
  template:
    ...
    spec:
      ...
      containers:

      - name: my-container
        image: my-container:latest
        ...
        resources:
          requests:
            memory: "3Gi"

However, it failed to deploy:

  Type     Reason             Age               From                Message
  ----     ------             ----              ----                -------
  Warning  FailedScheduling   4s (x5 over 13s)  default-scheduler   0/3 nodes are available: 3 Insufficient memory.
  Normal   NotTriggerScaleUp  0s                cluster-autoscaler  pod didn't trigger scale-up (it wouldn't fit if a new node is added)

The deployment defines only one container.

I'm using GKE with autoscaling; the nodes in the default (and only) pool have 3.75 GB of memory each.

From trial and error, I found that the maximum memory I can request is "2Gi". Why can't I use the full 3.75 GB of a node with a single pod? Do I need nodes with more memory?

Mugen asked Feb 20 '19 12:02



1 Answer

Even though the node has 3.75 GB of total memory, it is very likely that not all of it is allocatable to pods.

Kubernetes reserves some capacity for system services, so that containers cannot consume so many resources on the node that they disrupt those services.

From the docs:

Kubernetes nodes can be scheduled to Capacity. Pods can consume all the available capacity on a node by default. This is an issue because nodes typically run quite a few system daemons that power the OS and Kubernetes itself. Unless resources are set aside for these system daemons, pods and system daemons compete for resources and lead to resource starvation issues on the node.

Because you are using GKE, which does not use the Kubernetes defaults, run the following command to see how much allocatable resource the node actually has:

kubectl describe node [NODE_NAME] | grep Allocatable -B 4 -A 3

From the GKE docs:

Allocatable resources are calculated in the following way:

Allocatable = Capacity - Reserved - Eviction Threshold

For memory resources, GKE reserves the following:

  • 25% of the first 4GB of memory
  • 20% of the next 4GB of memory (up to 8GB)
  • 10% of the next 8GB of memory (up to 16GB)
  • 6% of the next 112GB of memory (up to 128GB)
  • 2% of any memory above 128GB

GKE reserves an additional 100 MiB memory on each node for kubelet eviction.
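To make the arithmetic concrete, here is a rough sketch that applies only the tiers quoted above to the 3.75 GB node from the question. It assumes that "GB" in the quoted tiers means 1024 MiB; the function names are made up for illustration, and GKE's current rules may differ from this quote, so treat the numbers as an estimate and trust `kubectl describe node` over this calculation.

```python
# Estimate GKE allocatable memory from the reservation tiers quoted above.
# Assumption: "GB" in the quoted tiers is treated as 1024 MiB.
GB = 1024  # MiB

# (tier size in MiB, fraction reserved); None means "everything above".
TIERS = [
    (4 * GB, 0.25),    # 25% of the first 4GB
    (4 * GB, 0.20),    # 20% of the next 4GB (up to 8GB)
    (8 * GB, 0.10),    # 10% of the next 8GB (up to 16GB)
    (112 * GB, 0.06),  # 6% of the next 112GB (up to 128GB)
    (None, 0.02),      # 2% of any memory above 128GB
]
EVICTION_THRESHOLD_MIB = 100  # reserved on each node for kubelet eviction


def reserved_mib(capacity_mib):
    """Memory reserved by the tiered rule, in MiB."""
    reserved = 0.0
    remaining = capacity_mib
    for size, fraction in TIERS:
        if remaining <= 0:
            break
        chunk = remaining if size is None else min(remaining, size)
        reserved += chunk * fraction
        remaining -= chunk
    return reserved


def allocatable_mib(capacity_mib):
    """Allocatable = Capacity - Reserved - Eviction Threshold."""
    return capacity_mib - reserved_mib(capacity_mib) - EVICTION_THRESHOLD_MIB


capacity = 3.75 * GB  # the node size from the question
alloc = allocatable_mib(capacity)
print(f"allocatable: {alloc:.0f} MiB")  # allocatable: 2780 MiB
print("3Gi fits:", 3 * GB <= alloc)     # 3Gi fits: False
print("2Gi fits:", 2 * GB <= alloc)     # 2Gi fits: True
```

Under these assumptions roughly 2780 MiB is allocatable, so a 3Gi request can never be scheduled on this node type. Pods already running on the node (for example in kube-system) also count against allocatable memory, which is likely why the practical ceiling found by trial and error was closer to 2Gi.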

As the autoscaler message suggests, scaling the cluster will not solve the problem: every node in the pool has the same limited allocatable memory, and the pod requests more than that, so it would not fit on a newly added node either. You need nodes with more memory.

Diego Mendes answered Sep 23 '22 16:09
