
Pod in pending state due to Insufficient CPU


On my GCE Kubernetes cluster I can no longer create pods.

Warning  FailedScheduling  pod (www.caveconditions.com-f1be467e31c7b00bc983fbe5efdbb8eb-438ef) failed to fit in any node
fit failure on node (gke-prod-cluster-default-pool-b39c7f0c-c0ug): Insufficient CPU

Looking at the allocated resources on that node:

Non-terminated Pods:        (8 in total)
  Namespace    Name                                                               CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ---------    ----                                                               ------------  ----------  ---------------  -------------
  default      dev.caveconditions.com-n80z8                                       100m (10%)    0 (0%)      0 (0%)           0 (0%)
  default      lamp-cnmrc                                                         100m (10%)    0 (0%)      0 (0%)           0 (0%)
  default      mongo-2-h59ly                                                      200m (20%)    0 (0%)      0 (0%)           0 (0%)
  default      www.caveconditions.com-tl7pa                                       100m (10%)    0 (0%)      0 (0%)           0 (0%)
  kube-system  fluentd-cloud-logging-gke-prod-cluster-default-pool-b39c7f0c-c0ug  100m (10%)    0 (0%)      200Mi (5%)       200Mi (5%)
  kube-system  kube-dns-v17-qp5la                                                 110m (11%)    110m (11%)  120Mi (3%)       220Mi (5%)
  kube-system  kube-proxy-gke-prod-cluster-default-pool-b39c7f0c-c0ug             100m (10%)    0 (0%)      0 (0%)           0 (0%)
  kube-system  kubernetes-dashboard-v1.1.0-orphh                                  100m (10%)    100m (10%)  50Mi (1%)        50Mi (1%)
Allocated resources:
  (Total limits may be over 100%, i.e., overcommitted. More info: http://releases.k8s.io/HEAD/docs/user-guide/compute-resources.md)
  CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ------------  ----------  ---------------  -------------
  910m (91%)    210m (21%)  370Mi (9%)       470Mi (12%)

Sure, I have 91% of CPU requests allocated and cannot fit another 10% into it. But is it not possible to overcommit resources?

The actual CPU usage of the server averages about 10%.


It would be a shame if I cannot use more resources.

Chris asked Aug 10 '16 09:08


People also ask

What happens when pod hits CPU limit?

If a container attempts to exceed the specified limit, the system will throttle the container.

What is POD CPU usage?

Pod CPU use is the aggregate of the CPU use of all containers in a pod. Likewise, pod memory utilization refers to the total aggregate of memory used by all containers in a pod.


2 Answers

I recently had this same issue. After some research I found that GKE creates a default LimitRange that sets the default container CPU request to 100m. You can check this by running kubectl get limitrange -o=yaml, which will display something like this:

apiVersion: v1
items:
- apiVersion: v1
  kind: LimitRange
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"v1","kind":"LimitRange","metadata":{"annotations":{},"name":"limits","namespace":"default"},"spec":{"limits":[{"defaultRequest":{"cpu":"100m"},"type":"Container"}]}}
    creationTimestamp: 2017-11-16T12:15:40Z
    name: limits
    namespace: default
    resourceVersion: "18741722"
    selfLink: /api/v1/namespaces/default/limitranges/limits
    uid: dcb25a24-cac7-11e7-a3d5-42010a8001b6
  spec:
    limits:
    - defaultRequest:
        cpu: 100m
      type: Container
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

This default request is applied to every container that does not specify its own. So for instance, if you have a 4-core node (4000m) and each of your pods runs 2 containers, every pod reserves 200m and only around ~20 pods can be scheduled.
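The arithmetic behind that estimate can be sketched as follows (the 4000m allocatable figure is an assumption here and ignores what system pods already reserve):

```shell
#!/bin/sh
# Rough scheduling-capacity estimate: with the 100m default request per
# container and two containers per pod, each pod reserves 200m of CPU.
node_allocatable_m=4000   # assumed 4-core node, in millicores
default_request_m=100     # GKE's default container CPU request
containers_per_pod=2
request_per_pod_m=$((default_request_m * containers_per_pod))
max_pods=$((node_allocatable_m / request_per_pod_m))
echo "about $max_pods pods fit before scheduling fails"   # → about 20 pods
```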

The "fix" here is to change the default LimitRange to set your own limits and then delete the old pods so they are recreated with the updated values, or to set resource requests directly on the pods when creating them.
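As a sketch, overriding the default could look like this (the 50m value is an arbitrary illustration, not a recommendation from the answer):

```shell
# Replace the default LimitRange in the "default" namespace so new containers
# get a smaller default CPU request (50m is an example value, tune to taste).
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: LimitRange
metadata:
  name: limits
  namespace: default
spec:
  limits:
  - defaultRequest:
      cpu: 50m
    type: Container
EOF

# Existing pods keep their old requests; delete them so their controllers
# recreate them with the new default.
kubectl delete pod <pod-name>
```

This is a cluster-side config change, so it only affects containers created after it is applied.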

Some reading material:

https://kubernetes.io/docs/tasks/configure-pod-container/assign-cpu-resource/#specify-a-cpu-request-and-a-cpu-limit

https://kubernetes.io/docs/tasks/administer-cluster/manage-resources/cpu-default-namespace/#create-a-limitrange-and-a-pod

https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#how-pods-with-resource-limits-are-run

https://cloud.google.com/blog/products/gcp/kubernetes-best-practices-resource-requests-and-limits

jonathancardoso answered Sep 20 '22 01:09


I had the same issue when attempting to deploy to the cluster. In my case, unneeded pods were being automatically created for test branches of my application. To diagnose the issue, I needed to run:

kubectl get po

kubectl describe po <pod-name> - for one of the existing pods, to check which node it's running on

kubectl get nodes

kubectl describe node <node-name> - to see the CPU usage for the node being used by the existing pod, as below:

Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource  Requests     Limits
  --------  --------     ------
  cpu       1010m (93%)  4 (210%)
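Those steps can be condensed into a quick check (pod and node names are placeholders; this assumes access to the cluster):

```shell
# Find which node a pod landed on, then check that node's CPU allocation.
kubectl get pod <pod-name> -o wide                         # NODE column shows the node
kubectl describe node <node-name> | grep -A 6 "Allocated resources"
```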

Then, the unneeded pods could be deleted using:

kubectl get deployments

kubectl delete deployment <deployment-name> - with the name of the deployment for the pods I needed to delete.

Once I deleted enough unused pods, I was able to deploy new ones.

Chris Halcrow answered Sep 23 '22 01:09