On my GCE Kubernetes cluster I can no longer create pods.
Warning FailedScheduling pod (www.caveconditions.com-f1be467e31c7b00bc983fbe5efdbb8eb-438ef) failed to fit in any node fit failure on node (gke-prod-cluster-default-pool-b39c7f0c-c0ug): Insufficient CPU
Looking at the allocated stats of that node
Non-terminated Pods: (8 in total)
  Namespace    Name                                                                CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ---------    ----                                                                ------------  ----------  ---------------  -------------
  default      dev.caveconditions.com-n80z8                                        100m (10%)    0 (0%)      0 (0%)           0 (0%)
  default      lamp-cnmrc                                                          100m (10%)    0 (0%)      0 (0%)           0 (0%)
  default      mongo-2-h59ly                                                       200m (20%)    0 (0%)      0 (0%)           0 (0%)
  default      www.caveconditions.com-tl7pa                                        100m (10%)    0 (0%)      0 (0%)           0 (0%)
  kube-system  fluentd-cloud-logging-gke-prod-cluster-default-pool-b39c7f0c-c0ug  100m (10%)    0 (0%)      200Mi (5%)       200Mi (5%)
  kube-system  kube-dns-v17-qp5la                                                  110m (11%)    110m (11%)  120Mi (3%)       220Mi (5%)
  kube-system  kube-proxy-gke-prod-cluster-default-pool-b39c7f0c-c0ug              100m (10%)    0 (0%)      0 (0%)           0 (0%)
  kube-system  kubernetes-dashboard-v1.1.0-orphh                                   100m (10%)    100m (10%)  50Mi (1%)        50Mi (1%)
Allocated resources:
  (Total limits may be over 100%, i.e., overcommitted. More info: http://releases.k8s.io/HEAD/docs/user-guide/compute-resources.md)
  CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ------------  ----------  ---------------  -------------
  910m (91%)    210m (21%)  370Mi (9%)       470Mi (12%)
Sure, I have 91% allocated and cannot fit another 10% into it. But is it not possible to overcommit resources?
The average CPU usage of the server is only about 10%.
It would be a shame if I could not use more resources.
If a container attempts to exceed the specified limit, the system will throttle the container.
Pod CPU use is the aggregate of the CPU use of all containers in a pod. Likewise, pod memory utilization refers to the total aggregate of memory used by all containers in a pod.
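In scheduling terms, only the request counts against the node's capacity; the limit is what a container may burst to before it is throttled, so limits are allowed to add up to more than the node has. A minimal sketch of a container whose limit is higher than its request (the pod name and image are placeholders, not taken from the question):

apiVersion: v1
kind: Pod
metadata:
  name: cpu-overcommit-demo      # hypothetical name, for illustration only
spec:
  containers:
  - name: app
    image: nginx                 # placeholder image
    resources:
      requests:
        cpu: 100m                # what the scheduler reserves on the node
        memory: 64Mi
      limits:
        cpu: 500m                # the container may burst up to this before being throttled
        memory: 128Mi

Several such pods fit on a 1-core node even though their limits sum to well over 1000m, because the scheduler only adds up the 100m requests.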
I recently had this same issue. After some research I found that GKE creates a default LimitRange whose default CPU request is 100m per container. This can be checked by running kubectl get limitrange -o=yaml, which will display something like this:
apiVersion: v1
items:
- apiVersion: v1
  kind: LimitRange
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"v1","kind":"LimitRange","metadata":{"annotations":{},"name":"limits","namespace":"default"},"spec":{"limits":[{"defaultRequest":{"cpu":"100m"},"type":"Container"}]}}
    creationTimestamp: 2017-11-16T12:15:40Z
    name: limits
    namespace: default
    resourceVersion: "18741722"
    selfLink: /api/v1/namespaces/default/limitranges/limits
    uid: dcb25a24-cac7-11e7-a3d5-42010a8001b6
  spec:
    limits:
    - defaultRequest:
        cpu: 100m
      type: Container
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
This default request is applied to every container that does not set its own. So, for instance, on a 4-core node (4000m), and assuming each of your pods runs 2 containers, the defaults reserve 200m per pod, which allows only around ~20 pods before CPU requests are exhausted.
The "fix" here is to change the default LimitRange
setting your own limits, and then removing old pods so they are recreated with the updated values, or to directly set the pods limits when creating them.
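For the first option, one way is to replace the default LimitRange shown above with one that carries a smaller defaultRequest. This is only a sketch under the assumption that 50m is enough for your containers; pick a value that matches what they actually use:

apiVersion: v1
kind: LimitRange
metadata:
  name: limits
  namespace: default
spec:
  limits:
  - type: Container
    defaultRequest:
      cpu: 50m        # assumed value, not a recommendation

Note that the new default only affects containers created afterwards, which is why the old pods have to be deleted and recreated.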
Some reading material:
https://kubernetes.io/docs/tasks/configure-pod-container/assign-cpu-resource/#specify-a-cpu-request-and-a-cpu-limit
https://kubernetes.io/docs/tasks/administer-cluster/manage-resources/cpu-default-namespace/#create-a-limitrange-and-a-pod
https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#how-pods-with-resource-limits-are-run
https://cloud.google.com/blog/products/gcp/kubernetes-best-practices-resource-requests-and-limits
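For the second option (setting the values directly on your pods), a resources section on each container overrides the namespace default. A minimal sketch with a hypothetical Deployment name and placeholder image, assuming 50m is an appropriate request for the workload:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: www-caveconditions        # hypothetical name, for illustration only
spec:
  replicas: 1
  selector:
    matchLabels:
      app: www-caveconditions
  template:
    metadata:
      labels:
        app: www-caveconditions
    spec:
      containers:
      - name: web
        image: nginx              # placeholder image
        resources:
          requests:
            cpu: 50m              # explicit request; the LimitRange default no longer applies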
I had the same issue when attempting to deploy to the cluster. In my case, there were unneeded pods being automatically created for test branches of my application. To diagnose the issue, I needed to do:
kubectl get po
kubectl describe po - for one of the existing pods, to check which node it is running on
kubectl get nodes
kubectl describe node - to see the CPU usage of the node the existing pod is running on, as below:
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource  Requests     Limits
  --------  --------     ------
  cpu       1010m (93%)  4 (210%)
Then, the unneeded pods could be deleted using:
kubectl get deployments
kubectl delete deployment .... - the name of the deployment for the pod I needed to delete
Once I deleted enough unused pods, I was able to deploy new ones.