On my GCE Kubernetes cluster I can no longer create pods.
Warning FailedScheduling pod (www.caveconditions.com-f1be467e31c7b00bc983fbe5efdbb8eb-438ef) failed to fit in any node fit failure on node (gke-prod-cluster-default-pool-b39c7f0c-c0ug): Insufficient CPU
Looking at the allocated stats of that node
Non-terminated Pods: (8 in total)
  Namespace    Name                                                                CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ---------    ----                                                                ------------  ----------  ---------------  -------------
  default      dev.caveconditions.com-n80z8                                        100m (10%)    0 (0%)      0 (0%)           0 (0%)
  default      lamp-cnmrc                                                          100m (10%)    0 (0%)      0 (0%)           0 (0%)
  default      mongo-2-h59ly                                                       200m (20%)    0 (0%)      0 (0%)           0 (0%)
  default      www.caveconditions.com-tl7pa                                        100m (10%)    0 (0%)      0 (0%)           0 (0%)
  kube-system  fluentd-cloud-logging-gke-prod-cluster-default-pool-b39c7f0c-c0ug  100m (10%)    0 (0%)      200Mi (5%)       200Mi (5%)
  kube-system  kube-dns-v17-qp5la                                                  110m (11%)    110m (11%)  120Mi (3%)       220Mi (5%)
  kube-system  kube-proxy-gke-prod-cluster-default-pool-b39c7f0c-c0ug              100m (10%)    0 (0%)      0 (0%)           0 (0%)
  kube-system  kubernetes-dashboard-v1.1.0-orphh                                   100m (10%)    100m (10%)  50Mi (1%)        50Mi (1%)
Allocated resources:
  (Total limits may be over 100%, i.e., overcommitted. More info: http://releases.k8s.io/HEAD/docs/user-guide/compute-resources.md)
  CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ------------  ----------  ---------------  -------------
  910m (91%)    210m (21%)  370Mi (9%)       470Mi (12%)
Sure, I have 91% allocated and cannot fit another 10% into it. But is it not possible to overcommit resources?
The average CPU usage of the server is only about 10%.
It would be a shame if I could not use more resources.
If a container attempts to exceed the specified limit, the system will throttle the container.
Pod CPU use is the aggregate of the CPU use of all containers in a pod. Likewise, pod memory utilization refers to the total aggregate of memory used by all containers in a pod.
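In scheduling terms, only the request counts against the node's capacity; the limit is what a container may burst to before it is throttled, so limits are allowed to add up to more than the node has. A minimal sketch of a container whose limit is higher than its request (the pod name and image are placeholders, not taken from the question):

apiVersion: v1
kind: Pod
metadata:
  name: cpu-overcommit-demo      # hypothetical name, for illustration only
spec:
  containers:
  - name: app
    image: nginx                 # placeholder image
    resources:
      requests:
        cpu: 100m                # what the scheduler reserves on the node
        memory: 64Mi
      limits:
        cpu: 500m                # the container may burst up to this before being throttled
        memory: 128Mi

Several such pods fit on a 1-core node even though their limits sum to well over 1000m, because the scheduler only adds up the 100m requests.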
I recently had this same issue. After some research I found that GKE creates a default LimitRange whose default CPU request is 100m per container. This can be checked by running kubectl get limitrange -o=yaml, which will display something like this:
apiVersion: v1
items:
- apiVersion: v1
  kind: LimitRange
  metadata:
    annotations:
      kubectl.kubernetes.io/last-applied-configuration: |
        {"apiVersion":"v1","kind":"LimitRange","metadata":{"annotations":{},"name":"limits","namespace":"default"},"spec":{"limits":[{"defaultRequest":{"cpu":"100m"},"type":"Container"}]}}
    creationTimestamp: 2017-11-16T12:15:40Z
    name: limits
    namespace: default
    resourceVersion: "18741722"
    selfLink: /api/v1/namespaces/default/limitranges/limits
    uid: dcb25a24-cac7-11e7-a3d5-42010a8001b6
  spec:
    limits:
    - defaultRequest:
        cpu: 100m
      type: Container
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""
This default request is applied to every container that does not set its own. So, for instance, on a 4-core node (4000m), and assuming each of your pods runs 2 containers, the defaults reserve 200m per pod, which allows only around ~20 pods before CPU requests are exhausted.
The "fix" here is to change the default LimitRange
setting your own limits, and then removing old pods so they are recreated with the updated values, or to directly set the pods limits when creating them.
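For the first option, one way is to replace the default LimitRange shown above with one that carries a smaller defaultRequest. This is only a sketch under the assumption that 50m is enough for your containers; pick a value that matches what they actually use:

apiVersion: v1
kind: LimitRange
metadata:
  name: limits
  namespace: default
spec:
  limits:
  - type: Container
    defaultRequest:
      cpu: 50m        # assumed value, not a recommendation

Note that the new default only affects containers created afterwards, which is why the old pods have to be deleted and recreated.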
Some reading material:
https://kubernetes.io/docs/tasks/configure-pod-container/assign-cpu-resource/#specify-a-cpu-request-and-a-cpu-limit
https://kubernetes.io/docs/tasks/administer-cluster/manage-resources/cpu-default-namespace/#create-a-limitrange-and-a-pod
https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/#how-pods-with-resource-limits-are-run
https://cloud.google.com/blog/products/gcp/kubernetes-best-practices-resource-requests-and-limits
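For the second option (setting the values directly on your pods), a resources section on each container overrides the namespace default. A minimal sketch with a hypothetical Deployment name and placeholder image, assuming 50m is an appropriate request for the workload:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: www-caveconditions        # hypothetical name, for illustration only
spec:
  replicas: 1
  selector:
    matchLabels:
      app: www-caveconditions
  template:
    metadata:
      labels:
        app: www-caveconditions
    spec:
      containers:
      - name: web
        image: nginx              # placeholder image
        resources:
          requests:
            cpu: 50m              # explicit request; the LimitRange default no longer applies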
I had the same issue when attempting to deploy to the cluster. In my case, there were unneeded pods being automatically created for test branches of my application. To diagnose the issue, I needed to do:
kubectl get po
kubectl describe po - for one of the existing pods, to check which node it is running on
kubectl get nodes
kubectl describe node - to see the CPU usage of the node the existing pod is running on, as below:
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource  Requests     Limits
  --------  --------     ------
  cpu       1010m (93%)  4 (210%)
Then, the unneeded pods could be deleted using:
kubectl get deployments
kubectl delete deployment .... - the name of the deployment for the pod I needed to delete
Once I deleted enough unused pods, I was able to deploy new ones.