
What causes pods to be slow in kubernetes?

Tags:

kubernetes

Certain pods on my cluster are extremely slow in almost all aspects. Startup time, network, i/o.

I have minimized the application code in these containers and it seems to have no effect, these are basically minimal containers running a simple webapi with a health check endpoint.

I'm wondering if someone can help me figure out what's wrong, or how to debug this.

When I say slow in all aspects, I mean a couple of things:

  1. Very slow startup. I actually have to change my readiness probe initial delay to nearly 5 minutes (I've sketched the probe config after this list).

  2. Inside the container, running any command is slow. An apt-get update takes nearly 5 minutes, even if the container has been running for hours.

  3. Any connections to an RDS database will time out for at least the first 10 minutes the pod is running. After that it's hit or miss: sometimes normal speed, sometimes we'll start getting connection timeouts again (mainly if the pod hasn't been used/requested for a while).
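
For context, the readiness probe looks roughly like this (the /health path and port are placeholders for my actual health check endpoint; the only unusual part is the huge initial delay):

    # Readiness probe as currently configured (values approximate,
    # path/port are placeholders for the real health check endpoint).
    readinessProbe:
      httpGet:
        path: /health
        port: 8080
      initialDelaySeconds: 300   # has to be ~5 minutes or the pod never becomes ready
      periodSeconds: 10
      failureThreshold: 3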

On nearly identical pods with the same base image, the container will start in less than a couple of seconds, and running an apt-get update takes maybe 3 seconds. I cannot for the life of me see what is different between the pods that causes some to be 'good pods' and others to be 'bad pods'.

Running any of these images locally, they start in no time (less than a second or so).

My Environment

Cluster (AWS)

  • 1 c4.large master
  • 3 c4.xlarge nodes
  • ~10-20 pods per node
  • provisioned with kops using 'standard' settings (I haven't done anything tricky)

Things I've checked/tried

  • too many pods

    My first thought was that maybe I'm running too many pods. I've launched brand new nodes for this (c4.xlarge) and had this pod be the only pod running in the cluster; the issue is still seen.

  • node resources

    Checking every node-level metric I could, nothing looks out of the ordinary (I also tried on several brand new, fairly high-powered nodes).

  • Deployment/Pod Metrics

    I'm happy to show whatever metric anyone can think of here; nothing looks out of the norm. I have Prometheus running and have looked into every metric I could think to check. I can't see a difference between a 'good' running pod and a 'bad' one.

  • cluster itself

    I actually have 2 clusters, both provisioned with kops, and this is seen on both (though not always with the same applications, which is odd).

Any help here is appreciated

Asked Dec 12 '17 by Kyle Gobel


People also ask

Why is Kubernetes so slow?

Kubernetes uses the Linux kernel's CFS Bandwidth Control, which allots CPU time in microseconds to pre-defined groups. This can lead to throttling issues: A node can look slow, even when there is not a lot else happening on that CPU.

What causes CPU throttling in Kubernetes?

CPU throttling occurs when you configure a CPU limit on a container, which can inadvertently slow your application's response time. Even if you have more than enough resources on your underlying node, your container workload will still be throttled if its limit was not configured properly.
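
As a rough illustration (the container name and image below are made up, not taken from the question): the kubelet translates a CPU limit into a CFS quota, so the container gets at most that share of CPU time per 100 ms scheduling period and sits throttled for the rest of the period once the quota is used up, even on an otherwise idle node.

    # Hypothetical container spec: a 500m CPU limit becomes a CFS quota of
    # about 50ms of CPU time per 100ms period (cpu.cfs_quota_us=50000,
    # cpu.cfs_period_us=100000). Once that is spent, the container is
    # throttled until the next period, regardless of how idle the node is.
    containers:
      - name: webapi            # placeholder name
        image: example/webapi   # placeholder image
        resources:
          limits:
            cpu: 500m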


1 Answer

This is likely happening either because the configured Resource Limits are too constrained, or because the lack of Resource Requests is allowing pods to be scheduled onto nodes that do not have the capacity needed to run their workloads.

You can resolve this by defining proper resource requests and limits for each of your applications deployed to Kubernetes. In a nutshell, you can control requests and limits for shares of CPU time, bytes of memory, and Linux hugepages.
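
A minimal sketch of what that looks like in a container spec (the name, image, and values below are illustrative, not tuned for the asker's workload):

    # Illustrative resource requests/limits for a container in a Deployment.
    # Requests are what the scheduler uses to place the pod on a node with
    # enough free capacity; limits are the hard ceiling enforced at runtime.
    containers:
      - name: webapi                # placeholder name
        image: example/webapi:1.0   # placeholder image
        resources:
          requests:
            cpu: 250m       # scheduler reserves a quarter of a core
            memory: 256Mi
          limits:
            cpu: "1"        # CPU usage above this is throttled
            memory: 512Mi   # exceeding this gets the container OOM-killed

Without requests, the scheduler treats the pod as costing essentially nothing and can pack it onto a node that is already saturated, which would match the seemingly random split between 'good pods' and 'bad pods' described in the question.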

Answered Oct 24 '22 by TJ Zimmerman