Multiple pods of a 600-pod deployment are stuck in ContainerCreating
after a rolling update, with the message:
Failed create pod sandbox: rpc error: code = Unknown desc = NetworkPlugin cni failed to set up pod network: add cmd: failed to assign an IP address to container
What I have tried: checking the CNI metrics, which report:
maxIPAddresses, value: 759.000000
ipamdActionInProgress, value: 1.000000
addReqCount, value: 16093.000000
awsAPILatency, value: 564.000000
delReqCount, value: 32337.000000
eniMaxAvailable, value: 69.000000
assignIPAddresses, value: 558.000000
totalIPAddresses, value: 682.000000
eniAllocated, value: 69.000000
Does the CNI metrics output suggest there's an issue? It seems like there are enough IPs.
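For what it's worth, my own arithmetic on the numbers above (assuming these are aggregated aws-vpc-cni ipamd metrics): 682 totalIPAddresses - 558 assignIPAddresses = 124 IPs still unassigned, and 558 is well under the 759 maxIPAddresses.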
What else can I try to debug?
To resolve it, double-check the pod specification and ensure that the repository and image are specified correctly. If that still doesn't work, there may be a network issue preventing access to the container registry. The output of kubectl describe pod also shows the hostname of the Kubernetes node the pod was scheduled to.
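For example (the pod name below is a placeholder), the Node: field near the top of the output names the node:

    kubectl describe pod <pod-name>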
This is the case on Amazon Elastic Kubernetes Service (EKS), where the maximum number of pods per node depends on the instance type. For example, for a t2.medium instance the maximum number of pods is 17, for t2.small it's 11, and for t2.micro it's 4.
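Those per-type limits follow from how many ENIs, and how many IPv4 addresses per ENI, each instance type supports. A quick sketch of the formula behind the AWS VPC CNI's eni-max-pods list (the ENI and IP counts below come from the EC2 instance-type documentation):

    # max pods = ENIs x (IPv4 addresses per ENI - 1) + 2
    echo $(( 3 * (6 - 1) + 2 ))   # t2.medium: 3 ENIs, 6 IPs each -> 17
    echo $(( 3 * (4 - 1) + 2 ))   # t2.small:  3 ENIs, 4 IPs each -> 11
    echo $(( 2 * (2 - 1) + 2 ))   # t2.micro:  2 ENIs, 2 IPs each -> 4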
Deleting a pod is simple. To delete the pod you have created, just run kubectl delete pod nginx. Be sure to confirm the name of the pod you want to delete before pressing Enter. If the deletion succeeds, pod "nginx" deleted appears in the terminal, as in the session below.
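A minimal session, assuming a pod literally named nginx exists:

    $ kubectl delete pod nginx
    pod "nginx" deleted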
The pod sandbox is the abstraction that replaces the "pause" container that is used to keep namespaces open in every Kubernetes pod today.
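If you want to inspect sandboxes directly, and assuming crictl is installed and configured against the node's CRI runtime, you can list them on the node itself:

    crictl pods

Each entry is one pod's sandbox; the Failed create pod sandbox error above means that object could not be created because the CNI setup call failed.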
It seems that you have reached the maximum number of IP addresses in your subnet, which is what this entry in the documentation suggests:
maxIPAddresses: the maximum number of IP addresses that can be used for Pods in the cluster (this assumes there are enough IPs in the subnet).
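One way to verify this: a sketch using the AWS CLI, assuming you know which subnet(s) your worker nodes use (the subnet ID is a placeholder):

    aws ec2 describe-subnets --subnet-ids <subnet-id> \
        --query 'Subnets[].AvailableIpAddressCount'

If the count is at or near zero, the CNI cannot assign new IPs regardless of what the node-level metrics report.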
Please also take a look at the maxUnavailable and maxSurge parameters, which control how many Pods exist during a rolling update. Maybe your configuration allows more than 600 Pods during the update (for example, 130%), and that hits the limits of your AWS network.
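For example, a strategy like the following (the values are illustrative, not a recommendation) caps how many extra Pods, and therefore extra IPs, a rolling update can demand at once:

    spec:
      replicas: 600
      strategy:
        type: RollingUpdate
        rollingUpdate:
          maxSurge: 10%        # at most 60 Pods above the replica count during the update
          maxUnavailable: 0    # never drop below 600 available Pods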