 

Kubernetes has a ton of pods in error state that can't seem to be cleared

I was originally trying to run a Job that seemed to get stuck in a CrashLoopBackOff. Here is the Job manifest:

apiVersion: batch/v1
kind: Job
metadata:
  name: es-setup-indexes
  namespace: elk-test
spec:
  template:
    metadata:
      name: es-setup-indexes
    spec:
      containers:
      - name: es-setup-indexes
        image: appropriate/curl
        command: ['curl -H  "Content-Type: application/json" -XPUT http://elasticsearch.elk-test.svc.cluster.local:9200/_template/filebeat -d@/etc/filebeat/filebeat.template.json']
        volumeMounts:
        - name: configmap-volume
          mountPath: /etc/filebeat/filebeat.template.json
          subPath: filebeat.template.json
      restartPolicy: Never
      volumes:
      - name: configmap-volume
        configMap:
          name: elasticsearch-configmap-indexes
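(A likely culprit for the crash loop, for what it's worth: command takes an exec-style argv list, so passing the whole curl invocation as a single string makes the container look for one executable literally named that string, and it fails immediately. A sketch of the split-out form, using the same image and URL as above:

command: ["curl", "-H", "Content-Type: application/json", "-XPUT",
          "http://elasticsearch.elk-test.svc.cluster.local:9200/_template/filebeat",
          "-d@/etc/filebeat/filebeat.template.json"]

And because restartPolicy is Never, each failed attempt makes the Job controller create a replacement pod rather than restart the old one, which is plausibly how the Error pods below piled up.)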

I tried deleting the job but it would only work if I ran the following command:

kubectl delete job es-setup-indexes --cascade=false 
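Note that --cascade=false deletes only the Job object itself and orphans the pods it created, which is consistent with the Error pods lingering below. In newer kubectl versions the same behavior is spelled --cascade=orphan; a plain delete would in principle also remove the Job's pods (namespace taken from the manifest above):

kubectl delete job es-setup-indexes -n elk-test   # default cascade also deletes the Job's pods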

After that I noticed when running:

kubectl get pods -w 

I would get a TON of pods in an Error state and I see no way to clean them up. Here is just a small sample of the output when I run get pods:

es-setup-indexes-zvx9c   0/1       Error     0         20h
es-setup-indexes-zw23w   0/1       Error     0         15h
es-setup-indexes-zw57h   0/1       Error     0         21h
es-setup-indexes-zw6l9   0/1       Error     0         16h
es-setup-indexes-zw7fc   0/1       Error     0         22h
es-setup-indexes-zw9bw   0/1       Error     0         12h
es-setup-indexes-zw9ck   0/1       Error     0         1d
es-setup-indexes-zwf54   0/1       Error     0         18h
es-setup-indexes-zwlmg   0/1       Error     0         16h
es-setup-indexes-zwmsm   0/1       Error     0         21h
es-setup-indexes-zwp37   0/1       Error     0         22h
es-setup-indexes-zwzln   0/1       Error     0         22h
es-setup-indexes-zx4g3   0/1       Error     0         11h
es-setup-indexes-zx4hd   0/1       Error     0         21h
es-setup-indexes-zx512   0/1       Error     0         1d
es-setup-indexes-zx638   0/1       Error     0         17h
es-setup-indexes-zx64c   0/1       Error     0         21h
es-setup-indexes-zxczt   0/1       Error     0         15h
es-setup-indexes-zxdzf   0/1       Error     0         14h
es-setup-indexes-zxf56   0/1       Error     0         1d
es-setup-indexes-zxf9r   0/1       Error     0         16h
es-setup-indexes-zxg0m   0/1       Error     0         14h
es-setup-indexes-zxg71   0/1       Error     0         1d
es-setup-indexes-zxgwz   0/1       Error     0         19h
es-setup-indexes-zxkpm   0/1       Error     0         23h
es-setup-indexes-zxkvb   0/1       Error     0         15h
es-setup-indexes-zxpgg   0/1       Error     0         20h
es-setup-indexes-zxqh3   0/1       Error     0         1d
es-setup-indexes-zxr7f   0/1       Error     0         22h
es-setup-indexes-zxxbs   0/1       Error     0         13h
es-setup-indexes-zz7xr   0/1       Error     0         12h
es-setup-indexes-zzbjq   0/1       Error     0         13h
es-setup-indexes-zzc0z   0/1       Error     0         16h
es-setup-indexes-zzdb6   0/1       Error     0         1d
es-setup-indexes-zzjh2   0/1       Error     0         21h
es-setup-indexes-zzm77   0/1       Error     0         1d
es-setup-indexes-zzqt5   0/1       Error     0         12h
es-setup-indexes-zzr79   0/1       Error     0         16h
es-setup-indexes-zzsfx   0/1       Error     0         1d
es-setup-indexes-zzx1r   0/1       Error     0         21h
es-setup-indexes-zzx6j   0/1       Error     0         1d
kibana-kq51v             1/1       Running   0         10h

But if I look at the jobs I get nothing related to that anymore:

$ kubectl get jobs --all-namespaces
NAMESPACE     NAME               DESIRED   SUCCESSFUL   AGE
kube-system   configure-calico   1         1            46d

I've also noticed that kubectl seems much slower to respond. I don't know if the pods are continuously being restarted or are stuck in some broken state, but it would be great if someone could tell me how to troubleshoot this, as I haven't come across an issue like this in Kubernetes before.
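(For anyone triaging a similar state, the usual first steps are to inspect one of the failed pods directly; the pod name below is taken from the sample output above:

kubectl describe pod es-setup-indexes-zvx9c -n elk-test   # events and container exit codes
kubectl logs es-setup-indexes-zvx9c -n elk-test           # the container's own output

The slowness is plausibly the API server straining under thousands of leftover pod objects.)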

Kube info:

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.1", GitCommit:"b0b7a323cc5a4a2019b2e9520c21c7830b7f708e", GitTreeState:"clean", BuildDate:"2017-04-03T20:44:38Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"6", GitVersion:"v1.6.1", GitCommit:"b0b7a323cc5a4a2019b2e9520c21c7830b7f708e", GitTreeState:"clean", BuildDate:"2017-04-03T20:33:27Z", GoVersion:"go1.7.5", Compiler:"gc", Platform:"linux/amd64"}
asked Jun 06 '17 by xamox


People also ask

How do I delete all error pods in Kubernetes?

To delete all failed pods in all namespaces, run: kubectl delete pods --field-selector status.phase=Failed -A. Note that this only matches pods that actually reach the Failed phase; with a restartPolicy of Always the kubelet keeps restarting the container in place, so such pods never show up as Failed.

How do you clean up Kubernetes pods?

If you create pods directly (not via a deployment), you can delete them directly and they will stay deleted. Pods created directly, deployments, and services can all be deleted independently of one another; the order doesn't matter. If you want to delete them but keep the namespace, delete them in any order.
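A quick illustration of the difference (resource names here are hypothetical):

kubectl delete pod my-standalone-pod         # created directly: stays deleted
kubectl delete pod my-deploy-7d4f9-abc12     # owned by a ReplicaSet: gets recreated
kubectl delete deployment my-deploy          # removes the Deployment and its pods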


1 Answer

kubectl delete pods --field-selector status.phase=Failed -n <your-namespace>

This cleans up any failed pods in <your-namespace>.
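To preview what will be removed first, or to sweep every namespace at once, the same field selector works with get and with -A (shorthand for --all-namespaces on reasonably recent kubectl; the elk-test namespace is taken from the question):

kubectl get pods --field-selector status.phase=Failed -n elk-test    # dry look before deleting
kubectl delete pods --field-selector status.phase=Failed -A          # sweep all namespaces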

answered Oct 10 '22 by Kevin Pedersen