I have executed the samples from the book "Kubernetes Up and Running" where a pod with a work queue is run, then a k8s job is created 5 pods to consume all the work on the queue. I have reproduced the yaml api objects below.
My Expectation is that once a k8s job completes then it's pods would be deleted but kubectl get pods -o wide
shows the pods are still around even though it reports 0/1 containers ready and they still seem to have ip addresses assigned see output below.
kubectl get pods
why is that not right after all the containers in the pod finish?Output from kubectl after all the pods have consumed all the messages.
kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE
consumers-bws9f 0/1 Completed 0 6m 10.32.0.35 gke-cluster1-default-pool-3796b2ee-rtcr
consumers-d25cs 0/1 Completed 0 6m 10.32.0.33 gke-cluster1-default-pool-3796b2ee-rtcr
consumers-jcwr8 0/1 Completed 0 6m 10.32.2.26 gke-cluster1-default-pool-3796b2ee-tpml
consumers-l9rkf 0/1 Completed 0 6m 10.32.0.34 gke-cluster1-default-pool-3796b2ee-rtcr
consumers-mbd5c 0/1 Completed 0 6m 10.32.2.27 gke-cluster1-default-pool-3796b2ee-tpml
queue-wlf8v 1/1 Running 0 22m 10.32.0.32 gke-cluster1-default-pool-3796b2ee-rtcr
The follow three k8s api calls were executed these are cut and pasted from the book samples.
Run a pod with a work queue
apiVersion: extensions/v1beta1
kind: ReplicaSet
metadata:
labels:
app: work-queue
component: queue
chapter: jobs
name: queue
spec:
replicas: 1
template:
metadata:
labels:
app: work-queue
component: queue
chapter: jobs
spec:
containers:
- name: queue
image: "gcr.io/kuar-demo/kuard-amd64:1"
imagePullPolicy: Always
Expose the pod as a service so that the worker pods can get to it.
apiVersion: v1
kind: Service
metadata:
labels:
app: work-queue
component: queue
chapter: jobs
name: queue
spec:
ports:
- port: 8080
protocol: TCP
targetPort: 8080
selector:
app: work-queue
component: queue
Post 100 items to the queue then run a job with 5 pods executing in parallel until the queue is empty.
apiVersion: batch/v1
kind: Job
metadata:
labels:
app: message-queue
component: consumer
chapter: jobs
name: consumers
spec:
parallelism: 5
template:
metadata:
labels:
app: message-queue
component: consumer
chapter: jobs
spec:
containers:
- name: worker
image: "gcr.io/kuar-demo/kuard-amd64:1"
imagePullPolicy: Always
args:
- "--keygen-enable"
- "--keygen-exit-on-complete"
- "--keygen-memq-server=http://queue:8080/memq/server"
- "--keygen-memq-queue=keygen"
restartPolicy: OnFailure
First, confirm the name of the node you want to remove using kubectl get nodes , and make sure that all of the pods on the node can be safely terminated without any special procedures. Next, use the kubectl drain command to evict all user pods from the node.
what is the pod status completed mean ? That means inside pod's container process has been successfully completed.
How do you configure a Kubernetes Job so that Pods are retained after completion? - Configure the backofflimit parameter with a non-zero value. - Set a startingDeadlineSeconds value high enough to allow you to access the logs. - Set an activeDeadlineSeconds value high enough to allow you to access the logs.
The docs say it pretty well:
When a Job completes, no more Pods are created, but the Pods are not deleted either. Keeping them around allows you to still view the logs of completed pods to check for errors, warnings, or other diagnostic output. The job object also remains after it is completed so that you can view its status. It is up to the user to delete old jobs after noting their status. Delete the job with kubectl (e.g. kubectl delete jobs/pi or kubectl delete -f ./job.yaml). When you delete the job using kubectl, all the pods it created are deleted too.
It shows completed status when it actually terminated. If you set restartPloicy:Never( when you don't want to run more then once) then it goes to this state.
Terminated: Indicates that the container completed its execution and has stopped running. A container enters into this when it has successfully completed execution or when it has failed for some reason. Regardless, a reason and exit code is displayed, as well as the container’s start and finish time. Before a container enters into Terminated, preStop hook (if any) is executed.
... State: Terminated Reason: Completed Exit Code: 0 Started: Wed, 30 Jan 2019 11:45:26 +0530 Finished: Wed, 30 Jan 2019 11:45:26 +0530 ...
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With