Assume a Deployment, its ReplicaSet, and a Pod map 1:1:1.
deployment ==> replicaSet ==> Pod
When we create a Deployment, the Deployment controller adds a pod-template-hash label to the ReplicaSet and its pods. That label alone looks like enough for a ReplicaSet to check whether enough pods are running. So what is the significance of the ReplicaSet's matchLabels selector? Why is it mandatory?
To illustrate: I deploy an app with these labels, and 2 pods are running.
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx-app
Now change the value of the pod-template-hash label to something else on one of the pods (I changed it to testing here). Another pod starts immediately. So the ReplicaSet does not seem to care about selector.matchLabels.
NAME                            READY   STATUS    RESTARTS   AGE   LABELS
pod/nginx-app-b8b875889-cpnnr   1/1     Running   0          53s   app=nginx-app,pod-template-hash=testing
pod/nginx-app-b8b875889-jlk6m   1/1     Running   0          53s   app=nginx-app,pod-template-hash=b8b875889
pod/nginx-app-b8b875889-xblqr   1/1     Running   0          11s   app=nginx-app,pod-template-hash=b8b875889

NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE    LABELS
service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   151d   component=apiserver,provider=kubernetes

NAME                        READY   UP-TO-DATE   AVAILABLE   AGE   LABELS
deployment.apps/nginx-app   2/2     2            2           53s   app=nginx-app

NAME                                  DESIRED   CURRENT   READY   AGE   LABELS
replicaset.apps/nginx-app-b8b875889   2         2         2       53s   app=nginx-app,pod-template-hash=b8b875889
A ReplicaSet's purpose is to maintain a stable set of replica Pods running at any given time. As such, it is often used to guarantee the availability of a specified number of identical Pods.
A ReplicaSet is a Kubernetes object that runs multiple instances of a Pod and keeps the specified number of Pods constant. It keeps that number of Pod instances running in the cluster at any given time, so users do not lose access to the application when a Pod fails or becomes inaccessible.
A Kubernetes selector lets us select resources, such as a group of pods or nodes, based on the values of their labels or resource fields.
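To make the label matching concrete, here is a minimal Python sketch of how a matchLabels selector behaves: every key/value pair in the selector must be present on the object, and extra labels on the object are ignored. The function name `matches` and the `redis` label value are made up for illustration; this models the semantics only and is not Kubernetes code.

```python
def matches(match_labels: dict, labels: dict) -> bool:
    """Return True when every selector pair is present in the object's labels.

    Simplified model of matchLabels (equality-based matching only).
    """
    return all(labels.get(key) == value for key, value in match_labels.items())


selector = {"app": "nginx-app"}

# Labels taken from the pod listing in the question.
nginx_pod = {"app": "nginx-app", "pod-template-hash": "b8b875889"}
# Hypothetical unrelated pod.
other_pod = {"app": "redis"}

print(matches(selector, nginx_pod))  # True: extra labels do not matter
print(matches(selector, other_pod))  # False: the value of "app" differs
```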
In this way, a ReplicaSet ensures that an application runs the number of pods specified in its manifest. A DaemonSet, by contrast, ensures that one copy of the pod defined in its configuration runs on every eligible node.
Let me summarize. The whole discussion is about this: why does a Deployment force me to set a matchLabels selector even though it could easily live without one, since it adds pod-template-hash anyway and would be totally fine using only that?
After reading all the comments and all the discussion, I decided to look at the Kubernetes documentation.
I will allow myself to quote the k8s documentation about ReplicaSets:
How a ReplicaSet works
[...]
A ReplicaSet is linked to its Pods via the Pods' metadata.ownerReferences field, which specifies what resource the current object is owned by. All Pods acquired by a ReplicaSet have their owning ReplicaSet's identifying information within their ownerReferences field. It's through this link that the ReplicaSet knows of the state of the Pods it is maintaining and plans accordingly.
So does it mean that it's not using labels at all? Well, not exactly. Let's keep reading:
A ReplicaSet identifies new Pods to acquire by using its selector. If there is a Pod that has no OwnerReference or the OwnerReference is not a Controller and it matches a ReplicaSet's selector, it will be immediately acquired by said ReplicaSet
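The acquisition rule quoted above can be sketched in a few lines of Python. This is a simplified model under my own assumptions: the class and function names (`OwnerRef`, `Pod`, `can_acquire`) and the `uid` values are hypothetical, and the real controller does more than this.

```python
from dataclasses import dataclass, field


@dataclass
class OwnerRef:
    uid: str
    controller: bool = False  # True when this owner is the managing controller


@dataclass
class Pod:
    name: str
    labels: dict
    owner_refs: list = field(default_factory=list)


def can_acquire(selector: dict, pod: Pod) -> bool:
    """A pod is a candidate when no owner reference is a controller
    and its labels satisfy the selector."""
    has_controller = any(ref.controller for ref in pod.owner_refs)
    label_match = all(pod.labels.get(k) == v for k, v in selector.items())
    return not has_controller and label_match


selector = {"app": "nginx-app"}

bare = Pod("bare", {"app": "nginx-app"})
owned = Pod("owned", {"app": "nginx-app"}, [OwnerRef("rs-other", controller=True)])
loosely_owned = Pod("loose", {"app": "nginx-app"}, [OwnerRef("something", controller=False)])
unrelated = Pod("unrelated", {"app": "redis"})

print(can_acquire(selector, bare))           # True: no owner, labels match
print(can_acquire(selector, owned))          # False: already has a controller
print(can_acquire(selector, loosely_owned))  # True: owner is not a controller
print(can_acquire(selector, unrelated))      # False: labels do not match
```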
Oh, so it looks like it uses the selector only as an alternative to the first method.
Let's keep reading. Here is a quote from the Pod Selector section:
Pod Selector
The .spec.selector field is a label selector. As discussed earlier these are the labels used to identify potential Pods to acquire
It looks like these labels are not the primary method of keeping track of the pods owned by the ReplicaSet; they are used to "identify potential Pods to acquire". But what does that mean?
Why would a ReplicaSet acquire pods it does not own? There is a section in the documentation that tries to answer this very question:
Non-Template Pod acquisitions
While you can create bare Pods with no problems, it is strongly recommended to make sure that the bare Pods do not have labels which match the selector of one of your ReplicaSets. The reason for this is because a ReplicaSet is not limited to owning Pods specified by its template-- it can acquire other Pods in the manner specified in the previous sections.
[...]
As those Pods do not have a Controller (or any object) as their owner reference and match the selector of the [...] ReplicaSet, they will immediately be acquired by it.
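A rough way to see why matching bare Pods matter: once acquired, they count toward the desired replica count. The sketch below is a toy model, not the controller's code; the `tier=frontend` label, the replica count of 3, and the function name `pods_to_create` are assumptions chosen for illustration, and every pod in the list is assumed to have no controller of its own.

```python
def pods_to_create(desired: int, selector: dict, pod_labels: list) -> int:
    """Return how many pods a ReplicaSet would add (positive) or
    remove (negative) after acquiring every matching, controller-less pod."""
    current = sum(
        1
        for labels in pod_labels
        if all(labels.get(k) == v for k, v in selector.items())
    )
    return desired - current


selector = {"tier": "frontend"}  # hypothetical selector

# Two bare pods that happen to carry a matching label exist first.
bare_pods = [{"tier": "frontend"}, {"tier": "frontend"}]
print(pods_to_create(3, selector, bare_pods))  # 1: only one new pod is needed

# Three template pods already run, then the two bare pods show up.
running = [{"tier": "frontend"}] * 3
print(pods_to_create(3, selector, running + bare_pods))  # -2: two pods are surplus
```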
Great, but this still does not answer the question: Why do I need to provide the selector? Couldn't it just use that hash?
Back in the past there was a bug in k8s: https://github.com/kubernetes/kubernetes/issues/23170. Someone suggested that validation was needed: https://github.com/kubernetes/kubernetes/issues/23218. And so the validation appeared: https://github.com/kubernetes/kubernetes/pull/23530
And it stayed with us to this day, even if today we probably could live without it.
Still, I think it's better that it's there, because it minimizes the chance of overlapping labels in case of a pod-template-hash collision between different RSs.
One reason to use the pod label AND pod-template-hash together as the selector may be to handle ReplicaSets during updates, rollbacks, and so on.
For example:
In your scenario, the ReplicaSet currently uses the selector app=nginx-app,pod-template-hash=b8b875889. Suppose the Deployment is updated to a later version of the nginx image. As part of the upgrade, it creates a new ReplicaSet in the background that uses the same selector but with a new pod-template-hash, so the selector for the new ReplicaSet will be "app=nginx-app,pod-template-hash=XXXXXXXX". During the upgrade, the pods from the old ReplicaSet are terminated and new pods are created in the new ReplicaSet. Because the pod label (app=nginx-app) is common to both ReplicaSets, managing them effectively and independently requires an additional selector term that is unique to each ReplicaSet. This is achieved by using pod-template-hash along with the pod label as the selector.
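Here is a small Python sketch of that partitioning, under a few assumptions: the new hash `aaaaaaaaaa` and the pod names are invented, and the matching function only models equality-based matchLabels. It also shows what happened in the question's experiment: the pod relabeled to pod-template-hash=testing matches neither ReplicaSet selector, so the old ReplicaSet sees one pod fewer and starts a replacement.

```python
def matches(selector: dict, labels: dict) -> bool:
    """Equality-based matching: every selector pair must be on the pod."""
    return all(labels.get(k) == v for k, v in selector.items())


old_rs = {"app": "nginx-app", "pod-template-hash": "b8b875889"}
new_rs = {"app": "nginx-app", "pod-template-hash": "aaaaaaaaaa"}  # hypothetical hash

pods = {
    "old-1": {"app": "nginx-app", "pod-template-hash": "b8b875889"},
    "new-1": {"app": "nginx-app", "pod-template-hash": "aaaaaaaaaa"},
    "relabeled": {"app": "nginx-app", "pod-template-hash": "testing"},
}


def selected_by(selector: dict) -> list:
    """Names of the pods whose labels satisfy the selector, sorted."""
    return sorted(name for name, labels in pods.items() if matches(selector, labels))


print(selected_by(old_rs))                # ['old-1']
print(selected_by(new_rs))                # ['new-1']
print(selected_by({"app": "nginx-app"}))  # ['new-1', 'old-1', 'relabeled']
```

With only app=nginx-app as the selector, both ReplicaSets would claim the same pods, which is the overlap the hash term avoids.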