We have a multi-node setup of our product where we need to deploy multiple Elasticsearch pods. As all these are data nodes and have volume mounts for persistent storage, we don't want to bring two pods up on the same node. I'm trying to use the anti-affinity feature of Kubernetes, but to no avail.
The cluster deployment is done through Rancher. We have 5 nodes in the cluster, and three of them (let's say node-1, node-2 and node-3) have the label test.service.es-master: "true". So, when I deploy the Helm chart and scale it up to 3, Elasticsearch pods are up and running on all three of these nodes. But if I scale it to 4, the 4th data pod comes up on one of the above-mentioned nodes. Is that the correct behavior? My understanding was that imposing a strict anti-affinity should prevent pods from coming up on the same node. I've referred to multiple blogs and forums (e.g. this and this), and they suggest changes similar to mine. I'm attaching the relevant section of the Helm chart.
The requirement is that we need to bring up ES only on those nodes which are labelled with the specific key-value pair mentioned above, and each of those nodes should contain only one pod. Any feedback is appreciated.
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: null
  labels:
    test.service.es-master: "true"
  name: {{ .Values.service.name }}
  namespace: default
spec:
  clusterIP: None
  ports:
  ...
  selector:
    test.service.es-master: "true"
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    test.service.es-master: "true"
  name: {{ .Values.service.name }}
  namespace: default
spec:
  selector:
    matchLabels:
      test.service.es-master: "true"
  serviceName: {{ .Values.service.name }}
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: test.service.es-master
            operator: In
            values:
            - "true"
        topologyKey: kubernetes.io/hostname
  replicas: {{ .Values.replicaCount }}
  template:
    metadata:
      creationTimestamp: null
      labels:
        test.service.es-master: "true"
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: test.service.es-master
                operator: In
                values:
                - "true"
            topologyKey: kubernetes.io/hostname
      securityContext:
        ...
      volumes:
        ...
      ...
status: {}
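For reference, the label on the three target nodes was applied along these lines (node-1, node-2 and node-3 are placeholders for our actual node names):

kubectl label nodes node-1 node-2 node-3 test.service.es-master=true
kubectl get nodes -l test.service.es-master=true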
Update-1
As per the suggestions in the comments and answers, I've added the anti-affinity section in template.spec. But unfortunately the issue still remains. The updated yaml looks as follows:
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: null
  labels:
    test.service.es-master: "true"
  name: {{ .Values.service.name }}
  namespace: default
spec:
  clusterIP: None
  ports:
  - name: {{ .Values.service.httpport | quote }}
    port: {{ .Values.service.httpport }}
    targetPort: {{ .Values.service.httpport }}
  - name: {{ .Values.service.tcpport | quote }}
    port: {{ .Values.service.tcpport }}
    targetPort: {{ .Values.service.tcpport }}
  selector:
    test.service.es-master: "true"
---
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  creationTimestamp: null
  labels:
    test.service.es-master: "true"
  name: {{ .Values.service.name }}
  namespace: default
spec:
  selector:
    matchLabels:
      test.service.es-master: "true"
  serviceName: {{ .Values.service.name }}
  replicas: {{ .Values.replicaCount }}
  template:
    metadata:
      creationTimestamp: null
      labels:
        test.service.es-master: "true"
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: test.service.es-master
                operator: In
                values:
                - "true"
            topologyKey: kubernetes.io/hostname
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: test.service.es-master
                operator: In
                values:
                - "true"
            topologyKey: kubernetes.io/hostname
      securityContext:
        readOnlyRootFilesystem: false
      volumes:
      - name: elasticsearch-data-volume
        hostPath:
          path: /opt/ca/elasticsearch/data
      initContainers:
      - name: elasticsearch-data-volume
        image: busybox
        securityContext:
          privileged: true
        command: ["sh", "-c", "chown -R 1010:1010 /var/data/elasticsearch/nodes"]
        volumeMounts:
        - name: elasticsearch-data-volume
          mountPath: /var/data/elasticsearch/nodes
      containers:
      - env:
        {{- range $key, $val := .Values.data }}
        - name: {{ $key }}
          value: {{ $val | quote }}
        {{- end}}
        image: {{ .Values.image.registry }}/analytics/{{ .Values.image.repository }}:{{ .Values.image.tag }}
        name: {{ .Values.service.name }}
        ports:
        - containerPort: {{ .Values.service.httpport }}
        - containerPort: {{ .Values.service.tcpport }}
        volumeMounts:
        - name: elasticsearch-data-volume
          mountPath: /var/data/elasticsearch/nodes
        resources:
          limits:
            memory: {{ .Values.resources.limits.memory }}
          requests:
            memory: {{ .Values.resources.requests.memory }}
      restartPolicy: Always
status: {}
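Which node each replica lands on can be checked with kubectl; after scaling to 4, the wide output still shows two of the Elasticsearch pods sharing one of the labelled nodes:

kubectl get pods -l test.service.es-master=true -o wide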
As Egor suggested, you need podAntiAffinity:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis-cache
spec:
  selector:
    matchLabels:
      app: store
  replicas: 3
  template:
    metadata:
      labels:
        app: store
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: app
                operator: In
                values:
                - store
            topologyKey: "kubernetes.io/hostname"
Source: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#always-co-located-in-the-same-node
So, with your current label, it might look like this:
spec:
  affinity:
    nodeAffinity:
      # node affinity stuff here
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: "test.service.es-master"
            operator: In
            values:
            - "true"
        topologyKey: "kubernetes.io/hostname"
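Filling in the nodeAffinity placeholder with the same label from your chart, the complete affinity block under the pod template's spec could look roughly like this (note that topologyKey belongs only to the podAntiAffinity term, not to nodeAffinity):

spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: test.service.es-master
            operator: In
            values:
            - "true"
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchExpressions:
          - key: test.service.es-master
            operator: In
            values:
            - "true"
        topologyKey: "kubernetes.io/hostname"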
Ensure that you put this in the correct place in your yaml: it must go under the pod template's spec (spec.template.spec), not the Deployment's top-level spec, or else it won't work.
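Once it is in place, you can verify the behaviour: with a required (hard) anti-affinity rule and only three nodes carrying the label, scaling to 4 replicas should leave the fourth pod Pending with a FailedScheduling event, rather than co-scheduling it on an occupied node. For example (the pod name is a placeholder):

kubectl describe pod <pending-es-pod>   # check the Events section for FailedScheduling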