I deployed an Elasticsearch cluster with the official Helm chart (https://github.com/elastic/helm-charts/tree/master/elasticsearch).
There are 3 Helm releases:
The cluster was running fine. As a crash test, I removed the master release and re-created it.
After that, the master nodes are OK, but the data nodes complain:
Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: join validation on cluster state with a different cluster uuid xeQ6IVkDQ2es1CO2yZ_7rw than local cluster uuid 9P9ZGqSuQmy7iRDGcit5fg, rejecting
which is normal because master nodes are new.
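To see which UUID belongs to which side, the two values can be pulled out of the log message. This is only a rough sketch: the function name `cluster_uuids` is made up for illustration, and it assumes the log wording is exactly the one quoted above.

```shell
# cluster_uuids (hypothetical helper): read log lines on stdin and print
# "<uuid of the cluster being joined> <uuid stored locally on the node>".
cluster_uuids() {
  sed -n 's/.*different cluster uuid \([^ ]*\) than local cluster uuid \([^ ,]*\).*/\1 \2/p'
}

# Possible usage (the pod name is a placeholder):
#   kubectl logs <data-pod> | cluster_uuids
```

The first value is the UUID of the new master nodes' cluster, the second is the one still stored in the data node's data folder.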
How can I fix the data nodes' cluster state without removing the data folder?
Edit:
I know why it is broken, and I know a basic solution is to remove the data folder and restart the node (the Elastic forum has a lot of similar questions without answers). But I am looking for a production-aware solution, maybe with the elasticsearch-node tool (https://www.elastic.co/guide/en/elasticsearch/reference/current/node-tool.html)?
Using the elasticsearch-node utility, it is possible to reset the cluster state, so that the node can then join another cluster.
The tricky part is running this utility with Docker, because the Elasticsearch server must be stopped!
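The "server must be stopped" requirement could be handled roughly like this on Kubernetes. It is a sketch under assumptions: the nodes are assumed to run in a StatefulSet, and the function name `stop_nodes`, the `KUBECTL` override and the pod label passed in are hypothetical and have to match your own release.

```shell
# Assumption: KUBECTL may be overridden (e.g. KUBECTL=echo) for a dry run.
KUBECTL="${KUBECTL:-kubectl}"

# stop_nodes STATEFULSET POD_LABEL (hypothetical helper)
stop_nodes() {
  sts="$1"
  label="$2"
  # Scale to zero so that no Elasticsearch process is using the data folder.
  $KUBECTL scale statefulset "$sts" --replicas=0 || return 1
  # Block until the pods matching the label are really gone.
  $KUBECTL wait --for=delete pod -l "$label" --timeout=300s
}

# Possible usage (both names are placeholders):
#   stop_nodes data-nodes app=data-nodes
```

Note that `kubectl wait` may report an error if no pod matches the label, for example when the pods have already terminated.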
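Solution with Kubernetes:
First stop the Elasticsearch pods by scaling their StatefulSet to 0:
kubectl scale statefulset data-nodes --replicas=0
Then run the following Job once per PVC, with the data volume mounted.
job.yaml: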
apiVersion: batch/v1
kind: Job
metadata:
  name: test-fix-cluster-m[0-3]
spec:
  template:
    spec:
      containers:
      - args:
        - -c
        - yes | elasticsearch-node detach-cluster; yes | elasticsearch-node remove-customs '*'
        # uncomment for at least 1 PVC
        #- yes | elasticsearch-node unsafe-bootstrap -v
        command:
        - /bin/sh
        image: docker.elastic.co/elasticsearch/elasticsearch:7.10.1
        name: elasticsearch
        volumeMounts:
        - mountPath: /usr/share/elasticsearch/data
          name: es-data
      restartPolicy: Never
      volumes:
      - name: es-data
        persistentVolumeClaim:
          claimName: es-test-master-es-test-master-[0-3]
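Since `[0-3]` in the manifest stands for the ordinal of each PVC, applying the Job to every volume can be scripted. The following is a minimal sketch, assuming the manifest is saved as `job.yaml` with the literal text `[0-3]` left in place; `render_job`, `fix_all` and the `KUBECTL` override are hypothetical names, not part of any tool.

```shell
# Assumption: KUBECTL may be overridden (e.g. KUBECTL=echo) for a dry run.
KUBECTL="${KUBECTL:-kubectl}"

# render_job TEMPLATE INDEX: replace the literal "[0-3]" placeholder
# with one ordinal and print the resulting manifest on stdout.
render_job() {
  sed "s/\[0-3\]/$2/g" "$1"
}

# fix_all TEMPLATE INDEX...: apply one Job per PVC and wait for it to complete.
fix_all() {
  template="$1"
  shift
  for i in "$@"; do
    render_job "$template" "$i" | $KUBECTL apply -f - || return 1
    # The Job name comes from metadata.name in the manifest above.
    $KUBECTL wait --for=condition=complete "job/test-fix-cluster-m$i" --timeout=300s || return 1
  done
}

# Possible usage:
#   fix_all job.yaml 0 1 2 3
```

Once every Job has completed, the StatefulSet can be scaled back to its previous replica count with `kubectl scale`, and the nodes should join the new cluster.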
If you are interested, here is the code behind unsafe-bootstrap: https://github.com/elastic/elasticsearch/blob/master/server/src/main/java/org/elasticsearch/cluster/coordination/UnsafeBootstrapMasterCommand.java#L83
I have written a small story at https://medium.com/@thomasdecaux/fix-broken-elasticsearch-cluster-405ad67ee17c.