My Elasticsearch cluster does not need to run complicated queries. I am using Elasticsearch only for fast search on large datasets.
It is running fine. Searches are simple and fast.
But as the number of documents in the index grows, adding new documents gets slower and slower.
I would like to tune the Elasticsearch cluster so that it still returns search results quickly, but can also index new documents quickly even when the index reaches 100 GB or more.
I would
So what changes can I make to the above setup to improve indexing speed and performance, and to reduce errors such as Elasticsearch connection errors during indexing?
I am using AWS hosted Elasticsearch.
What else could I do?
Thanks!
Go to Control Panel | Indexing Options to monitor the indexing. The DisableBackOff = 1 option makes the indexing go faster than the default value. You can continue to work on the computer but indexing will continue in the background and is less likely to pause when other programs are running.
Slow queries are often caused by poorly written or expensive search queries, poorly configured Elasticsearch clusters or indices, or saturated CPU, memory, disk, and network resources on the cluster.
With Elasticsearch, you generally want the minimum and maximum heap sizes (-Xms and -Xmx) to match, so that the heap is not resized at runtime. So when you test heap sizes on your cluster, make sure both values are the same. Elasticsearch's current guide states that there is an "ideal sweet spot" at around 64 GB of RAM per machine.
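As a rough sketch of the matching min/max rule, the snippet below derives a heap size from a node's RAM and prints the two JVM flags with the same value. The "half of RAM, capped at 31 GB" rule and the helper names heap_size_gb and jvm_heap_options are assumptions for illustration, not an official formula. On a managed service such as AWS hosted Elasticsearch the heap may not be configurable by hand, so treat this as applying to self-managed nodes.

```python
from typing import List


def heap_size_gb(ram_gb: int, cap_gb: int = 31) -> int:
    """Return a heap size in GB: half of RAM, capped at cap_gb.

    The 31 GB cap is a rule-of-thumb assumption, chosen to stay
    below the ~32 GB compressed-pointer threshold of the JVM.
    """
    return max(1, min(ram_gb // 2, cap_gb))


def jvm_heap_options(ram_gb: int) -> List[str]:
    """Return -Xms/-Xmx flags with identical values so the heap never resizes."""
    heap = heap_size_gb(ram_gb)
    return [f"-Xms{heap}g", f"-Xmx{heap}g"]


if __name__ == "__main__":
    for ram in (16, 64, 128):
        print(ram, "GB RAM:", " ".join(jvm_heap_options(ram)))
```

Both flags always carry the same number, which is the point of the advice above: the JVM allocates the full heap at startup and never grows or shrinks it.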
When you index documents, your ES cluster also has to copy that data to replica shards on other nodes. Several changes can improve indexing performance.
1 - Set a large refresh_interval while indexing. A refresh makes newly indexed documents visible to search, so refreshing less often creates fewer small segments and makes indexing faster. The trade-off is that new documents take longer to show up in search results.
2 - Use an optimal batch size when bulk indexing. Test a few sizes to find what your cluster handles best.
3 - Set the heap size properly. For example, on a node with 64 GB of RAM, a heap of about 31 GB is optimal. For details - https://www.elastic.co/guide/en/elasticsearch/guide/current/heap-sizing.html
4 - Increase the file descriptor and mmap limits - https://www.elastic.co/guide/en/elasticsearch/guide/current/_file_descriptors_and_mmap.html
5 - If you transform your data during ingestion, a dedicated ingest node can be used - https://www.elastic.co/guide/en/elasticsearch/reference/current/modules-node.html
6 - Disable replication during a big indexing run (you can enable it again afterwards).
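Points 1, 2 and 6 can be sketched in a few lines of Python. This is a minimal illustration, not a tuned implementation: it only builds the request bodies, and sending them is left to whatever HTTP client you use. The settings bodies are meant for PUT /<index>/_settings and the bulk bodies for POST /_bulk with Content-Type application/x-ndjson. The helper names, the index name "logs", and the batch size of 1000 are assumptions. Older Elasticsearch versions also expect a _type in the action line, which is omitted here.

```python
import json
from typing import Dict, Iterable, Iterator


def bulk_load_settings() -> Dict:
    """Settings to apply before a big load: no refresh, no replicas."""
    return {"index": {"refresh_interval": "-1", "number_of_replicas": 0}}


def restore_settings(refresh_interval: str = "1s", replicas: int = 1) -> Dict:
    """Settings to apply after the load; the defaults here are assumptions."""
    return {"index": {"refresh_interval": refresh_interval,
                      "number_of_replicas": replicas}}


def bulk_bodies(index: str, docs: Iterable[Dict],
                batch_size: int = 1000) -> Iterator[str]:
    """Yield newline-delimited JSON bodies with at most batch_size docs each."""
    lines = []
    count = 0
    for doc in docs:
        # Each document needs an action line followed by its source line.
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
        count += 1
        if count == batch_size:
            # The bulk API requires the body to end with a newline.
            yield "\n".join(lines) + "\n"
            lines, count = [], 0
    if lines:
        yield "\n".join(lines) + "\n"


if __name__ == "__main__":
    docs = ({"id": i, "message": f"event {i}"} for i in range(2500))
    print("before load:", json.dumps(bulk_load_settings()))
    for n, body in enumerate(bulk_bodies("logs", docs), start=1):
        print(f"batch {n}: {len(body)} bytes")
    print("after load:", json.dumps(restore_settings()))
```

A reasonable way to pick the batch size is to start small, double it, and stop when indexing throughput no longer improves or the cluster starts rejecting requests.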