 

Slow indexing speed in Elasticsearch

We deployed ES 2.0 on three EC2 c4.4xlarge nodes (16 cores, 32 GB memory each), allocating 16 GB to ES, with a 500 GB io1 volume (4000 IOPS) attached to each node.

Problem: We expected great performance from this hardware configuration, but we are seeing very slow indexing.

Our documents are about 10-50 KB in size, and we insert them with the Java transport client. The speed was all right for the first 50,000 documents, at roughly 1000/second, and then dropped dramatically to 100-200/second.

Meanwhile, resource consumption stays low:

  1. CPU is about 1-20% only (16 Core CPU)
  2. IO write is about 4-10Mb/second only
  3. Memory consumption is about 20-30% only

Requirements: I cannot understand why indexing is so slow while all the resources are so idle. What can I do to improve the efficiency? Thanks.
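A rough back-of-the-envelope check of the figures above can help put the disk numbers in context. This is only a sketch: it assumes documents of 10-50 KB, counts raw payload only, and ignores replicas, the translog, and segment merges. The function name `payload_mb_per_s` is made up for illustration.

```python
def payload_mb_per_s(docs_per_s, doc_kb):
    """Raw document payload rate in MB/s (1 MB = 1024 KB)."""
    return docs_per_s * doc_kb / 1024.0

# Fast phase: roughly 1000 docs/second at 10 KB and at 50 KB per document.
fast = (payload_mb_per_s(1000, 10), payload_mb_per_s(1000, 50))

# Slow phase: 100 docs/second of small documents up to 200 docs/second of large ones.
slow = (payload_mb_per_s(100, 10), payload_mb_per_s(200, 50))

print("fast phase: %.1f to %.1f MB/s" % fast)
print("slow phase: %.1f to %.1f MB/s" % slow)
```

Under these assumptions the slow phase amounts to only about 1-10 MB/s of raw payload, the same order of magnitude as the write rate reported in the list above. That would suggest the disks look idle because little data is reaching them, not because they are the bottleneck.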

Here is the config file we are using:

cluster.name: {{ env }}-{{ app }}
path.data: /data/es
path.logs: /data/es-logs
network.host: 0.0.0.0
discovery.zen.ping.unicast.hosts: ["xxxx"]
bootstrap.mlockall: true
threadpool.search.queue_size: 300
threadpool.index.type: fixed
threadpool.index.size: 16
threadpool.index.queue_size: 250000
index.refresh_interval: 1s
index.translog.flush_threshold_ops: 50000
indices.memory.index_buffer_size: 30%
indices.memory.min_shard_index_buffer_size: 12mb
indices.memory.min_index_buffer_size: 96mb
script.inline: on
script.indexed: on
http.cors.enabled: true
http.cors.allow-origin: /https?:\/\/localhost(:[0-9]+)?/

Here are htop and iostat while running the job:

[Image: htop output]

[Image: iostat output]

PeiSong Xiong asked Nov 27 '15 07:11





1 Answer

Upgrade ES to the latest version: recent releases have been made more production friendly, and the most stable release now is the latest one, 2.3.

You can try the following things to make indexing go faster:

  1. Set up dedicated master nodes, separate from the data nodes, as this reduces load across your whole cluster.
  2. Disable OS swapping (ES takes care of that) and check the heap size on all your machines: Heap Sizing
  3. Check that your documents are always of similar size. You can use bulk indexing and tune its settings, such as the chunk size in number of records or in bytes.
  4. If you are using scripts, try to optimize them, as they slow indexing down. Where possible, compute the scripted value in a preprocessing step and store it, as ES is not designed to handle scripting.
  5. Check the number of shards per node and try to balance it across nodes using Routing.
  6. Read more on how the ES team suggests running a production-ready system: Elasticsearch in Production
  7. One more blog post on improving Elasticsearch indexing performance: Performance Considerations for Elasticsearch Indexing
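To illustrate the bulk-indexing point, here is a minimal sketch of splitting documents into bulk request bodies capped by both record count and byte size. It is written in Python only to show the chunking logic, and the function name, index and type names, and default limits are all hypothetical values chosen for illustration, not recommendations from the ES documentation. With the Java transport client, `BulkProcessor` offers comparable count-based and size-based thresholds.

```python
import json

def chunk_bulk_actions(docs, index, doc_type, max_docs=500,
                       max_bytes=5 * 1024 * 1024):
    """Yield newline-delimited bulk bodies limited by count and byte size.

    A single document larger than max_bytes still gets a batch of its own.
    """
    lines, size, count = [], 0, 0
    for doc in docs:
        # Each bulk entry is an action line followed by the document source.
        action = json.dumps({"index": {"_index": index, "_type": doc_type}})
        entry = action + "\n" + json.dumps(doc) + "\n"
        entry_size = len(entry.encode("utf-8"))
        # Flush the current batch before it would exceed either limit.
        if count and (count >= max_docs or size + entry_size > max_bytes):
            yield "".join(lines)
            lines, size, count = [], 0, 0
        lines.append(entry)
        size += entry_size
        count += 1
    if lines:
        yield "".join(lines)

# Hypothetical usage: five tiny documents, at most two per bulk request.
docs = [{"n": i} for i in range(5)]
bodies = list(chunk_bulk_actions(docs, "my-index", "my-type", max_docs=2))
print(len(bodies))
```

Each yielded string could be sent as the body of one `_bulk` request. Capping by bytes as well as by record count keeps request sizes steady when documents range from 10 KB to 50 KB.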

See this answer for an optimal way to set up the ELK Stack on three servers: Optimal way to set up ELK stack on three servers

Sumit answered Oct 31 '22 07:10
