 

How do Elasticsearch and Lucene share memory?

I have a question about the following quote from the official ES docs:

But if you give all available memory to Elasticsearch’s heap, 
there won’t be any left over for Lucene. 
This can seriously impact the performance of full-text search.

My server has 80G of memory, and I issued the following command to start the ES node: bin/elasticsearch -Xmx30g. That means I give the ES process at most 30G of memory. How can Lucene use the remaining 50G, since Lucene runs inside the ES process and is just part of that process?

Jack asked Feb 05 '16 19:02


2 Answers

The Xmx parameter simply indicates how much heap you allocate to the ES Java process. But allocating RAM to the heap is not the only way to use the available memory on a server.

Lucene does indeed run inside the ES process, but Lucene doesn't only make use of the allocated heap, it also uses memory by heavily leveraging the file system cache for managing index segment files.

There are two great blog posts (this one and this other one) from Lucene's main committer which explain in greater detail how Lucene leverages all the remaining available memory.

The bottom line is to allocate 30GB heap to the ES process (using -Xmx30g) and then Lucene will happily consume whatever is left to do what needs to be done.
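As a sketch of that split on an 80G box, assuming an Elasticsearch 1.x/2.x-era install (the question dates to early 2016), the heap can be set either through the ES_HEAP_SIZE environment variable or with explicit JVM flags:

```shell
# Give the ES JVM a 30 GB heap; the remaining ~50 GB stays with the OS,
# where the filesystem cache uses it to keep Lucene's segment files hot.
export ES_HEAP_SIZE=30g        # sets both -Xms and -Xmx for the JVM
./bin/elasticsearch -d

# Equivalent explicit flags on the startup script:
# ./bin/elasticsearch -Xms30g -Xmx30g
```

Setting -Xms and -Xmx to the same value avoids heap resizing pauses; everything not claimed by the heap remains available to the OS page cache.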

Val answered Sep 27 '22 19:09


Lucene uses off-heap memory via the OS. This is described in the Elasticsearch guide, in the section on heap sizing and swapping.

Lucene is designed to leverage the underlying OS for caching in-memory data structures. Lucene segments are stored in individual files. Because segments are immutable, these files never change. This makes them very cache friendly, and the underlying OS will happily keep hot segments resident in memory for faster access.
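The mechanism behind this is memory-mapped file I/O: when an immutable segment file is read through a mapping, the bytes are served from the kernel's page cache, which lives outside the JVM heap. A minimal Java sketch of that idea (the "segment" file and its contents are hypothetical stand-ins, not real Lucene data structures):

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MmapSketch {
    public static void main(String[] args) throws IOException {
        // Write an immutable "segment" stand-in file.
        Path path = Files.createTempFile("segment", ".bin");
        Files.write(path, "term:elasticsearch".getBytes(StandardCharsets.UTF_8));

        // Map it read-only: the mapping is backed directly by the OS page
        // cache, so repeated reads cost no JVM heap beyond the small buffer
        // object itself.
        try (FileChannel ch = FileChannel.open(path, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            byte[] out = new byte[buf.remaining()];
            buf.get(out);
            System.out.println(new String(out, StandardCharsets.UTF_8));
        }
        Files.delete(path);
    }
}
```

Because segments never change, the cached pages never need invalidation, which is why the OS can keep hot segments resident so cheaply.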

NilsH answered Sep 27 '22 19:09