Best way to import/export data from Elasticsearch using Java client

I am ingesting millions of records into Elasticsearch and also extracting records from it, using the Elasticsearch Java client. I create only one client per JVM and use that same client for both ingestion and extraction. The extracted data is written to files and analyzed; the results are written to files again and then ingested back into Elasticsearch.

  1. Is it best to create a single Java client per JVM and keep it alive?

  2. Or should I create a client when needed, ingest/extract the data, and then close it?

  3. Or should I create a pool of clients and reuse them (like connection pooling)?

What is the best way to do this?

Sky asked Jul 14 '16 18:07


1 Answer

It's a really good question. In my experience with large, highly scalable Elasticsearch systems, I have not seen more than one ES client in a single JVM. These clients are thread-safe and are meant to be used as singletons, so you should also stick to a single ES client per JVM, for the reasons below.

  1. For performance, optimize on the Elasticsearch side rather than by adding clients. In your case, for example, you can use a dedicated ingest node to improve ingestion speed.
  2. You can also use a dedicated coordinating node to further improve search speed.
  3. Having fewer clients per JVM also means lower memory usage and less confusion in your application.
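
As a rough illustration of the "one client per JVM" advice, here is a minimal sketch of a lazily created, shared instance using the initialization-on-demand holder idiom. `StubEsClient`, `EsClientHolder`, and `SharedClientDemo` are hypothetical names invented for this example; `StubEsClient` is only a stand-in for whichever thread-safe Elasticsearch Java client class you actually use, and its constructor would be replaced by your real client setup.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a thread-safe Elasticsearch client (not a real library class).
final class StubEsClient implements AutoCloseable {
    // Counts how many client objects were ever constructed in this JVM.
    static final AtomicInteger CREATED = new AtomicInteger();
    private volatile boolean closed = false;

    StubEsClient() {
        CREATED.incrementAndGet();
    }

    boolean isClosed() {
        return closed;
    }

    @Override
    public void close() {
        closed = true;
    }
}

// Initialization-on-demand holder: the JVM initializes Lazy.INSTANCE exactly once,
// on first access, and class initialization is thread-safe.
final class EsClientHolder {
    private EsClientHolder() {}

    private static final class Lazy {
        static final StubEsClient INSTANCE = new StubEsClient();
    }

    static StubEsClient get() {
        return Lazy.INSTANCE;
    }
}

final class SharedClientDemo {
    // Starts `threads` workers that each fetch the client, then returns
    // how many distinct client instances the workers saw.
    static int distinctClientsSeen(int threads) {
        Set<StubEsClient> seen = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> seen.add(EsClientHolder.get()));
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println("distinct clients: " + distinctClientsSeen(8));
        // Close the shared client once, when the application shuts down.
        EsClientHolder.get().close();
    }
}
```

Ingest and extract threads would all call `EsClientHolder.get()` instead of constructing their own client, and the client is closed once at shutdown rather than after each operation.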
Amit answered Sep 20 '22 05:09