
Elasticsearch Java High-Level REST Client establishes many TCP connections and doesn't close them after indexing data

I have a periodic job that runs every second (the interval is configurable).

In this job, I first create a connection to the Elasticsearch server:

RestHighLevelClient client = new RestHighLevelClient(
                    RestClient.builder(new HttpHost(address, port, "http")));

Then I check for the existence of a special index called test. If it doesn't exist, I create it first.

GetIndexRequest indexRequest = new GetIndexRequest();
indexRequest.indices("test");
boolean testIndexIsExists = false;
try {
    testIndexIsExists = client.indices().exists(indexRequest, RequestOptions.DEFAULT);
} catch (IOException ioe) {
    logger.error("Can't check the existence of test index in Elasticsearch!");
}
if (testIndexIsExists) {
    // bulk request...
} else {
    CreateIndexRequest testIndex = new CreateIndexRequest("test");
    try {
        testIndex.mapping("doc", mappingConfiguration);
        client.indices().create(testIndex, RequestOptions.DEFAULT);
        // bulk request...
    } catch (IOException ioe) {
        logger.error("Can't create test index in Elasticsearch");
    }
}

After a bulk request of nearly 2000 documents, I close the Elasticsearch client:

client.close();

Java High Level REST Client version:

<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>6.4.0</version>
</dependency>

My problem is that many TCP connections are established and never closed. Over time, these connections use up all the TCP connections available to the operating system.

I'm also a bit confused: should the RestHighLevelClient instance be a singleton for the entire application, or must I create a new instance on every run of the job and close it when that run finishes?

Saeed Hassanvand asked Feb 18 '19 11:02


1 Answer

The high-level client already maintains a connection pool for you, so I would use it as a singleton. Constantly creating and closing connection pools is expensive, and both the client and its underlying HTTP connection pool are thread-safe. Also, calling close() on the client just delegates to the underlying Apache HTTP client's shutdown logic, so you're at the mercy of how that library handles cleanup and releases resources.
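To make the lifecycle concrete, here is a minimal sketch of the "create once, reuse on every run, close at shutdown" pattern. StubClient and IndexingJob are hypothetical names, and StubClient is only a stand-in for RestHighLevelClient so that the example runs without an Elasticsearch dependency. It counts how many clients are opened and closed rather than opening real connections.

```java
import java.io.Closeable;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for RestHighLevelClient (assumption, not the real API).
// It only counts how many "connection pools" are opened and closed.
final class StubClient implements Closeable {
    static final AtomicInteger opened = new AtomicInteger();
    static final AtomicInteger closed = new AtomicInteger();

    StubClient() {
        opened.incrementAndGet();
    }

    void bulkIndex(int docs) {
        // A real client would send a bulk request here.
    }

    @Override
    public void close() {
        closed.incrementAndGet();
    }
}

// Owns exactly one client for the lifetime of the application.
final class IndexingJob implements Closeable {
    private final StubClient client = new StubClient();

    // Called on every scheduler tick; reuses the shared client instead of creating one.
    void runOnce() {
        client.bulkIndex(2000);
    }

    // Called once, when the application shuts down.
    @Override
    public void close() {
        client.close();
    }
}

final class SharedClientDemo {
    public static void main(String[] args) {
        try (IndexingJob job = new IndexingJob()) {
            for (int i = 0; i < 5; i++) {
                job.runOnce();
            }
        }
        System.out.println("opened=" + StubClient.opened.get()
                + ", closed=" + StubClient.closed.get());
    }
}
```

The number of clients created stays at one no matter how many times the job runs. That is the property you want from the real client.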

If you're using Spring or some other DI framework, it's easy to create a singleton instance of the client that can be injected wherever it's needed. You can also make the call to client.close() part of the bean's shutdown/destroy lifecycle phase.

Quick example using Spring Boot:

@Configuration
@ConditionalOnClass(RestHighLevelClient.class)
public class ElasticSearchConfiguration {

    @Value("${elasticsearch.address}")
    String address;

    @Value("${elasticsearch.port}")
    int port;

    @Bean(destroyMethod = "close")
    public RestHighLevelClient restHighLevelClient() {
        return new RestHighLevelClient(
                RestClient.builder(new HttpHost(address, port, "http")));
    }
}

Note: By default, Spring infers a public close() or shutdown() method on a @Bean as its destroy method and calls it when the bean is destroyed, so the explicit destroyMethod = "close" above just makes that behaviour visible. Other frameworks may require you to specify how shutdown should be handled.
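If you aren't using a DI framework at all, one way to get similar behaviour in plain Java is a small holder that creates the shared instance lazily and closes it exactly once, optionally from a JVM shutdown hook. This is only a sketch: SharedResource is a hypothetical helper, not part of the Elasticsearch client or of Spring.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Supplier;

// Hypothetical holder: lazily creates one shared resource and closes it at most once.
final class SharedResource<T extends Closeable> implements Closeable {
    private final Supplier<T> factory;
    private T instance;
    private boolean closed;

    SharedResource(Supplier<T> factory) {
        this.factory = factory;
    }

    // Returns the same instance on every call, creating it on first use.
    synchronized T get() {
        if (closed) {
            throw new IllegalStateException("resource already closed");
        }
        if (instance == null) {
            instance = factory.get();
        }
        return instance;
    }

    // Safe to call more than once; only the first call closes the resource.
    @Override
    public synchronized void close() {
        if (closed) {
            return;
        }
        closed = true;
        if (instance != null) {
            try {
                instance.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }

    // Registers a hook that closes the resource when the JVM shuts down in an orderly way.
    void closeOnJvmShutdown() {
        Runtime.getRuntime().addShutdownHook(new Thread(this::close));
    }
}
```

Since RestHighLevelClient implements Closeable, the supplier passed to the holder could build the real client exactly as in the configuration class above, and the periodic job would call get() on each run instead of constructing a new client.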

Mike answered Oct 22 '22 12:10