I'm getting a SocketTimeoutException while retrieving/inserting data from/to elastic. It happens at around 10-30 requests/second. These requests are a combination of get/put.
Here is my elastic configuration:
3 master nodes, each with 4GB RAM
2 data nodes, each with 8GB RAM
Rest High Level Client:
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>7.2.0</version>
</dependency>
<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch</artifactId>
    <version>7.2.0</version>
</dependency>
Index Information:
27.2 MB & Primaries: 12.2 MB
{
  "dev-index": {
    "mappings": {
      "properties": {
        "dataObj": {
          "type": "object",
          "enabled": false
        },
        "generatedID": {
          "type": "keyword"
        },
        "transNames": { // an array of strings
          "type": "keyword"
        }
      }
    }
  }
}
Following is my elastic config file. It defines two connection beans, one for reading from and one for writing to elastic.
ElasticConfig.java:
@Configuration
public class ElasticConfig {

    @Value("${elastic.host}")
    private String elasticHost;

    @Value("${elastic.port}")
    private int elasticPort;

    @Value("${elastic.user}")
    private String elasticUser;

    @Value("${elastic.pass}")
    private String elasticPass;

    @Value("${elastic-timeout:20}")
    private int timeout;

    @Bean(destroyMethod = "close")
    @Qualifier("readClient")
    public RestHighLevelClient readClient(){
        final CredentialsProvider credentialsProvider = new BasicCredentialsProvider();
        credentialsProvider.setCredentials(AuthScope.ANY, new UsernamePasswordCredentials(elasticUser, elasticPass));

        RestClientBuilder builder = RestClient
                .builder(new HttpHost(elasticHost, elasticPort))
                .setHttpClientConfigCallback(httpClientBuilder ->
                        httpClientBuilder
                                .setDefaultCredentialsProvider(credentialsProvider)
                                .setDefaultIOReactorConfig(IOReactorConfig.custom().setIoThreadCount(5).build())
                );

        builder.setRequestConfigCallback(requestConfigBuilder ->
                requestConfigBuilder
                        .setConnectTimeout(10000)
                        .setSocketTimeout(60000)
                        .setConnectionRequestTimeout(0)
        );

        RestHighLevelClient restClient = new RestHighLevelClient(builder);
        return restClient;
    }

    @Bean(destroyMethod = "close")
    @Qualifier("writeClient")
    public RestHighLevelClient writeClient(){
        final CredentialsProvider credentialsProvider = new BasicCredentialsProvider();
        credentialsProvider.setCredentials(AuthScope.ANY, new UsernamePasswordCredentials(elasticUser, elasticPass));

        RestClientBuilder builder = RestClient
                .builder(new HttpHost(elasticHost, elasticPort))
                .setHttpClientConfigCallback(httpClientBuilder ->
                        httpClientBuilder
                                .setDefaultCredentialsProvider(credentialsProvider)
                                .setDefaultIOReactorConfig(IOReactorConfig.custom().setIoThreadCount(5).build())
                );

        builder.setRequestConfigCallback(requestConfigBuilder ->
                requestConfigBuilder
                        .setConnectTimeout(10000)
                        .setSocketTimeout(60000)
                        .setConnectionRequestTimeout(0)
        );

        RestHighLevelClient restClient = new RestHighLevelClient(builder);
        return restClient;
    }
}
Here is the function that calls elastic: if the data is available in elastic it returns it; otherwise it generates the data and puts it into elastic.
public Object getData(Request request) {
    DataObj elasticResult = elasticService.getData(request);
    if (elasticResult != null) {
        return elasticResult;
    }
    else {
        // code to generate data
        DataObj generatedData = getData(); // some function that generates the data
        // put the generated data into elastic with an async call
        elasticAsync.putData(generatedData);
        return generatedData;
    }
}
ElasticService.java getData Function:
@Service
public class ElasticService {

    // Declared here because the catch block below logs through LOGGER
    private static final Logger LOGGER = Logger.getLogger(ElasticService.class.getName());

    @Value("${elastic.index}")
    private String elasticIndex;

    @Autowired
    @Qualifier("readClient")
    private RestHighLevelClient readClient;

    public DataObj getData(Request request){
        String generatedId = request.getGeneratedID();

        GetRequest getRequest = new GetRequest()
                .index(elasticIndex)  // elastic index name
                .id(generatedId);     // look up the document by its elastic _id (key-value style)

        DataObj result = null;
        try {
            GetResponse response = readClient.get(getRequest, RequestOptions.DEFAULT);
            if (response.isExists()) {
                ObjectMapper objectMapper = new ObjectMapper();
                result = objectMapper.readValue(response.getSourceAsString(), DataObj.class);
            }
        } catch (Exception e) {
            LOGGER.error("Exception occurred during fetch from elastic !!!!", e);
        }
        return result;
    }
}
ElasticAsync.java Async Put Data Function:
@Service
public class ElasticAsync {

    private static final Logger LOGGER = Logger.getLogger(ElasticAsync.class.getName());

    @Value("${elastic.index}")
    private String elasticIndex;

    @Autowired
    @Qualifier("writeClient")
    private RestHighLevelClient writeClient;

    @Async
    public void putData(DataObj generatedData){
        // ElasticVO matches the structure of the index mapping given above
        ElasticVO updatedRequest = toElasticVO(generatedData);
        try {
            ObjectMapper objectMapper = new ObjectMapper();
            String jsonString = objectMapper.writeValueAsString(updatedRequest);

            IndexRequest request = new IndexRequest(elasticIndex);
            request.id(generatedData.getGeneratedID());
            request.source(jsonString, XContentType.JSON);
            request.setRefreshPolicy(WriteRequest.RefreshPolicy.NONE);
            request.timeout(TimeValue.timeValueSeconds(5));
            IndexResponse indexResponse = writeClient.index(request, RequestOptions.DEFAULT);
            LOGGER.info("response id: " + indexResponse.getId());
        } catch (Exception e) {
            LOGGER.error("Exception occurred during saving into elastic !!!!", e);
        }
    }
}
Here is part of the stack trace from when the exception occurs while saving data into elastic:
2019-07-19 07:32:19.997 ERROR [data-retrieval,341e6ecc5b10f3be,1eeb0722983062b2,true] 1 --- [askExecutor-894] a.c.s.a.service.impl.ElasticAsync : Exception occurred during saving into elastic !!!!
java.net.SocketTimeoutException: 60,000 milliseconds timeout on connection http-outgoing-34 [ACTIVE]
at org.elasticsearch.client.RestClient.extractAndWrapCause(RestClient.java:789) ~[elasticsearch-rest-client-7.2.0.jar!/:7.2.0]
at org.elasticsearch.client.RestClient.performRequest(RestClient.java:225) ~[elasticsearch-rest-client-7.2.0.jar!/:7.2.0]
at org.elasticsearch.client.RestClient.performRequest(RestClient.java:212) ~[elasticsearch-rest-client-7.2.0.jar!/:7.2.0]
at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1448) ~[elasticsearch-rest-high-level-client-7.2.0.jar!/:7.2.0]
at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1418) ~[elasticsearch-rest-high-level-client-7.2.0.jar!/:7.2.0]
at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1388) ~[elasticsearch-rest-high-level-client-7.2.0.jar!/:7.2.0]
at org.elasticsearch.client.RestHighLevelClient.index(RestHighLevelClient.java:836) ~[elasticsearch-rest-high-level-client-7.2.0.jar!/:7.2.0]
Caused by: java.net.SocketTimeoutException: 60,000 milliseconds timeout on connection http-outgoing-34 [ACTIVE]
at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.timeout(HttpAsyncRequestExecutor.java:387) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:92) ~[httpasyncclient-4.1.3.jar!/:4.1.3]
at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:39) ~[httpasyncclient-4.1.3.jar!/:4.1.3]
at org.apache.http.impl.nio.reactor.AbstractIODispatch.timeout(AbstractIODispatch.java:175) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
at org.apache.http.impl.nio.reactor.BaseIOReactor.sessionTimedOut(BaseIOReactor.java:263) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
at org.apache.http.impl.nio.reactor.AbstractIOReactor.timeoutCheck(AbstractIOReactor.java:492) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
at org.apache.http.impl.nio.reactor.BaseIOReactor.validate(BaseIOReactor.java:213) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:280) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
... 1 common frames omitted
Here is part of the stack trace from when the exception occurs while retrieving data from elastic:
2019-07-19 07:22:37.844 ERROR [data-retrieval,104cf6b2ab5b3349,b302d3d3cd7ebc84,true] 1 --- [o-8080-exec-346] a.c.s.a.service.impl.ElasticService : Exception occurred during fetch from elastic !!!!
java.net.SocketTimeoutException: 60,000 milliseconds timeout on connection http-outgoing-30 [ACTIVE]
at org.elasticsearch.client.RestClient.extractAndWrapCause(RestClient.java:789) ~[elasticsearch-rest-client-7.1.1.jar!/:7.1.1]
at org.elasticsearch.client.RestClient.performRequest(RestClient.java:225) ~[elasticsearch-rest-client-7.1.1.jar!/:7.1.1]
at org.elasticsearch.client.RestClient.performRequest(RestClient.java:212) ~[elasticsearch-rest-client-7.1.1.jar!/:7.1.1]
at org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1433) ~[elasticsearch-rest-high-level-client-7.1.1.jar!/:7.1.1]
at org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1403) ~[elasticsearch-rest-high-level-client-7.1.1.jar!/:7.1.1]
at org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1373) ~[elasticsearch-rest-high-level-client-7.1.1.jar!/:7.1.1]
at org.elasticsearch.client.RestHighLevelClient.get(RestHighLevelClient.java:699) ~[elasticsearch-rest-high-level-client-7.1.1.jar!/:7.1.1]
Caused by: java.net.SocketTimeoutException: 60,000 milliseconds timeout on connection http-outgoing-30 [ACTIVE]
at org.apache.http.nio.protocol.HttpAsyncRequestExecutor.timeout(HttpAsyncRequestExecutor.java:387) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:92) ~[httpasyncclient-4.1.3.jar!/:4.1.3]
at org.apache.http.impl.nio.client.InternalIODispatch.onTimeout(InternalIODispatch.java:39) ~[httpasyncclient-4.1.3.jar!/:4.1.3]
at org.apache.http.impl.nio.reactor.AbstractIODispatch.timeout(AbstractIODispatch.java:175) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
at org.apache.http.impl.nio.reactor.BaseIOReactor.sessionTimedOut(BaseIOReactor.java:263) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
at org.apache.http.impl.nio.reactor.AbstractIOReactor.timeoutCheck(AbstractIOReactor.java:492) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
at org.apache.http.impl.nio.reactor.BaseIOReactor.validate(BaseIOReactor.java:213) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
at org.apache.http.impl.nio.reactor.AbstractIOReactor.execute(AbstractIOReactor.java:280) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
at org.apache.http.impl.nio.reactor.BaseIOReactor.execute(BaseIOReactor.java:104) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
at org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor$Worker.run(AbstractMultiworkerIOReactor.java:591) ~[httpcore-nio-4.4.11.jar!/:4.4.11]
... 1 common frames omitted
I've gone through a couple of Stack Overflow posts and elastic-related blogs, which mention that this issue could be due to the RAM and cluster configuration of elastic. I then changed my shards from 5 to 2, as there are only two data nodes. I also increased the RAM of the data nodes from 4GB to 8GB, as I learned that elastic will use only 50% of total RAM. The exception now occurs less often, but the problem still persists.
What are possible ways to solve this problem? What am I missing in the Java/elastic configuration that makes it throw this SocketTimeoutException so frequently? Let me know if you need any more details about the configuration.
We had the same issue, and after quite some digging I found the root cause: a mismatch between the idle timeout of the firewall sitting between the client and the elastic servers and the kernel configuration for TCP keep-alive.
The firewall drops idle connections after 3600 seconds. The problem was that the kernel parameter for TCP keep-alive was set to 7200 seconds (the default in RedHat 6.x/7.x):
sysctl -n net.ipv4.tcp_keepalive_time
7200
So the connections are dropped before a keep-alive probe is ever sent. The async HTTP client inside the elastic REST client doesn't seem to handle dropped connections very well; it just waits until the socket timeout.
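The mismatch described above comes down to comparing two durations. Here is a minimal Java sketch of that check, assuming (as a simplification) that the firewall looks only at idle time and that the kernel sends its first probe once a connection has been idle for net.ipv4.tcp_keepalive_time. The class and method names are hypothetical, and the 600-second value is only an illustrative lower setting, not one taken from our setup:

```java
import java.time.Duration;

/** Sketch: is an idle connection dropped before the kernel sends its first keep-alive probe? */
final class KeepAliveCheck {

    /**
     * The first probe goes out only after the connection has been idle for keepAliveTime.
     * If the firewall's idle timeout is not longer than that, the firewall forgets the
     * connection before any probe can refresh it.
     */
    static boolean droppedBeforeFirstProbe(Duration firewallIdleTimeout, Duration keepAliveTime) {
        return firewallIdleTimeout.compareTo(keepAliveTime) <= 0;
    }

    public static void main(String[] args) {
        Duration firewall = Duration.ofSeconds(3600); // firewall idle timeout in our case
        Duration kernel = Duration.ofSeconds(7200);   // net.ipv4.tcp_keepalive_time in our case
        System.out.println(droppedBeforeFirstProbe(firewall, kernel));                  // true
        System.out.println(droppedBeforeFirstProbe(firewall, Duration.ofSeconds(600))); // false
    }
}
```

With the values from our environment the check is true, which matches what we saw: the firewall silently discarded connections that the client still considered open.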
So check whether you have any network device (load balancer, firewall, proxy, etc.) between your client and server that has a session timeout or similar, and either increase that timeout or lower the net.ipv4.tcp_keepalive_time kernel parameter.
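Independently of the system-wide kernel parameter, a JVM application can also request keep-alive on an individual socket. The sketch below uses only the JDK (java.net.Socket and jdk.net.ExtendedSocketOptions, Java 11 or later); KeepAliveSockets and enableKeepAlive are hypothetical names. It is not the elastic REST client's own configuration API, since that client manages its sockets internally, so treat it as an illustration of the socket options involved rather than a drop-in fix:

```java
import java.io.IOException;
import java.net.Socket;
import jdk.net.ExtendedSocketOptions;

/** Sketch: enable TCP keep-alive on a single socket instead of relying on kernel defaults. */
final class KeepAliveSockets {

    /**
     * Turns on SO_KEEPALIVE and, where the platform supports it, sets the idle time
     * (in seconds) after which the first keep-alive probe is sent for this socket.
     */
    static void enableKeepAlive(Socket socket, int idleSeconds) throws IOException {
        // Without SO_KEEPALIVE the kernel sends no probes at all for this socket.
        socket.setKeepAlive(true);

        // TCP_KEEPIDLE is a platform-specific option, so check for support first;
        // when unsupported, the kernel-wide default idle time still applies.
        if (socket.supportedOptions().contains(ExtendedSocketOptions.TCP_KEEPIDLE)) {
            socket.setOption(ExtendedSocketOptions.TCP_KEEPIDLE, idleSeconds);
        }
    }
}
```

Whatever idle time is chosen, it should be shorter than the shortest idle timeout of any device on the path (3600 seconds in our case).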