I am currently working with JEST: https://github.com/searchbox-io/Jest
Is it possible to do scan&scroll with this API?
http://www.elasticsearch.org/guide/reference/api/search/search-type/
I am currently using the Search command:
Search search = new Search("{\"size\" : "+RESULT_SIZE+", \"query\":{\"match_all\":{}}}");
but am worried about large result sets. If you use the Search command for this how do you set the "search_type=scan&scroll=10m&size=50" arguments?
Elasticsearch is a distributed search and analytics engine built on Apache Lucene. Since its release in 2010, Elasticsearch has quickly become the most popular search engine and is commonly used for log analytics, full-text search, security intelligence, business analytics, and operational intelligence use cases.
Verify elasticsearch is running by typing $ smarts/bin/sm_service show. 2. Verify elasticsearch is serving requests from a browser on the same machine in Windows or using a tool like curl on Linux. A page specific to the browser will appear.
The scroll parameter indicates how long Elasticsearch should retain the search context for the request. The search response returns a scroll ID in the _scroll_id response body parameter. You can then use the scroll ID with the scroll API to retrieve the next batch of results for the request.
By default, you cannot use from and size to page through more than 10,000 hits. This limit is a safeguard set by the index. max_result_window index setting. If you need to page through more than 10,000 hits, use the search_after parameter instead.
Is it possible to do scan&scroll with this API?
Yes it is. My implementation it's working like this.
Start the scroll search on elastic search:
public SearchResult startScrollSearch (String type, Long size) throws IOException {
String query = ConfigurationFactory.loadElasticScript("my_es_search_script.json");
Search search = new Search.Builder(query)
// multiple index or types can be added.
.addIndex("myIndex")
.addType(type)
.setParameter(Parameters.SIZE, size)
.setParameter(Parameters.SCROLL, "1m")
.build();
SearchResult searchResult = EsClientConn.getJestClient().execute(search);
return searchResult;
}
SearchResult object will return the first (size) itens off the search as usual but will return to a scrollId parameter that is a reference to remain resultSet that elasticSearch keeps in memory for you. Parameters.SCROLL, will define the time that this search will be keeped on memory.
For read the scrollId:
scrollId = searchResult.getJsonObject().get("_scroll_id").getAsString();
For read more items from the resultSet you should use something like follow:
public JestResult readMoreFromSearch(String scrollId, Long size) throws IOException {
SearchScroll scroll = new SearchScroll.Builder(scrollId, "1m")
.setParameter(Parameters.SIZE, size).build();
JestResult searchResult = EsClientConn.getJestClient().execute(scroll);
return searchResult;
}
Don't forget that each time you read from the result set a new scrollId is returned from elastic.
Please tell me if you have any doubt.
Agreed we need to catch up however please open an issue if you need a feature.
Please check https://github.com/searchbox-io/Jest/blob/master/jest/src/test/java/io/searchbox/core/SearchScrollIntegrationTest.java at master
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With