
Elasticsearch Scan&scroll with JEST API

I am currently working with JEST: https://github.com/searchbox-io/Jest

Is it possible to do scan&scroll with this API?

http://www.elasticsearch.org/guide/reference/api/search/search-type/

I am currently using the Search command:

Search search = new Search("{\"size\" : "+RESULT_SIZE+", \"query\":{\"match_all\":{}}}");

but I am worried about large result sets. If you use the Search command for this, how do you set the "search_type=scan&scroll=10m&size=50" arguments?

Ryan R. asked May 21 '13 00:05



2 Answers

Is it possible to do scan&scroll with this API?

Yes, it is. My implementation works like this.

Start the scroll search on Elasticsearch:

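    import java.io.IOException;

    import io.searchbox.core.Search;
    import io.searchbox.core.SearchResult;
    import io.searchbox.params.Parameters;

    // ConfigurationFactory and EsClientConn are application-specific helpers, not Jest classes.
    public SearchResult startScrollSearch(String type, Long size) throws IOException {
        String query = ConfigurationFactory.loadElasticScript("my_es_search_script.json");

        Search search = new Search.Builder(query)
                // Multiple indices or types can be added.
                .addIndex("myIndex")
                .addType(type)
                .setParameter(Parameters.SIZE, size)
                // Keep the search context alive for one minute.
                .setParameter(Parameters.SCROLL, "1m")
                .build();

        return EsClientConn.getJestClient().execute(search);
    }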

The SearchResult object contains the first (size) items of the search as usual, but it also carries a scrollId, which is a reference to the remaining result set that Elasticsearch keeps in memory for you. Parameters.SCROLL defines how long this search is kept in memory.

To read the scrollId:

    scrollId = searchResult.getJsonObject().get("_scroll_id").getAsString();

To read more items from the result set, use something like the following:

    import java.io.IOException;

    import io.searchbox.client.JestResult;
    import io.searchbox.core.SearchScroll;
    import io.searchbox.params.Parameters;

    public JestResult readMoreFromSearch(String scrollId, Long size) throws IOException {
        // "1m" renews the search context for another minute.
        SearchScroll scroll = new SearchScroll.Builder(scrollId, "1m")
                .setParameter(Parameters.SIZE, size)
                .build();

        return EsClientConn.getJestClient().execute(scroll);
    }

Don't forget that each time you read from the result set, Elasticsearch returns a new scrollId, so always pass the most recent one to the next request.
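To make the "always use the latest scrollId" point concrete, here is a minimal sketch of the loop that drains a scroll. It is deliberately independent of Jest so it can run on its own: Page, drain, and the in-memory fake are hypothetical names made up for illustration. In a real client, fetchNext would presumably wrap a call like readMoreFromSearch and parse the hits and the "_scroll_id" out of the returned JSON. It also assumes a plain scroll, where the first response already carries hits and an empty page means the result set is exhausted.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class ScrollLoopSketch {

    /** Hypothetical stand-in for one scroll response: its hits plus the scroll ID it returned. */
    public static final class Page {
        public final List<String> hits;
        public final String scrollId;

        public Page(List<String> hits, String scrollId) {
            this.hits = hits;
            this.scrollId = scrollId;
        }
    }

    /**
     * Collects hits page by page until a page comes back empty.
     * fetchNext takes the most recent scroll ID and returns the next page.
     */
    public static List<String> drain(Page first, Function<String, Page> fetchNext) {
        List<String> all = new ArrayList<>();
        Page page = first;
        while (!page.hits.isEmpty()) {
            all.addAll(page.hits);
            // Always continue with the ID from the latest response, not the first one.
            page = fetchNext.apply(page.scrollId);
        }
        return all;
    }

    public static void main(String[] args) {
        // In-memory fake of the server side, keyed by scroll ID (assumption for the demo).
        Map<String, Page> fake = new HashMap<>();
        fake.put("id-1", new Page(Arrays.asList("c", "d"), "id-2"));
        fake.put("id-2", new Page(Collections.<String>emptyList(), "id-3"));

        Page first = new Page(Arrays.asList("a", "b"), "id-1");
        System.out.println(drain(first, fake::get)); // prints [a, b, c, d]
    }
}
```

For very large result sets you would probably process each page inside the loop instead of collecting everything into one list.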

Let me know if anything is unclear.

Pedro Sequeira answered Sep 24 '22 23:09


Agreed, we need to catch up. However, please open an issue if you need a feature.

Please check the integration test on master: https://github.com/searchbox-io/Jest/blob/master/jest/src/test/java/io/searchbox/core/SearchScrollIntegrationTest.java

Ferhat Sobay answered Sep 22 '22 23:09