I want to jump to an arbitrary page of results in Elasticsearch. There are three ways to paginate in Elasticsearch: from/size, search_after, and scroll.
I know that Elasticsearch will read the data sequentially in any case. For example, if I want the 99th page, Elasticsearch has to read through the 98 preceding pages to reach it.
One thing I can do is reduce the data I fetch sequentially before the target: return minimal data for the first 98 pages and the complete data only for the 99th.
My main question is: if memory is not a concern, which approach is faster for sequentially reading through those 98 pages, search_after or scroll?
If I use scroll, I will clear the scroll context after every use.
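To make the idea in the question concrete, here is a minimal sketch of walking forward with search_after while requesting `_source` only for the target page. The `search` callable is hypothetical (it stands in for whatever wrapper you have around your client's search call and is assumed to take a request body and return the raw response dict), and the sort is assumed to end in a unique tiebreaker field so that pages do not overlap.

```python
from typing import Any, Callable, Dict, List, Optional

# Hypothetical type: a function that sends a search body and returns the response.
SearchFn = Callable[[Dict[str, Any]], Dict[str, Any]]


def fetch_page_with_search_after(
    search: SearchFn,
    page: int,
    size: int,
    sort: List[Dict[str, str]],
) -> List[Dict[str, Any]]:
    """Walk pages 1..page with search_after; fetch _source only on the last one."""
    after: Optional[List[Any]] = None
    for current in range(1, page + 1):
        body: Dict[str, Any] = {
            "size": size,
            "sort": sort,
            # Skipped pages only need sort values, so drop the document source.
            "_source": current == page,
        }
        if after is not None:
            body["search_after"] = after
        hits = search(body)["hits"]["hits"]
        if not hits:
            return []  # ran out of results before reaching the target page
        if current == page:
            return hits
        # The sort values of the last hit are the cursor for the next page.
        after = hits[-1]["sort"]
    return []
```

This still issues one request per skipped page; it only trims the payload of each one.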
Elasticsearch provides three ways of paginating data, each of them useful: from/size pagination, search_after pagination, and scroll pagination.
If a search request results in more than ten hits, Elasticsearch will, by default, return only the first ten. To override that default and retrieve more or fewer hits, we can add a size parameter to the search request body.
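As a rough illustration of from/size pagination, the sketch below builds a request body for a given page. The helper name and the `match_all` query are placeholders chosen for this example; the 10000 limit mirrors the default value of `index.max_result_window`.

```python
from typing import Any, Dict

DEFAULT_MAX_RESULT_WINDOW = 10_000  # default of index.max_result_window


def from_size_body(
    page: int,
    size: int = 10,
    max_window: int = DEFAULT_MAX_RESULT_WINDOW,
) -> Dict[str, Any]:
    """Build a from/size search body for a 1-based page number."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    offset = (page - 1) * size
    # Elasticsearch rejects requests where from + size exceeds the window.
    if offset + size > max_window:
        raise ValueError(
            f"from + size = {offset + size} exceeds max_result_window {max_window}"
        )
    return {"from": offset, "size": size, "query": {"match_all": {}}}
```

For example, `from_size_body(99, 10)` asks for hits 980 through 989, which is the 99th page of ten.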
The scroll parameter indicates how long Elasticsearch should retain the search context for the request. The search response returns a scroll ID in the _scroll_id response body parameter. You can then use the scroll ID with the scroll API to retrieve the next batch of results for the request.
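The scroll flow described above could be sketched roughly as follows. The three callables (`open_scroll`, `next_batch`, `clear_scroll`) are hypothetical stand-ins for the initial search with a scroll parameter, the scroll API call, and the clear-scroll call of whichever client you use; only the `_scroll_id` and `hits.hits` response fields are taken from the actual response format.

```python
from typing import Any, Callable, Dict, Iterator, List

Response = Dict[str, Any]


def scroll_pages(
    open_scroll: Callable[[str], Response],
    next_batch: Callable[[str, str], Response],
    clear_scroll: Callable[[str], None],
    keep_alive: str = "1m",
) -> Iterator[List[Dict[str, Any]]]:
    """Yield successive batches of hits, then clear the search context."""
    response = open_scroll(keep_alive)
    scroll_id = response.get("_scroll_id")
    try:
        while response["hits"]["hits"]:
            yield response["hits"]["hits"]
            response = next_batch(scroll_id, keep_alive)
            # Always use the most recently returned scroll ID.
            scroll_id = response.get("_scroll_id", scroll_id)
    finally:
        # Release the search context instead of waiting for it to expire.
        if scroll_id is not None:
            clear_scroll(scroll_id)
```

Clearing in a `finally` block matches the "clear it after every use" approach from the question, including when the loop ends with an error.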
For inputs that use the multiline codec, a field called "offset" is created and stored in Elasticsearch. What does this field represent, or what is it used for? I am guessing it is the position, in the log file that was parsed, of the first (or last) character of the log entry.
If you don't have memory concerns, then the simplest option is to increase the index setting index.max_result_window from 10000 to the number you require.
See https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules.html#dynamic-index-settings
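As a small sketch of that answer, the helper below computes the settings body you would send with `PUT /<your-index>/_settings` so that a given page becomes reachable with from/size. The function name is made up for this example, and it assumes you never want to go below the default of 10000.

```python
from typing import Any, Dict


def max_result_window_settings(page: int, size: int) -> Dict[str, Any]:
    """Settings body that makes the 1-based `page` of `size` hits reachable."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be positive")
    # from + size for the last hit of the target page is page * size.
    required = max(page * size, 10_000)
    return {"index": {"max_result_window": required}}
```

For instance, reaching page 2000 with ten hits per page needs a window of at least 20000.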