
How to do pagination with Elasticsearch: from/size vs. the scroll API

I'm using Elasticsearch as a database to store a large volume of log data. I know there are two ways to do pagination:

  1. Use the from and size parameters

  2. Use the scroll API

Right now I paginate with from: the front end sends page and size parameters, and the Java back end sets:

searchSourceBuilder.size(size);
searchSourceBuilder.from(page * size);

However, once page * size goes past 10000, Elasticsearch throws an exception (from + size may not exceed index.max_result_window, which defaults to 10000).
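The limit can be checked before the request is sent. This is a minimal sketch, assuming the index uses the default index.max_result_window of 10000 and that page is zero-based as in the snippet above; ResultWindow and its methods are hypothetical helpers, not part of the Elasticsearch client.

```java
// Hypothetical helper, not part of the Elasticsearch client.
final class ResultWindow {
    // Default value of the index.max_result_window index setting.
    static final int DEFAULT_MAX_RESULT_WINDOW = 10_000;

    // Offset of the first hit on a zero-based page (long avoids int overflow).
    static long fromOffset(int page, int size) {
        return (long) page * size;
    }

    // True when from + size stays within the window, so from/size paging is accepted.
    static boolean fitsWindow(int page, int size, int maxResultWindow) {
        return fromOffset(page, size) + size <= maxResultWindow;
    }

    public static void main(String[] args) {
        // from = 9900, from + size = 10000: still inside the window.
        System.out.println(fitsWindow(99, 100, DEFAULT_MAX_RESULT_WINDOW));
        // from = 10000, from + size = 10100: outside the window.
        System.out.println(fitsWindow(100, 100, DEFAULT_MAX_RESULT_WINDOW));
    }
}
```

A back end could call fitsWindow before building the request and fall back to another paging strategy when it returns false.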

Can I use the scroll API for pagination instead?

I know that if I use the scroll API, the search response returns a _scroll_id, which looks like a Base64 string.

How can I control the page and size?

It seems the scroll API only supports moving through pages in sequence?

Neilson3r asked Nov 14 '17 05:11




1 Answer

Elasticsearch has no way to jump directly to an arbitrary page, because the results have to be collected from different shards. In your case search_after is the better option. You walk forward page by page, reduce the amount of data returned for the intermediate queries, and fetch the complete data only once you reach the page that was actually requested.

Example: say you need to jump to page 99. You can reduce the amount of data returned for the first 98 page requests, and once you reach page 99, fetch the complete data.
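The walk can be sketched roughly as follows. This is an in-memory simulation of the search_after idea, not a call against a real cluster: SearchAfterDemo, LogEntry, sample and fetchPage are hypothetical names, and the sort key (timestamp with id as tie-breaker) is an assumption. With the Java high-level REST client, the cursor would instead come from SearchHit.getSortValues() and be passed to SearchSourceBuilder.searchAfter(...), and the intermediate requests could skip document bodies with fetchSource(false).

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// In-memory stand-in for a sorted index; not the Elasticsearch client.
final class SearchAfterDemo {
    // A log entry; (timestamp, id) is the sort key, and id breaks ties so the order is total.
    static final class LogEntry {
        final long timestamp;
        final String id;
        LogEntry(long timestamp, String id) { this.timestamp = timestamp; this.id = id; }
    }

    static final Comparator<LogEntry> ORDER =
            Comparator.<LogEntry>comparingLong(e -> e.timestamp).thenComparing(e -> e.id);

    // Returns up to `size` entries that sort strictly after `after` (null means start at the top).
    // `sorted` must already be ordered by ORDER.
    static List<LogEntry> searchAfter(List<LogEntry> sorted, LogEntry after, int size) {
        List<LogEntry> page = new ArrayList<>();
        for (LogEntry e : sorted) {
            if (page.size() >= size) break;
            if (after != null && ORDER.compare(e, after) <= 0) continue;
            page.add(e);
        }
        return page;
    }

    // Walks forward from the first page until the zero-based target page is reached.
    // Only the last entry of each intermediate page is needed as the next cursor.
    static List<LogEntry> fetchPage(List<LogEntry> sorted, int targetPage, int size) {
        List<LogEntry> page = searchAfter(sorted, null, size);
        for (int p = 0; p < targetPage && !page.isEmpty(); p++) {
            LogEntry cursor = page.get(page.size() - 1);
            page = searchAfter(sorted, cursor, size);
        }
        return page;
    }

    // Builds n sample entries, already sorted; pairs of entries share a timestamp.
    static List<LogEntry> sample(int n) {
        List<LogEntry> entries = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            entries.add(new LogEntry(1000L + i / 2, String.format("log-%03d", i)));
        }
        return entries;
    }

    public static void main(String[] args) {
        // Zero-based page 2 with 3 entries per page: log-006, log-007, log-008.
        for (LogEntry e : fetchPage(sample(10), 2, 3)) {
            System.out.println(e.id);
        }
    }
}
```

The tie-breaker in the sort key matters: without a unique last field, entries that share a timestamp could be skipped or repeated at a page boundary. The cost of reaching a deep page still grows with the page number, since every earlier page has to be requested first.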

TechnocratSid answered Sep 21 '22 09:09