How do I retrieve more than 10000 results/events in Elasticsearch?

Tags:

elasticsearch

Example query:

GET hostname:port /myIndex/_search
{
    "size": 10000,
    "query": {
        "term": { "field": "myField" }
    }
}

I have been using the size option knowing that:

index.max_result_window = 100000

But if my query matches 650,000 documents, for example, or even more, how can I retrieve all of the results in one GET?

I have been reading about the scroll, from/size, and pagination APIs, but none of them ever delivers more than 10K.

This is the example from Elasticsearch Forum, that I have been using:

GET /_search?scroll=1m

Can anybody provide an example where you can retrieve all the documents for a GET search query?

658

asked Jan 14 '17 22:01

Franco

2 Answers

Scroll is the way to go if you want to retrieve a high number of documents, high in the sense that it's way over the 10000 default limit, which can be raised.

The first request needs to specify the query you want to run and the scroll parameter, which sets how long the search context is kept alive before it times out (1 minute in the example below):

POST /index/type/_search?scroll=1m
{
    "size": 1000,
    "query": {
        "match" : {
            "title" : "elasticsearch"
        }
    }
}

In the response to that first call, you get a _scroll_id that you need to use to make the second call:

POST /_search/scroll
{
    "scroll" : "1m",
    "scroll_id" : "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAD4WYm9laVYtZndUQlNsdDcwakFMNjU1QQ=="
}

In each subsequent response, you'll get a new _scroll_id that you need to use for the next call, until you've retrieved the number of documents you need or a response comes back with no hits.

So in pseudo code it looks somewhat like this:

# first request
response = request('POST /index/type/_search?scroll=1m')
docs = [ response.hits ]
scroll_id = response._scroll_id

# subsequent requests
while (true) {
   response = request('POST /_search/scroll', scroll_id)
   # an empty page means everything has been retrieved
   if (response.hits is empty) break
   docs.push(response.hits)
   scroll_id = response._scroll_id
}
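The pseudo code above could be sketched in Python roughly as follows. This is only a minimal sketch using the standard library: the names `http_json` and `scroll_all`, the base URL, and the index name are assumptions for illustration, and the path omits the mapping type used in the example above. It stops when a page comes back empty and then clears the scroll with `DELETE /_search/scroll` so the search context does not linger until the timeout.

```python
import json
import urllib.request


def http_json(method, url, body=None):
    """Send a JSON request with the standard library and decode the JSON reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = urllib.request.Request(
        url, data=data, method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def scroll_all(base_url, index, query, page_size=1000, keep_alive="1m", send=http_json):
    """Yield every hit matching `query`, one scroll page at a time.

    `send` is a hypothetical injection point so the transport can be swapped out.
    """
    # First request: opens the search context and returns the first page.
    response = send(
        "POST",
        f"{base_url}/{index}/_search?scroll={keep_alive}",
        {"size": page_size, "query": query},
    )
    scroll_id = response.get("_scroll_id")
    try:
        # An empty page means all matching documents have been returned.
        while response["hits"]["hits"]:
            yield from response["hits"]["hits"]
            # Always pass on the most recently received scroll ID.
            response = send(
                "POST",
                f"{base_url}/_search/scroll",
                {"scroll": keep_alive, "scroll_id": scroll_id},
            )
            scroll_id = response.get("_scroll_id", scroll_id)
    finally:
        # Release the search context instead of waiting for it to time out.
        if scroll_id:
            send("DELETE", f"{base_url}/_search/scroll", {"scroll_id": scroll_id})
```

Assuming a node at `http://localhost:9200`, usage would look like `for hit in scroll_all("http://localhost:9200", "myIndex", {"match": {"title": "elasticsearch"}}): ...`. A generator is used so that hundreds of thousands of hits do not all have to be held in memory at once.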

UPDATE:

Please refer to the following answer which is more accurate regarding the best solution for deep pagination: Elastic Search - Scroll behavior

133

answered Sep 20 '22 07:09

Val

Note that from + size cannot exceed the index.max_result_window index setting, which defaults to 10,000.

https://www.elastic.co/guide/en/elasticsearch/reference/6.8/search-request-from-size.html
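To make the from + size limit concrete, here is a small sketch of the arithmetic. The helper names are made up for illustration and nothing here talks to Elasticsearch; it only checks whether a given page would fall inside the result window.

```python
# Default value of index.max_result_window, as stated above.
DEFAULT_MAX_RESULT_WINDOW = 10_000


def fits_result_window(from_, size, max_result_window=DEFAULT_MAX_RESULT_WINDOW):
    """Return True if a from/size request stays within the result window."""
    return from_ + size <= max_result_window


def full_pages_reachable(size, max_result_window=DEFAULT_MAX_RESULT_WINDOW):
    """Number of full pages of `size` hits reachable with from/size paging."""
    return max_result_window // size
```

For instance, `fits_result_window(9990, 10)` holds, while `fits_result_window(9991, 10)` does not, because the last requested hit would be number 10,001. Even with the window raised to 100,000 as in the question, a single request for 650,000 documents does not fit.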

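So you have two approaches here:

1. Add "track_total_hits": true to your query. This makes the response report the exact total number of matching documents; it does not by itself return more documents.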

GET index/_search
{
    "size": 1,
    "track_total_hits": true
}

2. Use the Scroll API. You then can't page with from and size in the ordinary way; each further page is requested with the scroll ID instead.

https://www.elastic.co/guide/en/elasticsearch/reference/6.8/search-request-scroll.html

For example:

POST /twitter/_search?scroll=1m
{
    "size": 100,
    "query": {
        "match" : {
            "title" : "elasticsearch"
        }
    }
}
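For the first approach, the total can be read from the search response once it has been parsed into a dictionary. The helper below is a sketch with a made-up name, `total_hits`. It assumes the response shape where `hits.total` is a plain integer in 6.x and an object with `value` and `relation` from 7.0 onwards, so it handles both.

```python
def total_hits(response):
    """Return (count, is_exact) from a parsed search response.

    Handles both shapes of hits.total: a plain integer (6.x and earlier)
    and an object with "value" and "relation" (7.0 and later).
    """
    total = response["hits"]["total"]
    if isinstance(total, dict):
        # relation is "eq" for an exact count, "gte" for a lower bound.
        return total["value"], total["relation"] == "eq"
    return total, True
```

Knowing the exact total up front is mainly useful for deciding whether from/size paging is enough or whether a scroll is needed to reach every document.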

answered Sep 23 '22 07:09

Eran Peled
