I have a small database in Elasticsearch and for testing purposes would like to pull all records back. I am attempting to use a URL of the form...
http://localhost:9200/foo/_search?pretty=true&q={'matchAll':{''}}
Can someone give me the URL you would use to accomplish this, please?
Elasticsearch gets significantly slower if you simply pass some big number as size. One way to get all documents is to use scan and scroll IDs: each response contains a _scroll_id, which you send back to fetch the next chunk of 100. This answer needs updating, though: search_type=scan is now deprecated.
You can specify a size parameter (which defaults to 10) to set the number of results returned. It is capped at 10000; to retrieve larger volumes of data, use a scroll query instead.
If a search request matches more than ten hits, Elasticsearch returns only the first ten by default. To retrieve more or fewer hits, add a size parameter to the search request body.
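To make the request-body variant concrete, here is a minimal Python sketch that builds a match_all body with an explicit size. The helper name build_search_body and the constant MAX_RESULT_WINDOW are made up for illustration; the 10000 cap is the limit mentioned above and may be configured differently on your index.

```python
import json

# Assumption: the cap discussed above (10000) is in effect for this index.
MAX_RESULT_WINDOW = 10_000


def build_search_body(size: int = 10) -> str:
    """Return a JSON body that matches every document and asks for `size` hits."""
    if not 0 <= size <= MAX_RESULT_WINDOW:
        raise ValueError(
            f"size must be between 0 and {MAX_RESULT_WINDOW}; use a scroll query for more"
        )
    return json.dumps({"query": {"match_all": {}}, "size": size})


# Send this string as the body of a request to http://localhost:9200/foo/_search
body = build_search_body(500)
print(body)
```

This only builds the body; sending it is up to whatever HTTP client you already use.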
I think Lucene query syntax is supported, so:
http://localhost:9200/foo/_search?pretty=true&q=*:*
size defaults to 10, so you may also need &size=BIGNUMBER to get more than 10 items (where BIGNUMBER is a number you believe is bigger than your dataset).
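If you would rather build that URL in code than by hand, something like the following should work. It is only a sketch: build_search_url is a hypothetical helper, and the host, index name and size are the example values from this thread.

```python
from urllib.parse import urlencode


def build_search_url(base: str, index: str, size: int) -> str:
    """Build a URI search that matches everything (q=*:*) and asks for `size` hits."""
    params = {"pretty": "true", "q": "*:*", "size": size}
    # safe="*:" keeps the Lucene match-all query readable instead of percent-encoding it.
    return f"{base}/{index}/_search?{urlencode(params, safe='*:')}"


url = build_search_url("http://localhost:9200", "foo", 1000)
print(url)
```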
BUT the Elasticsearch documentation suggests using the scan search type for large result sets.
E.g.:
curl -XGET 'localhost:9200/foo/_search?search_type=scan&scroll=10m&size=50' -d ' { "query" : { "match_all" : {} } }'
and then keep requesting as the documentation linked above suggests.
EDIT: scan is deprecated in 2.1.0. scan does not provide any benefits over a regular scroll request sorted by _doc. (link to Elastic docs, spotted by @christophe-roussy)
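For anyone landing here after the deprecation, a regular scroll sorted by _doc could look roughly like this in Python. Treat it as a sketch under assumptions: scroll_all and http_post are names invented for this example, the 1m keep-alive and page size of 100 are arbitrary, and the JSON-body form of the scroll request is what recent Elasticsearch versions accept (older ones may differ).

```python
import json
from typing import Callable, Iterator
from urllib.request import Request, urlopen

Post = Callable[[str, dict], dict]


def http_post(url: str, body: dict) -> dict:
    """POST a JSON body and decode the JSON response (standard library only)."""
    req = Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(req) as resp:
        return json.load(resp)


def scroll_all(base: str, index: str, page_size: int = 100,
               keep_alive: str = "1m", post: Post = http_post) -> Iterator[dict]:
    """Yield every hit in `index`, fetching one scroll page at a time."""
    # First request opens the scroll context; sorting by _doc replaces search_type=scan.
    page = post(
        f"{base}/{index}/_search?scroll={keep_alive}",
        {"size": page_size, "sort": ["_doc"], "query": {"match_all": {}}},
    )
    # An empty page of hits means the scroll is exhausted.
    while page["hits"]["hits"]:
        yield from page["hits"]["hits"]
        # Follow-up requests send back the latest _scroll_id to get the next page.
        page = post(
            f"{base}/_search/scroll",
            {"scroll": keep_alive, "scroll_id": page["_scroll_id"]},
        )


# Example (needs a running cluster):
#   for hit in scroll_all("http://localhost:9200", "foo"):
#       print(hit["_id"])
```

The post argument is there only so the HTTP call can be swapped out (for tests or for a different client); the default talks to the server directly.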