
Efficient way to retrieve all _ids in ElasticSearch

What is the fastest way to get all _ids of a certain index from ElasticSearch? Is it possible with a simple query? One of my indices has around 20,000 documents.

Mahoni asked Jul 05 '13 21:07




1 Answer

Edit: Please read @Aleck Landgraf's Answer, too

You just want the elasticsearch-internal _id field? Or an id field from within your documents?

For the former, try

    curl -H 'Content-Type: application/json' 'http://localhost:9200/index/type/_search?pretty=true' -d '
    {
        "query" : {
            "match_all" : {}
        },
        "stored_fields": []
    }'

Note (2017 update): The post originally used "fields": [], but that parameter has since been renamed to stored_fields.

The result will contain only the "metadata" of your documents:

    {
      "took" : 7,
      "timed_out" : false,
      "_shards" : {
        "total" : 5,
        "successful" : 5,
        "failed" : 0
      },
      "hits" : {
        "total" : 4,
        "max_score" : 1.0,
        "hits" : [ {
          "_index" : "index",
          "_type" : "type",
          "_id" : "36",
          "_score" : 1.0
        }, {
          "_index" : "index",
          "_type" : "type",
          "_id" : "38",
          "_score" : 1.0
        }, {
          "_index" : "index",
          "_type" : "type",
          "_id" : "39",
          "_score" : 1.0
        }, {
          "_index" : "index",
          "_type" : "type",
          "_id" : "34",
          "_score" : 1.0
        } ]
      }
    }
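If all you need from that response is the list of ids, they sit under hits.hits[]._id. A minimal Python sketch of pulling them out, assuming the response has the shape shown above (the helper name extract_ids and the trimmed sample are only for illustration):

```python
import json

# Trimmed copy of the sample response above; only the parts we read are kept.
response_text = """
{
  "hits": {
    "total": 4,
    "hits": [
      {"_index": "index", "_type": "type", "_id": "36", "_score": 1.0},
      {"_index": "index", "_type": "type", "_id": "38", "_score": 1.0},
      {"_index": "index", "_type": "type", "_id": "39", "_score": 1.0},
      {"_index": "index", "_type": "type", "_id": "34", "_score": 1.0}
    ]
  }
}
"""


def extract_ids(response):
    """Return the _id of every hit in a decoded search response."""
    return [hit["_id"] for hit in response["hits"]["hits"]]


ids = extract_ids(json.loads(response_text))
print(ids)  # ['36', '38', '39', '34']
```

Note that the ids come back as strings, in the order of the hits.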

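For the latter, if you want to include a field from your document, simply add it to the fields array. (On versions where the parameter was renamed, the equivalent is stored_fields, which only returns fields that are mapped as stored.)

    curl -H 'Content-Type: application/json' 'http://localhost:9200/index/type/_search?pretty=true' -d '
    {
        "query" : {
            "match_all" : {}
        },
        "fields": ["document_field_to_be_returned"]
    }'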
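Keep in mind that a search returns only the first page of hits (10 by default), so for an index with around 20,000 documents you would still need to page through the results. One way to do that is the scroll API. The following is a rough Python sketch rather than a definitive implementation: it assumes a version that understands stored_fields and accepts a JSON body on /_search/scroll, and the names http_post_json and iter_ids, the page size and the 1m scroll timeout are illustrative choices, not something from the original post.

```python
import json
from urllib import request


def http_post_json(url, body):
    """POST a JSON body and return the decoded JSON response."""
    req = request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def iter_ids(base_url, index, page_size=1000, post=http_post_json):
    """Yield every _id in `index`, fetching one scroll page at a time.

    `post` is injectable so the paging logic can be exercised without a
    running cluster (hypothetical design choice for this sketch).
    """
    body = {
        "query": {"match_all": {}},
        "stored_fields": [],  # metadata only, as in the request above
        "size": page_size,    # hits per scroll page, not the overall total
    }
    page = post(f"{base_url}/{index}/_search?scroll=1m", body)
    # An empty page of hits means the scroll is exhausted.
    while page["hits"]["hits"]:
        for hit in page["hits"]["hits"]:
            yield hit["_id"]
        page = post(
            f"{base_url}/_search/scroll",
            {"scroll": "1m", "scroll_id": page["_scroll_id"]},
        )
```

Against a local node this could be used as all_ids = list(iter_ids("http://localhost:9200", "index")). Because it is a generator, the ids can also be consumed page by page instead of being held in memory all at once.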
Thorsten answered Nov 13 '22 14:11