This may be a beginner question, but I have some doubts about the size parameter. According to the Elasticsearch documentation, the maximum value of size is 10000. I would like to validate my understanding below:
Sample Query:
GET testindex-2016.04.14/_search
{
  "size": 10000,
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "type": "TEST"
          }
        }
      ]
    }
  },
  "aggs": {
    "testAggs": {
      "terms": {
        "field": "type",
        "size": 0
      },
      "aggs": {
        "innerAggs": {
          "max": {
            "field": "Value"
          }
        }
      }
    }
  }
}
Response:
{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 26949,
    "max_score": 0,
    "hits": [
      .....10000 records
    ]
  },
  "aggregations": {
    "testAggs": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "TEST",
          "doc_count": 26949,
          "innerAggs": {
            "value": 150
          }
        }
      ]
    }
  }
}
1 - If size is set to 10000 and 200000 records in Elasticsearch satisfy the query, the result will contain 10000 hits, but "total" will be 200000. Is that correct?
2 - How many records will be available for further aggregations: the number of hits or the "total" value?
3 - If we set "size": 0, we get 0 hits, but how many records will be available for aggregations?
Please clarify my understanding, and comment if anything in the question is unclear. Thanks.
If you expect more than 10,000 results from an Elasticsearch query, a single request cannot return them all, so you need additional requests to fetch the following batches. To get around this limit, you can use the search_after parameter to specify the record after which the next page should start.
The default size for a query is 10. You can change it with the size parameter in the search request. As with retrieving more documents than you need, fetching fields you don't use also slows down your search, for the same reason: Elasticsearch has to construct and transfer more data to the client.
Optimizing your queries is one way to improve Elasticsearch's search performance. A query that collects more documents than needed will slow down your search. The size parameter determines how many documents Elasticsearch returns in a response.
The scroll parameter (for example, scroll=1m) tells Elasticsearch how long to keep the search context open, here for another minute. The size parameter sets the maximum number of hits returned with each batch of results. Each call to the scroll API returns the next batch of results until there are no more results left to return, i.e. the hits array is empty.
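The loop described above could look roughly like the sketch below. `send` is a hypothetical callable standing in for whatever HTTP client is used; the sketch assumes it takes a method, a path, and a JSON body and returns the parsed response. The `_search?scroll=1m` and `_search/scroll` endpoints and the `_scroll_id` response field belong to the scroll API; `scroll_all` and `make_fake_send` are illustrative names only.

```python
def scroll_all(send, index, query, size=1000, keep_alive="1m"):
    """Yield every hit by following the scroll until a batch comes back empty."""
    resp = send("POST", f"/{index}/_search?scroll={keep_alive}",
                {"size": size, "query": query})
    while resp["hits"]["hits"]:
        yield from resp["hits"]["hits"]
        # Each follow-up call renews the search context for another keep_alive period.
        resp = send("POST", "/_search/scroll",
                    {"scroll": keep_alive, "scroll_id": resp["_scroll_id"]})


def make_fake_send(total_docs):
    """Fake transport for demonstration only: serves total_docs documents in batches."""
    state = {"pos": 0, "size": 2}

    def fake_send(method, path, body):
        # Only the initial search request carries a size; remember it for later batches.
        state["size"] = body.get("size", state["size"])
        end = min(state["pos"] + state["size"], total_docs)
        batch = [{"_id": str(i)} for i in range(state["pos"], end)]
        state["pos"] = end
        return {"_scroll_id": "fake-id", "hits": {"hits": batch}}

    return fake_send


hits = list(scroll_all(make_fake_send(5), "testindex-2016.04.14",
                       {"match_all": {}}, size=2))
print(len(hits))
```

With a real cluster, `send` would be replaced by an actual HTTP call, and the scroll context should be released once the loop finishes.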
By default, you cannot use from and size to page through more than 10,000 hits. This limit is a safeguard set by the index.max_result_window index setting. If you need to page through more than 10,000 hits, use the search_after parameter instead. Elasticsearch uses Lucene’s internal doc IDs as tie-breakers.
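As a rough sketch of how search_after paging fits together, the helper below builds the request body for the next page from the last hit of the previous page. `next_page_body` is a hypothetical helper, and the sort fields (`timestamp`, `id`) are assumed purely for illustration. The only Elasticsearch behavior relied on is that each hit carries a `sort` array when the request specifies `sort`, and that `search_after` accepts that array.

```python
import copy


def next_page_body(body, last_hit):
    """Return a copy of `body` that resumes after `last_hit` (hypothetical helper)."""
    if "sort" not in body:
        raise ValueError("search_after requires an explicit sort")
    nxt = copy.deepcopy(body)
    # search_after is not meant to be combined with a positive `from`.
    nxt.pop("from", None)
    # Sort values of the last hit on the previous page mark where to resume.
    nxt["search_after"] = last_hit["sort"]
    return nxt


first_page = {
    "size": 10000,
    "query": {"match": {"type": "TEST"}},
    # Assumed field names; the last sort field should be unique so it can break ties.
    "sort": [{"timestamp": "asc"}, {"id": "asc"}],
}
# Shape of the last hit from the first response (values made up for the example).
last_hit = {"_id": "abc", "sort": [1460592000000, "abc"]}

second_page = next_page_body(first_page, last_hit)
print(second_page["search_after"])
```

The original body is left untouched, so the same `first_page` dictionary can be reused to derive each subsequent page from the most recent last hit.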
1 - The size parameter only tells how many hits should be returned in the response, so if you specify size: 10000 and 200000 records match, you'll get 10000 matching documents in the result hits, but total will state 200000.
2 - Aggregations are always computed on the full set of results, so the total value.
3 - See 2.
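The answer above can be imitated with a small in-memory sketch. This is not Elasticsearch code: `toy_search` and the sample documents are hypothetical stand-ins that only mimic how size trims the returned hits while total and the aggregation still cover every matching document.

```python
from collections import defaultdict


def toy_search(docs, doc_type, size):
    """Hypothetical stand-in for a search with a terms + max aggregation."""
    matching = [d for d in docs if d["type"] == doc_type]

    # Aggregations see every matching document, regardless of size.
    buckets = defaultdict(lambda: {"doc_count": 0, "max_value": None})
    for d in matching:
        bucket = buckets[d["type"]]
        bucket["doc_count"] += 1
        if bucket["max_value"] is None or d["Value"] > bucket["max_value"]:
            bucket["max_value"] = d["Value"]

    return {
        "total": len(matching),    # all matches
        "hits": matching[:size],   # only the first `size` matches
        "buckets": dict(buckets),
    }


# Made-up sample data: 150 matching documents plus one that does not match.
docs = [{"type": "TEST", "Value": v} for v in range(1, 151)]
docs.append({"type": "OTHER", "Value": 999})

some_hits = toy_search(docs, "TEST", size=10)
no_hits = toy_search(docs, "TEST", size=0)
print(len(some_hits["hits"]), some_hits["total"], some_hits["buckets"]["TEST"])
print(len(no_hits["hits"]), no_hits["total"], no_hits["buckets"]["TEST"])
```

In both calls the bucket reports the same doc_count and maximum, which is the point of the answer: size: 0 only suppresses the hits, not the aggregation input.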