Let's say I have a situation similar to the one explained here: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-post-filter.html
Before I stumbled upon this article, I had been using filter instead of post_filter for this kind of scenario, and it produced the same output as post_filter.
My question is: Are they the same thing? If not, which one is the recommended and more efficient method to use and why?
The post_filter is applied to the search hits at the very end of a search request, after aggregations have already been calculated.
Frequently used filters will be cached automatically by Elasticsearch, to speed up performance. Filter context is in effect whenever a query clause is passed to a filter parameter, such as the filter or must_not parameters in the bool query, the filter parameter in the constant_score query, or the filter aggregation.
Term query. Returns documents that contain an exact term in a provided field. You can use the term query to find documents based on a precise value such as a price, a product ID, or a username.
As far as search hits are concerned, they are the same thing, i.e. the hits you get will be correctly filtered according to either the filter in your filtered query or the filter in your post_filter.
However, as far as aggregations are concerned, the end result will not be the same. The difference between the two boils down to which document set the aggregations are computed on.
If your filter is in a filtered query, then your aggregations will be computed on the document set selected by the query(ies) and the filter(s) in your filtered query, i.e. the same set of documents that you will get in the response.
If your filter is in a post_filter, then your aggregations will be computed on the document set selected by your various query(ies). Once aggregations have been computed on that document set, the latter is further filtered by the filter(s) in your post_filter before returning the matching documents.
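As a rough illustration of the two placements, here are two request bodies written as Python dictionaries. The field names (`brand`, `color`), the values, and the aggregation name `colors` are assumptions chosen for illustration only. The first body uses the `filter` clause of the `bool` query, which takes the place of the `filtered` query in newer Elasticsearch versions.

```python
import json

# Hypothetical color filter, shared by both requests.
color_filter = {"term": {"color": "red"}}

# Variant 1: the filter is part of the query, so it narrows both the hits
# and the document set the aggregation is computed on.
filter_in_query = {
    "query": {
        "bool": {
            "must": [{"term": {"brand": "gucci"}}],
            "filter": [color_filter],
        }
    },
    "aggs": {"colors": {"terms": {"field": "color"}}},
}

# Variant 2: the same filter moved to post_filter, so the aggregation sees
# every document matched by the query and only the hits are narrowed.
post_filter_request = {
    "query": {"bool": {"must": [{"term": {"brand": "gucci"}}]}},
    "aggs": {"colors": {"terms": {"field": "color"}}},
    "post_filter": color_filter,
}

print(json.dumps(post_filter_request, indent=2))
```

Either dictionary could be sent as the JSON body of a search request; the only difference between them is where the color filter sits.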
To sum it up, a filtered query affects both search results and aggregations, whereas post_filter only affects the search results but NOT the aggregations.
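The difference can also be mimicked without a cluster. The following is a toy in-memory sketch, not how Elasticsearch is implemented: the documents, the field names, and the helper functions are all made up for illustration, and the "aggregation" is a plain count of colors.

```python
from collections import Counter

# Toy "index"; documents and field names are hypothetical.
docs = [
    {"brand": "gucci", "color": "red"},
    {"brand": "gucci", "color": "blue"},
    {"brand": "gucci", "color": "red"},
    {"brand": "dior", "color": "red"},
]

def matches_query(doc):
    # Stands in for the main query.
    return doc["brand"] == "gucci"

def matches_filter(doc):
    # Stands in for the filter under discussion.
    return doc["color"] == "red"

def count_colors(documents):
    # Stands in for a terms aggregation on the color field.
    return dict(Counter(d["color"] for d in documents))

def search_filter_in_query(documents):
    # Filter applied together with the query: aggregations see the filtered set.
    selected = [d for d in documents if matches_query(d) and matches_filter(d)]
    return selected, count_colors(selected)

def search_post_filter(documents):
    # Aggregations are computed first, then the filter trims the hits.
    selected = [d for d in documents if matches_query(d)]
    aggs = count_colors(selected)
    hits = [d for d in selected if matches_filter(d)]
    return hits, aggs

hits_a, aggs_a = search_filter_in_query(docs)
hits_b, aggs_b = search_post_filter(docs)
print(hits_a == hits_b)  # same hits in both variants
print(aggs_a)            # counts over the filtered set only
print(aggs_b)            # counts over everything the query matched
```

In this sketch both variants return the same two hits, but only the post-filter variant still reports the blue document in its color counts.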
Another important difference between filter and post_filter that wasn't mentioned in any of the answers: performance.
TL;DR: Don't use post_filter unless you actually need it for aggregations.
From The Definitive Guide:
WARNING: Performance consideration
Use a post_filter only if you need to differentially filter search results and aggregations. Sometimes people will use post_filter for regular searches. Don’t do this! The nature of the post_filter means it runs after the query, so any performance benefit of filtering (such as caches) is lost completely. The post_filter should be used only in combination with aggregations, and only when you need differential filtering.