
How can I aggregate filtered nested documents in ElasticSearch?

Suppose I have an index with nested documents that look like this:

{
    "id" : 1234,
    "cars" : [{
            "id" : 987,
            "name" : "Volkswagen"
        }, {
            "id" : 988,
            "name" : "Tesla"
        }
    ]
}

I now want to get a count aggregation of the "car" documents that match certain criteria, e.g. that match a search query. My initial attempt was the following query:

{
  "query" : {
    "nested" : {
      "path" : "cars",
      "query" : {
        "query_string" : {
          "fields" : ["cars.name"],
          "query" : "Tes*"
        }
      }
    }
  },
  "aggregations" : {
    "cars" :{
      "nested" : {
        "path" : "cars"
      },
      "aggs" : {
        "cars" : {
          "terms" : {
            "field" : "cars.id"
          }
        }
      }
    }
  }
}

I was hoping to get an aggregation result containing only the ids of cars whose names begin with "Tes". Instead, the aggregation covers every car in any top-level document that contains at least one matching nested document. That is, in the above example "Volkswagen" is also counted, because its top-level document also contains a car that does match.
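
To make the behaviour concrete, here is a small plain-Python simulation of what I believe is happening. It does not talk to Elasticsearch: the `matches` helper is a hypothetical stand-in for the `Tes*` query, and the second document is made up for illustration.

```python
from collections import Counter

# Sample data: the first document is the one from the question,
# the second is a hypothetical document with no matching car.
docs = [
    {"id": 1234, "cars": [{"id": 987, "name": "Volkswagen"},
                          {"id": 988, "name": "Tesla"}]},
    {"id": 1235, "cars": [{"id": 989, "name": "Toyota"}]},
]


def matches(car, prefix="Tes"):
    """Hypothetical stand-in for the query_string "Tes*" on cars.name."""
    return car["name"].startswith(prefix)


# The nested query selects top-level documents that contain
# at least one matching car.
hits = [doc for doc in docs if any(matches(car) for car in doc["cars"])]

# The nested + terms aggregation then visits every car of every hit,
# not just the cars that matched the query.
unfiltered = Counter(car["id"] for doc in hits for car in doc["cars"])

print(dict(unfiltered))  # {987: 1, 988: 1}
```

Car 987 ("Volkswagen") shows up in the counts even though it does not match, while car 989 is excluded only because its whole parent document was filtered out.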

How can I get an aggregation of just the matching nested documents?

Tiddo asked Mar 05 '15 10:03





1 Answer

In the meantime I've figured it out: to achieve this, a filter aggregation should be wrapped around the terms aggregation, inside the nested aggregation, like so:

  "aggregations" : {
    "cars" :{
      "nested" : {
        "path" : "cars"
      },
      "aggs" : {
        "cars-filter" : {
          "filter" : {
            "query" : {
              "query_string" : {
                "fields" : ["cars.name"],
                "query" : "Tes*"
              }
            }  
          },
          "aggs" : {
            "cars" : {
              "terms" : {
                "field" : "cars.id"
              }
            }
          }
        }
      }
    }
  }
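
For reference, here is a minimal Python sketch that assembles the full request body, reusing the same query for both the top-level search and the filter aggregation. The function name `filtered_nested_terms_agg` and its parameters are my own, not part of any client library. Note one assumption: as far as I know, newer Elasticsearch releases no longer accept the `query` wrapper inside `filter`, so this sketch places the query directly under `filter`; on the version used above, keep the wrapper.

```python
import json


def filtered_nested_terms_agg(path, filter_query, terms_field):
    """Build a nested -> filter -> terms aggregation (hypothetical helper).

    Assumes the query can sit directly under "filter"; older releases
    may need it wrapped as {"query": filter_query} instead.
    """
    return {
        "cars": {
            "nested": {"path": path},
            "aggs": {
                "cars-filter": {
                    "filter": filter_query,
                    "aggs": {
                        "cars": {"terms": {"field": terms_field}}
                    },
                }
            },
        }
    }


# The same query is used to select parent documents and to filter
# the nested documents that reach the terms aggregation.
name_query = {"query_string": {"fields": ["cars.name"], "query": "Tes*"}}

body = {
    "query": {"nested": {"path": "cars", "query": name_query}},
    "aggregations": filtered_nested_terms_agg("cars", name_query, "cars.id"),
}

print(json.dumps(body, indent=2))
```

With this shape, the per-car counts should appear in the response under `aggregations.cars.cars-filter.cars.buckets`, since each sub-aggregation is nested under the name of its parent.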
Tiddo answered Oct 02 '22 07:10
