Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

ElasticSearch aggregate nested fields as part of parent document

I have index with Product entities and nested Variation entities. Product entity consist of Id, Title and nested variations. Variation entity consist of Color, Size and Price fields. I need to aggregate search result by Color, Size and Price fields to get number of products for each color, size and price groups. If I use nested aggregation for these fields I get correct buckes but the number of documents in buckets is number of Variation entities per bucket. But I need to get number of Product entities (root documents) per bucket.

For example, the first product has variation (red, small, $10), (green, small, $10), (red, medium, $11) and the second product has variations (green, medium, $15). Nested aggregation returns 2 for red and 2 for small because 2 variations has red color and small size. But I need number of products (root entities) per bucket which should be 1 for red and 1 for small.

I also can't use children documents instead of nested documents because of other requirements.

How to compose the query to get this result?

Here is the mapping:

{
  "product": {
    "properties": {
      "id": {
        "type": "long"
      },
      "title": {
        "type": "string"
      },
      "brand": {
        "type": "string"
      },
      "variations": {
        "type": "nested",
        "properties": {
          "id": {
            "type": "long"
          },
          "colour": {
            "type": "string"
          },
          "size": {
            "type": "string"
          },
          "price": {
            "type": "double"
          }
        }
      },
      "location": {
        "type": "geo_point"
      }
    }
  }
}

And here is a query

{
  "aggs": {
    "Variations": {
      "nested": {
        "path": "variations"
      },
      "aggs": {
        "Colous": {
          "terms": {
            "field": "variations.colour"
          }
        },
        "Sizes": {
          "terms": {
            "field": "variations.size"
          }
        }
      }
    },
    "Brands": {
      "terms": {
        "field": "brand"
      }
    }
  },
  "query": {
    "match_all": {}
  }
}

The Brand aggregation works well because it gets number of root documents per group but nested aggregations return number of nested documents instead of number of root documents.

like image 709
Random Avatar asked Oct 09 '15 14:10

Random


1 Answers

You've tackled the problem the right way. Now you can simply use the reverse_nested aggregation in order to "join back" to the root product and get the count of matching products for each for your variations.

{
  "aggs": {
    "Variations": {
      "nested": {
        "path": "variations"
      },
      "aggs": {
        "Colous": {
          "terms": {
            "field": "variations.colour"
          },
          "aggs": {
            "product_count": {            <--- add this reverse nested agg
              "reverse_nested": {}
            }
          }
        },
        "Sizes": {
          "terms": {
            "field": "variations.size"
          },
          "aggs": {
            "product_count": {            <--- add this reverse nested agg
              "reverse_nested": {}
            }
          }
        }
      }
    },
    "Brands": {
      "terms": {
        "field": "brand"
      }
    }
  },
  "query": {
    "match_all": {}
  }
}

In the response, you'll see that:

  • 2 products are matching colour: green
  • 1 product is matching colour: red
  • 2 products are matching size: medium
  • 1 product is matching size: small
like image 56
Val Avatar answered Sep 28 '22 15:09

Val