
What is the difference between count and total_count on an Elasticsearch range facet?

I'm doing a search using a range facet:

{
    "query": {
        "match_all": {}
    },
    "facets": {
        "prices": {
            "range": {
                "field": "product_price",
                "ranges": [
                    {"from": 0, "to": 200},
                    {"from": 200, "to": 400},
                    {"from": 400, "to": 600},
                    {"from": 600, "to": 800},
                    {"from": 800}
                ]
            }
        }
    }
}

The response contains the ranges, as expected:

[
  {
    "from": 0.0,
    "to": 200.0,
    "count": 0,
    "total_count": 0,
    "total": 0.0,
    "mean": 0.0
  },
  {
    "from": 200.0,
    "to": 400.0,
    "count": 1,
    "min": 399.0,
    "max": 399.0,
    "total_count": 1,
    "total": 399.0,
    "mean": 399.0
  },
  {
    "from": 400.0,
    "to": 600.0,
    "count": 5,
    "min": 499.0,
    "max": 599.0,
    "total_count": 5,
    "total": 2886.0,
    "mean": 577.2
  },
  {
    "from": 600.0,
    "to": 800.0,
    "count": 3,
    "min": 690.0,
    "max": 790.0,
    "total_count": 3,
    "total": 2179.0,
    "mean": 726.3333333333334
  },
  {
    "from": 800.0,
    "count": 2,
    "min": 899.0,
    "max": 990.0,
    "total_count": 2,
    "total": 1889.0,
    "mean": 944.5
  }
]

In every range, count and total_count are the same. Does anybody know what the difference between them is? Which one should I use?

Lucas Cavalcanti asked May 27 '13 19:05


1 Answer

Very good question! This part is tricky since you see the same values most of the time. The difference shows up when you use key_field and value_field: you can then compute the ranges on one field and the aggregated data (min, max, total_count, total and mean) on another field. For instance, you could compute the ranges on a popularity field and the aggregated data on a price field, to see what kind of prices you have for every range of popularity; maybe people like cheap products, or maybe not?

Let's imagine your products can have multiple prices, for example a different price per country. This is when count differs from total_count. Let's have a look at an example.

Let's index a couple of documents that contain a popularity field and a price field, which can have multiple values:

{
  "popularity": 50,
  "price": [28,30,32]
}

and

{
    "popularity": 120,
    "price": [50,54]
}

Let's now run the following search request, which builds a range facet using the popularity field as key and the price field as value:

{
    "query": {
        "match_all": {}
    },
    "facets": {
        "popularity_prices": {
            "range": {
                "key_field": "popularity",
                "value_field": "price",
                "ranges": [
                    {"to": 100},
                    {"from": 100}
                ]
            }
        }
    }
}

Here is the resulting facet:

{
    "popularity_prices": {
      "_type": "range",
      "ranges": [
        {
          "to": 100,
          "count": 1,
          "min": 28,
          "max": 32,
          "total_count": 3,
          "total": 90,
          "mean": 30
        },
        {
          "from": 100,
          "count": 1,
          "min": 50,
          "max": 54,
          "total_count": 2,
          "total": 104,
          "mean": 52
        }
      ]
    }
}

It should be clearer now what total_count is. It relates to the value_field (price): 3 different price values fall into the first range, but they all come from the same document. count, on the other hand, is the number of documents that fall into the range.
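To make the distinction concrete, here is a minimal Python sketch that mimics the numbers above for the two example documents. It is only a toy model, not Elasticsearch code: the function name range_facet is made up for illustration, and it assumes that "from" is inclusive and "to" is exclusive.

```python
def range_facet(docs, key_field, value_field, ranges):
    """Toy model of a key_field/value_field range facet (hypothetical helper)."""
    results = []
    for r in ranges:
        lo = r.get("from", float("-inf"))
        hi = r.get("to", float("inf"))
        count = 0    # number of documents whose key falls into the range
        values = []  # every value of value_field from those documents
        for doc in docs:
            # Assumption: "from" is inclusive and "to" is exclusive.
            if lo <= doc[key_field] < hi:
                count += 1
                v = doc[value_field]
                values.extend(v if isinstance(v, list) else [v])
        entry = dict(r, count=count, total_count=len(values), total=sum(values))
        if values:
            entry.update(min=min(values), max=max(values),
                         mean=sum(values) / len(values))
        results.append(entry)
    return results


docs = [
    {"popularity": 50, "price": [28, 30, 32]},
    {"popularity": 120, "price": [50, 54]},
]
facet = range_facet(docs, "popularity", "price", [{"to": 100}, {"from": 100}])
for entry in facet:
    print(entry["count"], entry["total_count"])
# prints "1 3", then "1 2"
```

Each range matches one document, so count is 1 in both, while total_count follows the number of price values those documents carry.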

Now that we understand that count is about documents while total_count is about field values, we would expect the same behaviour with a normal range facet when the field holds multiple values, right? Unfortunately that doesn't currently happen: the range facet considers only the first value of each field. I'm not sure whether it's a bug. As a result, with a normal range facet count and total_count are always the same.
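The first-value-only behaviour described here can be sketched in Python as follows. This is a model of what the answer reports, not something checked against the Elasticsearch source; the helper name, the sample documents and the inclusive "from" / exclusive "to" bounds are all assumptions made for illustration.

```python
def first_value_range_facet(docs, field, ranges):
    """Toy model of a single-field range facet that reads only the first value."""
    results = []
    for r in ranges:
        lo = r.get("from", float("-inf"))
        hi = r.get("to", float("inf"))
        count = 0    # documents falling into the range
        values = []  # values contributing to the aggregated data
        for doc in docs:
            v = doc[field]
            first = v[0] if isinstance(v, list) else v  # later values are ignored
            # Assumption: "from" is inclusive and "to" is exclusive.
            if lo <= first < hi:
                count += 1
                values.append(first)
        results.append(dict(r, count=count, total_count=len(values),
                            total=sum(values)))
    return results


# Hypothetical documents: the first one carries two prices.
docs = [
    {"product_price": [399, 450]},
    {"product_price": [499]},
]
ranges = [{"from": 200, "to": 400}, {"from": 400, "to": 600}]
single = first_value_range_facet(docs, "product_price", ranges)
for entry in single:
    print(entry["count"], entry["total_count"])
# prints "1 1" twice: the second price (450) of the first document is never counted
```

Because each document contributes at most one value, count and total_count cannot diverge in this model, which matches the response in the question.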

javanna answered Oct 18 '22 12:10