Count distinct values using elasticsearch

I am learning Elasticsearch and would like to count distinct values. So far I can count values, but not distinct ones.

Here is the sample data:

curl http://localhost:9200/store/item/ -XPOST -d '{
  "RestaurantId": 2,
  "RestaurantName": "Restaurant Brian",
  "DateTime": "2013-08-16T15:13:47.4833748+01:00"
}'

curl http://localhost:9200/store/item/ -XPOST -d '{
  "RestaurantId": 1,
  "RestaurantName": "Restaurant Cecil",
  "DateTime": "2013-08-16T15:13:47.4833748+01:00"
}'

curl http://localhost:9200/store/item/ -XPOST -d '{
  "RestaurantId": 1,
  "RestaurantName": "Restaurant Cecil",
  "DateTime": "2013-08-16T15:13:47.4833748+01:00"
}'

And what I tried so far:

curl -XPOST "http://localhost:9200/store/item/_search" -d '{
  "size": 0,
  "aggs": {
    "item": {
      "terms": {
        "field": "RestaurantName"
      }
    }
  }
}'

Output:

{
  "took": 0,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 0.0,
    "hits": []
  },
  "aggregations": {
    "item": {
      "buckets": [
        {
          "key": "restaurant",
          "doc_count": 3
        },
        {
          "key": "cecil",
          "doc_count": 2
        },
        {
          "key": "brian",
          "doc_count": 1
        }
      ]
    }
  }
}

How can I get the count of cecil as 1 instead of 2?

asked Jul 09 '14 16:07 by Developer


People also ask

How do you count unique values in Kibana?

You can use a Metric visualization and just use the "count" metric for this. There are many ways to do this. In most visualizations you can use "Unique Count" on the personId field as the metric, and use a terms aggregation on the organizationId field for the X-axis (or split rows in a table visualization).

What is cardinality aggregation?

A single-value metrics aggregation that calculates an approximate count of distinct values. Values can be extracted either from specific fields in the document or generated by a script.


2 Answers

You have to use the cardinality aggregation, as mentioned by @coder. It is described in the docs.

$ curl -XGET "http://localhost:9200/store/item/_search" -d '
{
  "aggs": {
    "restaurant_count": {
      "cardinality": {
        "field": "RestaurantName",
        "precision_threshold": 100,
        "rehash": false
      }
    }
  }
}'

This worked for me ...
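
One caveat, offered as a sketch rather than a definitive fix: the terms output in the question shows keys such as "restaurant" and "cecil", which suggests RestaurantName is analyzed into separate tokens, so a cardinality aggregation on that field would likely count distinct tokens rather than distinct names. Assuming the goal is the number of distinct restaurants, one option is to run the aggregation on the numeric RestaurantId field instead. The aggregation name distinct_restaurants, the ES_URL variable, and the helper function below are hypothetical names chosen for illustration. The store/item path is taken from the question and may need to become store/_search on releases without mapping types. The Content-Type header is required by newer releases and should be harmless on older ones.

```shell
# Assumption: Elasticsearch is reachable at this address; override ES_URL if not.
ES_URL="${ES_URL:-http://localhost:9200}"

# Request body: count distinct values of the numeric RestaurantId field,
# which is not split into tokens the way analyzed text is.
# "size": 0 suppresses the hits so only the aggregation result is returned.
QUERY='{
  "size": 0,
  "aggs": {
    "distinct_restaurants": {
      "cardinality": {
        "field": "RestaurantId"
      }
    }
  }
}'

# Hypothetical helper: sends the query to the index/type used in the question.
distinct_restaurants() {
  curl -s -XPOST "$ES_URL/store/item/_search" \
    -H 'Content-Type: application/json' \
    -d "$QUERY"
}
```

Calling distinct_restaurants prints the raw JSON response, with the count reported under aggregations.distinct_restaurants.value. The three sample documents contain two distinct RestaurantId values (1 and 2). Keep in mind that the cardinality aggregation is approximate, although small counts like this are normally within its accurate range.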

answered Oct 05 '22 15:10 by c24b


You could use cardinality here: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-metrics-cardinality-aggregation.html

answered Oct 05 '22 14:10 by coder