In Elasticsearch, we have used the terms facet and the terms aggregation to deal with the above-mentioned problem. Unfortunately, this only works reliably for a small data set, and we are dealing with around 10 million documents.
When we query for all the unique values of a field (e.g. the company field) using an aggregation (setting "size": 0) or a facet (using "exclude"), we cannot get the entire result in one request. Elasticsearch takes a long time to respond, and it ultimately results in node failure.
The sole purpose of this process is to count how many unique values are present in a field (e.g. for the company field, the count of unique companies).
Any suggestions would be appreciated.
If you use Elasticsearch 1.1.0 or above, you can estimate the distinct count with the cardinality aggregation.
A simple query would look like this in your case:
POST /{yourIndex}/{yourType}/_search
{
  "aggs": {
    "company_count": {
      "cardinality": {
        "field": "company.company_raw",
        "precision_threshold": 10000
      }
    }
  }
}
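The estimated count then comes back under the aggregation's name in the response. A trimmed response would look roughly like this (the numbers are purely illustrative; note that "value" is an approximation, computed with the HyperLogLog++ algorithm, though it is near-exact for cardinalities below the precision_threshold):

{
  "took": 12,
  "hits": { "total": 10000000, ... },
  "aggregations": {
    "company_count": {
      "value": 8512
    }
  }
}

Because the cardinality aggregation works on a fixed-memory sketch rather than collecting every unique term, it avoids the memory blow-up that makes the terms aggregation with "size": 0 fail on 10 million documents.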