
Query all unique values of a field with Elasticsearch

How do I search for all unique values of a given field with Elasticsearch?

I want the equivalent of a SQL query such as select full_name from authors, so that I can display the list to users on a form.

asked Jan 22 '13 19:01 by kiran



2 Answers

For Elasticsearch 1.0 and later, you can use a terms aggregation to do this.

Query DSL:

{
  "aggs": {
    "NAME": {
      "terms": {
        "field": "",
        "size": 10
      }
    }
  }
}

A real example:

{
  "aggs": {
    "full_name": {
      "terms": {
        "field": "authors",
        "size": 0
      }
    }
  }
}

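This returns every unique value of the authors field. Setting size to 0 means the number of terms is not limited (this requires Elasticsearch 1.1.0 or later). Note that support for size 0 in the terms aggregation was removed in Elasticsearch 5.0, where an explicit size is required.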

Response:

{
  ...
  "aggregations": {
    "full_name": {
      "buckets": [
        {
          "key": "Ken",
          "doc_count": 10
        },
        {
          "key": "Jim Gray",
          "doc_count": 10
        }
      ]
    }
  }
}
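To turn a response like this into a plain list for a form, a minimal Python sketch could simply read the bucket keys. The function name unique_values is hypothetical, the response is assumed to be already parsed from JSON, and the sample dictionary only mirrors the buckets shown in this answer.

```python
from typing import Any, Dict, List


def unique_values(response: Dict[str, Any], agg_name: str) -> List[str]:
    """Return the bucket keys of a terms aggregation, in response order."""
    buckets = (
        response.get("aggregations", {})
        .get(agg_name, {})
        .get("buckets", [])
    )
    return [bucket["key"] for bucket in buckets]


# Sample data copied from the response in this answer (not live query output).
sample_response = {
    "aggregations": {
        "full_name": {
            "buckets": [
                {"key": "Ken", "doc_count": 10},
                {"key": "Jim Gray", "doc_count": 10},
            ]
        }
    }
}

names = unique_values(sample_response, "full_name")
print(names)
```

A missing aggregation yields an empty list rather than an error, which is probably the friendlier behavior when filling a form.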

See the Elasticsearch documentation on terms aggregations.

answered Sep 18 '22 20:09 by Gary Gauh


You could make a terms facet on your 'full_name' field. To do that properly, make sure the field is not tokenized at index time; otherwise each entry in the facet will be an individual term from the field content rather than a whole name. You most likely need to configure the field as 'not_analyzed' in your mapping. If you also search on the field and still want it tokenized, you can index it in two different ways using a multi field.
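As a rough illustration of the multi field idea, the sketch below builds a mapping in which full_name stays analyzed for search while a sub-field keeps the untokenized value. It assumes the Elasticsearch 1.x string syntax with "index": "not_analyzed" (newer versions use a keyword type instead); the type name author, the sub-field name raw, and the helper name are all hypothetical.

```python
import json
from typing import Any, Dict


def build_author_mapping(subfield: str = "raw") -> Dict[str, Any]:
    """Build a 1.x-style mapping with an untokenized sub-field (assumed names)."""
    return {
        "mappings": {
            "author": {  # hypothetical type name
                "properties": {
                    "full_name": {
                        "type": "string",  # analyzed, usable for full-text search
                        "fields": {
                            subfield: {
                                "type": "string",
                                "index": "not_analyzed",  # whole name kept as one term
                            }
                        },
                    }
                }
            }
        }
    }


mapping = build_author_mapping()
# Facets or aggregations would then target the sub-field, e.g. "full_name.raw".
print(json.dumps(mapping, indent=2))
```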

You also need to take into account that, depending on the number of unique terms in the full_name field, this operation can be expensive and require a lot of memory.
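One way to bound that cost on newer clusters is to page through the terms instead of requesting them all at once. The sketch below swaps in a different technique from the one in this answer: it assumes the composite aggregation, which exists only in later Elasticsearch releases and not in the versions discussed here. The search argument is a hypothetical callable that sends a request body and returns the parsed response, and fake_search is an in-memory stand-in that exists only to make the example runnable.

```python
from typing import Any, Callable, Dict, Iterator, Optional

Search = Callable[[Dict[str, Any]], Dict[str, Any]]


def composite_body(field: str, page_size: int,
                   after: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Request body for one page of a composite aggregation over `field`."""
    composite: Dict[str, Any] = {
        "size": page_size,
        "sources": [{"value": {"terms": {"field": field}}}],
    }
    if after is not None:
        composite["after"] = after  # resume after the last key already seen
    # Top-level "size": 0 only suppresses search hits; it does not limit buckets.
    return {"size": 0, "aggs": {"unique": {"composite": composite}}}


def iter_unique_values(search: Search, field: str,
                       page_size: int = 1000) -> Iterator[Any]:
    """Yield unique values page by page until a page comes back empty."""
    after: Optional[Dict[str, Any]] = None
    while True:
        agg = search(composite_body(field, page_size, after))["aggregations"]["unique"]
        buckets = agg["buckets"]
        if not buckets:
            return
        for bucket in buckets:
            yield bucket["key"]["value"]
        # Fall back to the last bucket key if the response has no after_key.
        after = agg.get("after_key", buckets[-1]["key"])


def fake_search(body: Dict[str, Any]) -> Dict[str, Any]:
    """In-memory stand-in for a cluster, using names from the answer above."""
    names = ["Jim Gray", "Ken"]  # composite buckets come back in sorted order
    composite = body["aggs"]["unique"]["composite"]
    after = composite.get("after")
    start = 0 if after is None else names.index(after["value"]) + 1
    page = names[start:start + composite["size"]]
    agg: Dict[str, Any] = {
        "buckets": [{"key": {"value": n}, "doc_count": 10} for n in page]
    }
    if page:
        agg["after_key"] = {"value": page[-1]}
    return {"aggregations": {"unique": agg}}


all_names = list(iter_unique_values(fake_search, "full_name", page_size=1))
print(all_names)
```

Each request holds only page_size buckets, so memory use stays roughly constant at the price of more round trips.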

answered Sep 18 '22 20:09 by javanna