
Elasticsearch, how to return unique values of two fields

I have an index with 20 different fields. I need to pull the unique combinations of the fields "cat" and "sub". In SQL it would look like this: SELECT DISTINCT cat, sub FROM A; I can do it for one field this way:

{
  "size": 0,
  "aggs": {
    "unique_set": {
      "terms": { "field": "cat" }
    }
  }
}

but how do I add another field to check uniqueness across two fields?

Thanks,

epipko asked Nov 26 '22 06:11



2 Answers

SQL's SELECT DISTINCT [cat], [sub] can be imitated with a Composite Aggregation.

{
  "size": 0, 
  "aggs": {
    "cat_sub": {
      "composite": {
        "sources": [
          { "cat": { "terms": { "field": "cat" } } },
          { "sub": { "terms": { "field": "sub" } } }
        ]
      }
    }
  }
}

This returns one bucket per distinct combination:

"buckets" : [
  {
    "key" : {
      "cat" : "a",
      "sub" : "x"
    },
    "doc_count" : 1
  },
  {
    "key" : {
      "cat" : "a",
      "sub" : "y"
    },
    "doc_count" : 2
  },
  {
    "key" : {
      "cat" : "b",
      "sub" : "y"
    },
    "doc_count" : 3
  }
]
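
A composite aggregation returns its buckets in pages (10 per page by default), so collecting every distinct pair means passing the "after_key" of each response back as "after" in the next request. Below is a minimal Python sketch of that loop. It assumes a hypothetical `search(body)` callable that sends the request body to the index and returns the parsed JSON response; `composite_body`, `distinct_pairs`, and the page size are illustrative names and values, not part of any client library.

```python
from typing import Callable, Dict, Iterator, Optional, Tuple


def composite_body(page_size: int, after: Optional[Dict] = None) -> Dict:
    """Build the composite request for distinct (cat, sub) pairs."""
    composite = {
        "size": page_size,
        "sources": [
            {"cat": {"terms": {"field": "cat"}}},
            {"sub": {"terms": {"field": "sub"}}},
        ],
    }
    if after is not None:
        # Resume after the last key of the previous page.
        composite["after"] = after
    return {"size": 0, "aggs": {"cat_sub": {"composite": composite}}}


def distinct_pairs(
    search: Callable[[Dict], Dict],  # hypothetical: sends body, returns parsed JSON
    page_size: int = 1000,
) -> Iterator[Tuple[str, str]]:
    """Yield every distinct (cat, sub) pair, following after_key page by page."""
    after = None
    while True:
        agg = search(composite_body(page_size, after))["aggregations"]["cat_sub"]
        for bucket in agg["buckets"]:
            yield bucket["key"]["cat"], bucket["key"]["sub"]
        after = agg.get("after_key")
        # Stop once a page comes back empty or without a key to resume from.
        if after is None or not agg["buckets"]:
            break
```

With a real client, `search` would typically be a small wrapper around the search call for your index. The "cat" and "sub" fields are assumed to be mapped as keyword (or another non-analyzed type), since terms sources do not work on analyzed text fields.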
Kyle McClellan answered Dec 06 '22 16:12



One way to solve this is with nested terms aggregations:

{
  "size": 0,
  "aggs": {
    "unique_set_1": {
      "terms": { "field": "cat" },
      "aggregations": {
        "unique_set_2": {
          "terms": { "field": "sub" }
        }
      }
    }
  }
}
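
The nested response groups the "sub" buckets under each "cat" bucket, so the distinct pairs have to be flattened on the client side. Below is a minimal Python sketch, assuming the response has already been parsed into a dict and uses the aggregation names from the query above; `flatten_pairs` and the sample response are made up for illustration. Keep in mind that a terms aggregation returns only the top 10 buckets by default, so each level needs a large enough "size" or some pairs will be missing.

```python
from typing import Dict, List, Tuple


def flatten_pairs(response: Dict) -> List[Tuple[str, str, int]]:
    """Turn nested terms buckets into (cat, sub, doc_count) tuples."""
    pairs = []
    for cat_bucket in response["aggregations"]["unique_set_1"]["buckets"]:
        # Each outer bucket carries its own inner "unique_set_2" aggregation.
        for sub_bucket in cat_bucket["unique_set_2"]["buckets"]:
            pairs.append(
                (cat_bucket["key"], sub_bucket["key"], sub_bucket["doc_count"])
            )
    return pairs


# Illustrative response shape only; the values are invented.
sample_response = {
    "aggregations": {
        "unique_set_1": {
            "buckets": [
                {
                    "key": "a",
                    "doc_count": 3,
                    "unique_set_2": {
                        "buckets": [
                            {"key": "x", "doc_count": 1},
                            {"key": "y", "doc_count": 2},
                        ]
                    },
                },
                {
                    "key": "b",
                    "doc_count": 3,
                    "unique_set_2": {"buckets": [{"key": "y", "doc_count": 3}]},
                },
            ]
        }
    }
}

for cat, sub, count in flatten_pairs(sample_response):
    print(cat, sub, count)
```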
Lilith Wittmann answered Dec 06 '22 16:12
