
elastic search double facet

I want to run an Elasticsearch query that groups data by the combination of two fields (latitude and longitude):

curl -XGET 'http://www.my_server:9200/idx_occurrence/Occurrence/_search?pretty=true' -d '{
    "query": { 
        "query_string" : { 
            "fields" : ["genus_interpreted","dataset"], 
            "query": "Pica 2", 
            "default_operator" : "AND" 
         } 
    }, 
    "facets": { 
        "test": { 
            "terms": { 
                "fields" :["decimalLatitude","decimalLongitude"],
                "size" : 500000000 
            } 
        } 
    } 
}'

It returns twice as many results as expected. Any idea why?

The most relevant parts of the response are:

"_shards":{
    "total":5,
    "successful":5,
    "failed":0
},
"hits":{
    "total":37,
    "max_score":3.9314494,
    "hits":[{

The total hits, 37, is the result of the query when I don't apply the facets. This is half of the total reported in the facets (see below):

"facets":{
    "test":{
        "_type":"terms",
        "missing":0,
        "total":74,
        "other":0,
        "terms":[
           {"term":"167.21665954589844","count":5},
           {"term":"167.25","count":4},
           {"term":"167.14999389648438","count":4},
           {"term":"167.1041717529297","count":4},
           {"term":"-21.04166603088379","count":4},.....

So the facet grouping is done separately (by latitude and then by longitude).

Please note that I cannot group by latitude or longitude alone, as multiple records can share a latitude (but have different longitudes) or vice versa.
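
The doubling can be reproduced outside Elasticsearch. The following is only a sketch in plain Python, with made-up coordinates and a hypothetical helper name (`terms_over_fields`); it mimics what the output above suggests, namely that the values of both fields are counted in one shared list of terms.

```python
from collections import Counter

# Hypothetical records; the coordinates are made up for illustration.
docs = [
    {"decimalLatitude": -21.04, "decimalLongitude": 167.25},
    {"decimalLatitude": -21.04, "decimalLongitude": 167.10},
    {"decimalLatitude": -20.90, "decimalLongitude": 167.25},
]


def terms_over_fields(docs, fields):
    """Count every value of every listed field in one shared bucket list."""
    counts = Counter()
    for doc in docs:
        for field in fields:
            counts[str(doc[field])] += 1
    return counts


counts = terms_over_fields(docs, ["decimalLatitude", "decimalLongitude"])
total = sum(counts.values())

# Each record contributes two values, so the total is twice the record count.
print(len(docs), total)
```

With three records the total is six, in the same way that 37 hits produce a total of 74.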

user1249791 asked Aug 31 '12 08:08


1 Answer

You are running a TermsFacet on multiple fields: latitude and longitude. That means the values of both fields are aggregated together as if they were a single field. You see one entry for each distinct value, which can be either a latitude or a longitude. The facet total of 74 is consistent with that: each of the 37 hits contributes one latitude and one longitude, so 74 values are counted. What exactly do you want to achieve? One facet entry for each latitude/longitude pair? In that case you have two options:

  • Add an additional field to the index that contains the pair itself, and then facet on it.
  • Create the latitude/longitude pair on the fly using a term script. Have a look at the documentation to learn more. Here is an example that should help; give it a try:
{
    "query" : {
        "match_all" : { }
    },
    "facets" : {
        "tags" : { 
            "terms" : {
                "field" : "latitude",
                "script" : "term + \"_\" + _source.longitude"
            }
        }
    }
}
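
For the first option, the pair has to be built before the documents are indexed. The sketch below shows one way to do that in plain Python; the field name `lat_lon`, the helper `add_lat_lon_key` and the sample coordinates are assumptions made for illustration, and no Elasticsearch client call is shown. The combined field would presumably need to be mapped so that it is not analyzed, so that the whole string stays a single term.

```python
from collections import Counter


def add_lat_lon_key(doc, lat_field="decimalLatitude", lon_field="decimalLongitude"):
    """Return a copy of doc with a combined 'lat_lon' field (hypothetical name)."""
    enriched = dict(doc)
    enriched["lat_lon"] = f"{doc[lat_field]}_{doc[lon_field]}"
    return enriched


# Hypothetical records; two of them share the same coordinate pair.
records = [
    {"decimalLatitude": -21.04, "decimalLongitude": 167.25},
    {"decimalLatitude": -21.04, "decimalLongitude": 167.25},
    {"decimalLatitude": -21.04, "decimalLongitude": 167.5},
    {"decimalLatitude": -20.5, "decimalLongitude": 167.25},
]

enriched = [add_lat_lon_key(r) for r in records]

# Counting the combined key is what a terms facet on a single
# 'lat_lon' field would be expected to report: one entry per pair.
pair_counts = Counter(d["lat_lon"] for d in enriched)
for pair, count in pair_counts.most_common():
    print(pair, count)
```

Because each record now carries exactly one `lat_lon` value, the counts add up to the number of records rather than to twice that number.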
javanna answered Oct 18 '22 17:10