
Elasticsearch, ordering aggregations by geo distance and score

My mapping is the following:

PUT places
{
  "mappings": {
    "test": {
      "properties": {
        "id_product": { "type": "keyword" },
        "id_product_unique": { "type": "integer" },
        "location": { "type": "geo_point" },
        "suggest": {
          "type": "text"
        },
        "active": {"type": "boolean"}
      }
    }
  }
}

POST places/test
{
   "id_product" : "A",
   "id_product_unique": 1,
   "location": {
      "lat": 1.378446,
      "lon": 103.763427
   },
   "suggest": ["coke","zero"],
   "active": true
}

POST places/test
{
   "id_product" : "A",
   "id_product_unique": 2,
   "location": {
      "lat": 1.878446,
      "lon": 108.763427
   },
   "suggest": ["coke","zero"],
   "active": true
}

POST places/test
{
   "id_product" : "B",
   "id_product_unique": 3,
   "location": {
      "lat": 1.478446,
      "lon": 104.763427
   },
   "suggest": ["coke"],
   "active": true
}

POST places/test
{
   "id_product" : "C",
   "id_product_unique": 4,
   "location": {
      "lat": 1.218446,
      "lon": 102.763427
   },
   "suggest": ["coke","light"],
   "active": true
}

In my example there are 2 cans of coke zero ("id_product_unique" = 1 and 2), 1 can of coke ("id_product_unique" = 3) and 1 can of coke light ("id_product_unique" = 4).

All these cans are in different locations.

An "id_product" is not unique, because the exact same can of coke can be sold in two different locations (e.g. "id_product_unique" = 1 and 2).

Only "id_product_unique" and "location" differ from one can of coke to another: two identical cans share the same "suggest" and "id_product" fields but have different "id_product_unique" and "location" values.

My goal is to search for a product from a given GPS location and display a single result per id_product (the closest one):

POST /places/_search?size=0
{
  "aggs" : {
    "group-by-type" : {
      "terms" : { "field" : "id_product"},
      "aggs": {
        "min-distance": {
          "top_hits": {
            "sort": {
              "_script": { 
                "type": "number",
                "script": {
                  "source": "def x = doc['location'].lat; def y = doc['location'].lon; return Math.abs(x-1.178446) + Math.abs(y-101.763427)",
                  "lang": "painless"
                },
                "order": "asc"
              }
            },
            "size" : 1
          }
        }
      }
    }
  }
}
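As a side note on the sort script above: it adds the absolute latitude and longitude differences in degrees, which is not a true geographic distance. The following is a minimal Python sketch (not part of the original setup) that reproduces that formula on the four sample documents and compares it with a haversine great-circle distance. The function names and the mean Earth radius of 6371 km are assumptions made for illustration. Elasticsearch also provides a built-in `_geo_distance` sort, which may be worth checking against the documentation for your version.

```python
import math

# Sample documents from the question: (id_product, id_product_unique, lat, lon)
DOCS = [
    ("A", 1, 1.378446, 103.763427),
    ("A", 2, 1.878446, 108.763427),
    ("B", 3, 1.478446, 104.763427),
    ("C", 4, 1.218446, 102.763427),
]
ORIGIN = (1.178446, 101.763427)  # search point hard-coded in the Painless script


def degree_sum(lat, lon, origin=ORIGIN):
    """Same formula as the Painless script: |dlat| + |dlon|, in degrees."""
    return abs(lat - origin[0]) + abs(lon - origin[1])


def haversine_km(lat, lon, origin=ORIGIN, radius_km=6371.0):
    """Great-circle distance in kilometres (mean Earth radius is an assumption)."""
    phi1, phi2 = math.radians(origin[0]), math.radians(lat)
    dphi = phi2 - phi1
    dlmb = math.radians(lon - origin[1])
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def closest_per_product(docs, metric):
    """Mimic terms + top_hits(size=1): keep the nearest document per id_product."""
    best = {}
    for product, unique_id, lat, lon in docs:
        d = metric(lat, lon)
        if product not in best or d < best[product][1]:
            best[product] = (unique_id, d)
    return best


by_degrees = closest_per_product(DOCS, degree_sum)
by_km = closest_per_product(DOCS, haversine_km)
for product in sorted(by_degrees):
    unique_id, degrees = by_degrees[product]
    print(product, unique_id, round(degrees, 2), round(by_km[product][1], 1))
```

On this sample data both metrics select the same document per product (1 for A, 3 for B, 4 for C). They can disagree farther from the equator, where a degree of longitude covers less ground than a degree of latitude.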

From this list of results, I would now like to apply a should query and re-order the results by the computed score. I tried the following:

POST /places/_search?size=0
{
  "query" : {
    "bool": {
      "filter": {"term" : { "active" : "true" }},
      "should": [
        {"match" : { "suggest" : "coke" }},
        {"match" : { "suggest" : "light" }}
      ]
    }
  },
  "aggs" : {
    "group-by-type" : {
      "terms" : { "field" : "id_product"},
      "aggs": {
        "min-distance": {
          "top_hits": {
            "sort": {
              "_script": { 
                "type": "number",
                "script": {
                  "source": "def x = doc['location'].lat; def y = doc['location'].lon; return Math.abs(x-1.178446) + Math.abs(y-101.763427)",
                  "lang": "painless"
                },
                "order": "asc"
              }
            },
            "size" : 1
          }
        }
      }
    }
  }
}

But I cannot figure out how to replace the distance sort value with the document score.

Any help would be great.

woshitom asked Mar 18 '18 19:03


1 Answer

I managed to do it by adding a new aggregation "max_score":

"max_score": {
  "max": {
    "script": {
      "lang": "painless",
      "source": "_score"
    }
  }
}

and by ordering by max_score.value desc:

"order": {"max_score.value": "desc"}

My final query is the following:

POST /places/_search?size=0
{
  "query" : {
    "bool": {
      "filter": {"term" : { "active" : "true" }},
      "should": [
        {"match" : { "suggest" : "coke" }},
        {"match" : { "suggest" : "light" }}
      ]
    }
  },
  "aggs" : {
    "group-by-type" : {
      "terms" : {
        "field" : "id_product",
        "order": {"max_score.value": "desc"}
      },
      "aggs": {
        "min-distance": {
          "top_hits": {
            "sort": {
              "_script": { 
                "type": "number",
                "script": {
                  "source": "def x = doc['location'].lat; def y = doc['location'].lon; return Math.abs(x-1.178446) + Math.abs(y-101.763427)",
                  "lang": "painless"
                },
                "order": "asc"
              }
            },
            "size" : 1
          }
        },
        "max_score": {
          "max": {
            "script": {
              "lang": "painless",
              "source": "_score"
            }
          }
        }
      }
    }
  }
}

Response:

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "group-by-type": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "C",
          "doc_count": 1,
          "max_score": {
            "value": 1.0300811529159546
          },
          "min-distance": {
            "hits": {
              "total": 1,
              "max_score": null,
              "hits": [
                {
                  "_index": "places",
                  "_type": "test",
                  "_id": "VhJdOmIBKhzTB9xcDvfk",
                  "_score": null,
                  "_source": {
                    "id_product": "C",
                    "id_product_unique": 4,
                    "location": {
                      "lat": 1.218446,
                      "lon": 102.763427
                    },
                    "suggest": [
                      "coke",
                      "light"
                    ],
                    "active": true
                  },
                  "sort": [
                    1.0399999646503995
                  ]
                }
              ]
            }
          }
        },
        {
          "key": "A",
          "doc_count": 2,
          "max_score": {
            "value": 0.28768208622932434
          },
          "min-distance": {
            "hits": {
              "total": 2,
              "max_score": null,
              "hits": [
                {
                  "_index": "places",
                  "_type": "test",
                  "_id": "UhJcOmIBKhzTB9xc6ve-",
                  "_score": null,
                  "_source": {
                    "id_product": "A",
                    "id_product_unique": 1,
                    "location": {
                      "lat": 1.378446,
                      "lon": 103.763427
                    },
                    "suggest": [
                      "coke",
                      "zero"
                    ],
                    "active": true
                  },
                  "sort": [
                    2.1999999592114756
                  ]
                }
              ]
            }
          }
        },
        {
          "key": "B",
          "doc_count": 1,
          "max_score": {
            "value": 0.1596570909023285
          },
          "min-distance": {
            "hits": {
              "total": 1,
              "max_score": null,
              "hits": [
                {
                  "_index": "places",
                  "_type": "test",
                  "_id": "VRJcOmIBKhzTB9xc_vc0",
                  "_score": null,
                  "_source": {
                    "id_product": "B",
                    "id_product_unique": 3,
                    "location": {
                      "lat": 1.478446,
                      "lon": 104.763427
                    },
                    "suggest": [
                      "coke"
                    ],
                    "active": true
                  },
                  "sort": [
                    3.2999999020282695
                  ]
                }
              ]
            }
          }
        }
      ]
    }
  }
}
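The bucket order in the response above (C, A, B) follows from sorting the terms buckets by their highest document score. Here is a minimal Python sketch of that step, outside Elasticsearch. The scores for documents 3 and 4 are the bucket maxima from the response, since each is alone in its bucket. For product A only the bucket maximum is known, so giving both of its documents that score is an assumption, and the function name is hypothetical.

```python
from collections import defaultdict

# (id_product, id_product_unique, _score)
SCORED = [
    ("A", 1, 0.28768208622932434),
    ("A", 2, 0.28768208622932434),  # assumed: only the bucket maximum is in the response
    ("B", 3, 0.1596570909023285),
    ("C", 4, 1.0300811529159546),
]


def order_buckets_by_max_score(scored):
    """Group by id_product, keep the max _score per group, sort groups descending."""
    max_score = defaultdict(float)  # scores here are non-negative, so 0.0 is a safe start
    for product, _unique_id, score in scored:
        max_score[product] = max(max_score[product], score)
    return sorted(max_score.items(), key=lambda kv: kv[1], reverse=True)


buckets = order_buckets_by_max_score(SCORED)
print([key for key, _ in buckets])  # ['C', 'A', 'B']
```

Within each bucket, the top_hits sub-aggregation still picks the closest document independently of this ordering, which is why the hits carry a sort value and a null _score.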
woshitom answered Oct 31 '22 10:10