Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

ElasticSearch -- use distance from point to affect query relevance

Trying to use ElasticSearch to create a search that uses distance from a centerpoint to influence relevance.

I don't want to simply sort on distance from a point, which I know is possible, because I want relevance based on the searched query to also affect results.

I'd like to pass in a search string, say "coffee", and a lat/lon, say "38, -77", and get my results ordered by a combination of how related they are to "coffee" and how close they are to "38, -77".

Thanks!

like image 528
Clay Wardell Avatar asked Oct 24 '12 16:10

Clay Wardell


People also ask

What is range query in Elasticsearch?

Range Queries in Elasticsearch Combining the greater than ( gt ) and less than ( lt ) range parameters is an effective way to search for documents that contain a certain field value within a range where you know the upper and lower bounds.

What is Elasticsearch relevance score?

The score represents how relevant a given document is for a specific query. The default scoring algorithm used by Elasticsearch is BM25.

How does Elasticsearch calculate score?

Elasticsearch uses two kinds of similarity scoring function: TF-IDF before version 5.0 and Okapi BM25 after. TF-IDF measures how much a word is common locally and rare globally to determine how much relevant a query is.

How do I change Elasticsearch score?

According to your comment, you need the _score to be multiplied by the document's score field. You can achieve it simply by removing the boost_mode parameter, the default boost_mode is to multiply the _score with whatever value comes out of the field_value_factor function.


1 Answers

The recently (0.90.4) added function_score query type adds support for ranking based on distance. This is an alternative to the custom score query type.

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html

An example lifted from there:

"query": {
  "function_score": {
    "functions": [
      { "gauss":  { "loc":   { "origin": "51,0", "scale": "5km" }}},
    ]
  }
}

This applies a decay function (there are several) to a field ("loc") that scores against the distance from an origin given a particular scale. This is exactly what you'd want for distance ranking since it gives you a lot of flexibility to how it should rank without writing custom scripts.

like image 56
Jilles van Gurp Avatar answered Oct 14 '22 04:10

Jilles van Gurp