Elasticsearch - give negative boost to documents without a certain field

I'm working on a query. The basic filtered multi_match query works as planned and returns the documents I want.

The issue is that I want to boost results that have a certain string field by, for example, 0.5, or, as in this example, give results that don't have the field 'traded_as' a negative boost of 1.0.

I cannot get the filter / boost / must / exists / missing combination to work the way I want.

Is this the correct approach to this issue?

Using Elasticsearch 1.5.2.

{
"query": {
    "filtered": {
        "query": {
           "multi_match": {
               "query": "something",
               "fields": ["title", "url", "description"]
           }
        },
        "filter": {
           "bool": {
                "must": {
                    "missing": {
                        "field": "marked_for_deletion"
                    }
                }
            }
        }
    }
},
"boosting": {
    "positive": {
        "filter": {
            "bool": {
                "must": {
                    "exists": {
                        "field": "traded_as"                            
                    }
                }
            }
        }
    },
    "negative": {
        "filter": {
           "bool": {
                "must": {
                    "missing": {
                        "field": "traded_as"
                    }
                }
            }
        }
    },
    "negative_boost": 1.0
}
}
HasseWilson asked May 07 '15 09:05


1 Answer

You cannot get the desired result with a boosting query alone. As the documentation for the boosting query states:

Unlike the "NOT" clause in bool query, this still selects documents that contain undesirable terms, but reduces their overall score.

{
  "query": {
    "boosting": {
      "positive": [{
        "filtered": {
          "query": {
            "multi_match": {
              "query": "something",
              "fields": ["title", "url", "description"]
            }
          },
          "filter": {
            "bool": {
              "must": [{
                "missing": {
                  "field": "marked_for_deletion"
                }
              }]
            }
          }
        }
      }],
      "negative": [{
        "filtered": {
          "filter": {
            "missing": {
              "field": "traded_as"
            }
          }
        }
      }],
      "negative_boost": 1.0
    }
  }
}

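So the results will still include the documents without traded_as, but documents that have the field will score higher relative to them. Note that negative_boost is a multiplier on the score of documents matching the negative query: 1.0 leaves their score unchanged, so a value between 0 and 1 (for example 0.5) is needed to actually demote them. This approach only demotes; it gives no positive boost when traded_as is present. For that, have a look at the function score query: http://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-function-score-query.html#_using_function_score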
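
To make the effect of negative_boost concrete, here is a minimal Python sketch of the scoring rule only. It does not talk to Elasticsearch; the hits, their scores, and the helper names boosting_score and rank are made up for illustration.

```python
def boosting_score(positive_score, matches_negative, negative_boost):
    """Score of one hit in a boosting query.

    Hits that also match the negative query keep their place in the
    result set, but their score is multiplied by negative_boost.
    """
    if matches_negative:
        return positive_score * negative_boost
    return positive_score


# Hypothetical hits: (doc id, score from the positive query, has traded_as)
hits = [("a", 2.0, True), ("b", 3.0, False), ("c", 1.0, False)]


def rank(negative_boost):
    """Re-score the hits, treating a missing traded_as as the negative match."""
    scored = [
        (doc, boosting_score(score, not has_traded_as, negative_boost))
        for doc, score, has_traded_as in hits
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


print(rank(1.0))  # multiplier of 1.0: same order as without the negative clause
print(rank(0.5))  # multiplier of 0.5: hits without traded_as drop, but stay in the list
```

In both calls all three documents are returned; only the ordering changes, which is the behaviour the quoted documentation describes.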

With function_score, you would have something like:

{
  "query": {
    "function_score": {
      "query": {
        "filtered": {
          "query": {
            "multi_match": {
              "query": "something",
              "fields": ["title", "url", "description"]
            }
          },
          "filter": {
            "bool": {
              "must": {
                "missing": {
                  "field": "marked_for_deletion"
                }
              }
            }
          }
        }
      },
      "functions": [{
        "filter": {
          "exists": {
            "field": "traded_as"
          }
        },
        "boost_factor": 2
      }, {
        "filter": {
          "missing": {
            "field": "traded_as"
          }
        },
        "boost_factor": 0.5
      }],
      "score_mode": "first",
      "boost_mode": "multiply"
    }
  }
}
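
As a rough illustration of how the two functions combine with "score_mode": "first" and "boost_mode": "multiply", here is a small Python sketch of the arithmetic. It is a simplification under the assumption that each function contributes just its boost_factor; the function name and the sample scores are hypothetical.

```python
def function_score(query_score, has_traded_as):
    """Mimic the function_score query above for a single document."""
    # The functions in the order they appear in the query:
    # (filter predicate on "traded_as is present", boost_factor)
    functions = [
        (lambda present: present, 2.0),      # exists filter
        (lambda present: not present, 0.5),  # missing filter
    ]
    # score_mode "first": only the first function whose filter matches is used.
    # If none matched, the neutral factor 1.0 would leave the score unchanged.
    factor = next(
        (boost for matches, boost in functions if matches(has_traded_as)),
        1.0,
    )
    # boost_mode "multiply": the query score is multiplied by the function result.
    return query_score * factor


# A weaker text match with traded_as can now outrank a stronger one without it.
print(function_score(2.0, True))   # 2.0 * 2.0
print(function_score(3.0, False))  # 3.0 * 0.5
```

Because the exists and missing filters are complementary, exactly one function applies to every document, so "first" behaves the same here as any mode that combines matching functions.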
Julien C. answered Oct 17 '22 19:10