
Difference between Weight and boost in Elasticsearch

I have read about boosting in Elasticsearch. Boosting can be applied at index time or at query time. Index-time boosting is static and not recommended. Query-time boosting is dynamic and is the preferred approach.

We can also boost individual fields. For example, when searching for a term across multiple fields, we can boost one field so that a match in it changes the score of the document.

{
   "match":{"title":{"query":"test string","boost":10}}
}
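A minimal sketch of how such a query-time boost could be attached per field when searching several fields at once. The helper name `boosted_multi_field_query` and the `body` field are assumptions for illustration; only the `match` query and its `boost` parameter come from the snippet above.

```python
import json


def boosted_multi_field_query(text, field_boosts):
    """Build a bool/should query with one match clause per field.

    field_boosts maps a field name to its query-time boost (1.0 is neutral).
    """
    should = [
        {"match": {field: {"query": text, "boost": boost}}}
        for field, boost in field_boosts.items()
    ]
    return {"query": {"bool": {"should": should}}}


# The score of a match on "title" is multiplied by 10; "body" is a hypothetical field.
body = boosted_multi_field_query("test string", {"title": 10, "body": 1})
print(json.dumps(body, indent=2))
```

Because the boost lives in the request, it can be changed from one search to the next without reindexing.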

I also read about weight:

{
     "filter": { "match": { "test": "cat" } },
     "weight": 42
}
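For context, a `filter` plus `weight` pair like this is the shape used inside the `functions` array of a `function_score` query. Below is a rough sketch of how it could be wrapped, together with a simplified model of how the weight meets the query score. The helper names are hypothetical, and the model assumes weight-only functions (each matching function simply yields its weight) and at least one matching function; the real engine has more options than shown here.

```python
import math


def function_score_query(base_query, weighted_filters,
                         score_mode="multiply", boost_mode="multiply"):
    """Wrap base_query in a function_score query with filter/weight functions."""
    return {
        "query": {
            "function_score": {
                "query": base_query,
                "functions": [
                    {"filter": flt, "weight": weight}
                    for flt, weight in weighted_filters
                ],
                "score_mode": score_mode,   # how function scores are combined
                "boost_mode": boost_mode,   # how the result meets the query score
            }
        }
    }


def combined_score(query_score, matching_weights,
                   score_mode="multiply", boost_mode="multiply"):
    """Simplified model: each matching weight-only function scores its weight."""
    if not matching_weights:
        raise ValueError("this sketch assumes at least one matching function")
    combine = {"multiply": math.prod, "sum": sum, "max": max, "min": min}[score_mode]
    function_score = combine(matching_weights)
    if boost_mode == "multiply":
        return query_score * function_score
    if boost_mode == "sum":
        return query_score + function_score
    if boost_mode == "replace":
        return function_score
    raise ValueError(f"unsupported boost_mode in this sketch: {boost_mode}")


query = function_score_query(
    {"match": {"title": "test string"}},
    [({"match": {"test": "cat"}}, 42)],
)
# A document scoring 1.5 on the base query that also matches the filter.
print(combined_score(1.5, [42]))
```

So in this model the weight belongs to a scoring function, which only applies to documents passing its filter, rather than to a field.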

My understanding is that weight is applied to fields in order to change the relevance score, while boost is applied to queries for the same purpose.

But I am not sure about the difference between weight and boost.

Could someone correct my understanding of the difference between weight and boost, with an example?

Abhijit Bashetti asked May 22 '20 13:05



2 Answers

1. Relevance tuning in Elasticsearch means changing how fields are weighted against one another, or boosting relevance based on a value within a field. Note: you must have at least two schema fields to tune relevance.

2. Weights are applied to fields. Boosts are set up on top of fields, but they are applied to field values.

3. Weights

  • Each field can have a weight from 0 to 10, with 10 being the strongest.

    curl -X GET 'https://host-2376rb.api.swiftype.com/api/as/v1/engines/national-parks-demo/search' \
    -H 'Content-Type: application/json' \
    -H 'Authorization: Bearer search-soaewu2ye6uc45dr8mcd54v8' \
    -d '{ "search_fields": { "title": { "weight": 10 }, "description": { "weight": 1 }, "states": { "weight": 2 } }, "query": "mountains" }'

Here we restrict the search to three fields (title, description, and states) and weight them 10, 1, and 2, respectively.
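As a rough sketch, the same request body could be assembled in code like this. The helper name `search_fields_body` is hypothetical, the 0 to 10 check simply mirrors the range quoted in this answer, and nothing is sent over the network.

```python
import json


def search_fields_body(query, weights):
    """Return a search request body with per-field weights.

    Rejects weights outside 0..10, the range quoted in this answer.
    """
    for field, weight in weights.items():
        if not 0 <= weight <= 10:
            raise ValueError(
                f"weight for {field!r} must be between 0 and 10, got {weight}"
            )
    return {
        "query": query,
        "search_fields": {
            field: {"weight": weight} for field, weight in weights.items()
        },
    }


payload = search_fields_body(
    "mountains", {"title": 10, "description": 1, "states": 2}
)
print(json.dumps(payload))
```

The weights are tied to field names only, so the same mapping can be reused for any query string.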

4. Boosts

  • There are 4 kinds of Boosts. Use a boost to increase relevance.

    {
      "query": {
        "dis_max": {
          "queries": [
            { "match": { "qualifications": { "query": "social media", "boost": 2 } } },
            { "match": { "cover_letter": "social media" } }
          ]
        }
      }
    }

Here we add a boost inside a disjunction max (dis_max) query for job candidates who list "social media" in their resume. A match in the "qualifications" field has its score boosted by 2, whereas a match in the cover_letter field is not boosted at all.

In these examples, the weights are fixed on the fields regardless of the query, whereas the boost is fixed on the query "social media" and applies only when it matches inside the qualifications field. Hope this helps.
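To make the dis_max arithmetic concrete, here is a simplified model, not the engine's actual code: each clause's raw score is multiplied by its boost, and the query takes the best clause plus `tie_breaker` times the remaining ones (a `tie_breaker` of 0.0 is assumed as the default). The raw score of 1.2 is made up for illustration.

```python
def boosted(score, boost=1.0):
    """Apply a query-time boost to a clause score (None means no match)."""
    return None if score is None else score * boost


def dis_max_score(clause_scores, tie_breaker=0.0):
    """Best matching clause plus tie_breaker times the other matching clauses."""
    matched = [s for s in clause_scores if s is not None]
    if not matched:
        return None  # the document matches no clause
    best = max(matched)
    return best + tie_breaker * (sum(matched) - best)


raw = 1.2  # hypothetical unboosted score for "social media" in either field

# Candidate A mentions it only in qualifications (boost 2).
in_qualifications = dis_max_score([boosted(raw, 2), None])
# Candidate B mentions it only in cover_letter (no boost).
in_cover_letter = dis_max_score([None, boosted(raw)])

print(in_qualifications, in_cover_letter)
```

With the same raw score, the candidate matching in qualifications ends up with twice the score of the one matching only in the cover letter.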

  1. Elastic Search doc on the matter
  2. A really good Boosting example including negative_boost, field_value_factor, and date based weight decay
  3. Other very useful links on the matter boost_1, boost_2, boost_3, boost_4, weight_1
Chris answered Sep 19 '22 07:09


Fields are weighted against one another, whereas boosting is based on a given value within a field.

Weights: Each field can have a weight from 0 to 10, with 10 being the strongest. For example, if we want people to find the page they are looking for mainly through its title, we need to prioritize the title field. We can increase its weight so that it has more impact than the other fields. With a higher weight on title, documents whose title matches the query appear at the top of the results.

{
  "search_fields":{ 
    "title": { 
      "weight": 10 
    }, 
    "subtitle": { 
      "weight": 5 
    }, 
    "description": { 
      "weight": 2 
    } 
  }, 
  "query": "Elastic" 
}

Here we ask Elastic to search only three fields (title, subtitle, and description) with weights of 10, 5, and 2, respectively.

Boosts: Weights are applied to fields. Boosts are set up on top of fields, but they are applied to field values. When boosting on number, date, or geolocation fields, you need to define a function parameter and a factor. There are four types of function, depending on the boost: linear, exponential, gaussian, and logarithmic. The function and factor are used to compute one half of the boosted relevance score, known as the boost value. The other half is the original document score. The two combine to produce the overall document score, which governs the order of the result set.

{
  "query": "Elastic", 
  "boosts": {
    "is_elastic_query": [ 
      { 
        "type": "value", 
        "value": "true", 
        "operation": "multiply", 
        "factor": 10 
      } 
      ] 
  } 
}

Here we assume that is_elastic_query is a field whose value is either true or false. We boost it with a value boost by a factor of 10 when the value is true.

For details and examples, see the link below:
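A toy model of what such a value boost does to one document's score, assuming that `operation` combines the original score with `factor` only when the field holds the given value. The function name `apply_value_boost` and the sample scores are hypothetical, and the real engine may compute the boost differently.

```python
def apply_value_boost(doc, score, field, value, operation, factor):
    """Return the boosted score if doc[field] equals value, else the score unchanged."""
    if doc.get(field) != value:
        return score
    if operation == "multiply":
        return score * factor
    if operation == "add":
        return score + factor
    raise ValueError(f"unsupported operation in this sketch: {operation}")


matching = {"title": "Elastic guide", "is_elastic_query": "true"}
other = {"title": "Elastic bands", "is_elastic_query": "false"}

# Both documents start from the same hypothetical base score of 1.5.
print(apply_value_boost(matching, 1.5, "is_elastic_query", "true", "multiply", 10))
print(apply_value_boost(other, 1.5, "is_elastic_query", "true", "multiply", 10))
```

Unlike a field weight, the boost depends on the value stored in the document, so two documents with the same text match can end up ranked differently.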

https://www.elastic.co/guide/en/app-search/current/relevance-tuning-guide.html

Yugank Pant answered Sep 22 '22 07:09