
Boost for a Bool query on Elasticsearch having little effect

What is confusing me is that the query adds a boost of 10 to category_id, which is much higher than the other boosts, yet an item from another category, "Tai Chi", still arrives at the top of the results.

I have a mapping of:

{
  "the_items": {
    "item": {
      "properties": {
        "brand_id": {
          "type": "integer"
        },
        "category_id": {
          "type": "integer"
        },
        "description": {
          "type": "multi_field",
          "fields": {
            "description": {
              "type": "string",
              "analyzer": "full_title"
            }
          }
        },
        "title": {
          "type": "multi_field",
          "fields": {
            "title": {
              "type": "string",
              "analyzer": "full_title"
            },
            "partial_title": {
              "type": "string",
              "index_analyzer": "partial_title",
              "search_analyzer": "full_title",
              "include_in_all": false
            }
          }
        },
        "updated_at": {
          "type": "string"
        }
      }
    }
  }
}

I am running the following query:

curl -XGET 'http://localhost:9200/austin_items/_search?pretty=true' -d '{
  "query": {
    "filtered": {
      "query": {
        "bool": {
          "should": [
            {
              "match": {
                "title": {
                  "boost": 2,
                  "query": "chi",
                  "type": "phrase"
                }
              }
            },
            {
              "match": {
                "title.partial_title": {
                  "boost": 1,
                  "query": "chic"
                }
              }
            },
            {
              "match": {
                "description": {
                  "boost": 0.2,
                  "query": "chic"
                }
              }
            },
            {
              "term": {
                "category_id": {
                  "boost": 10,
                  "value": 496
                }
              }
            }
          ]
        }
      }
    }
  }
}'

That gives me the following hits:

[
  {
    "_index": "the_items",
    "_type": "item",
    "_id": "34410",
    "_score": 0.7510745,
    "_source": {
      "id": "34410",
      "title": "Initiez-vous au Tai Chi",
      "description": "p. Le Tai Chi est un art chevaleresque, initialement originaire de Chine, maintenant partie int\u00e9grante des tr\u00e9sors du patrimoine de l'humanit\u00e9. C'est un art de droiture, un art pour les braves, \u00e0 la recherche du geste juste et de l'attitude juste - la \"ju",
      "brand_id": "0",
      "category_id": "497"
    }
  },
  {
    "_index": "the_items",
    "_type": "item",
    "_id": "45393",
    "_score": 0.45193857,
    "_source": {
      "id": "45393",
      "title": "Very Hot Chicken",
      "description": "Laissez-vous tenter par la force du Very Hot Chicken Burger, avec sa sauce piment\u00e9e, ses rondelles de piment vert et sa pr\u00e9paration pan\u00e9e au poulet.\r\nAjoutez-y une tranche de chester fondu, de la salade, des tomates, le tout dans un pain parsem\u00e9 de bl\u00e9 concass\u00e9 pour un burger fort en go\u00fbt !",
      "brand_id": "0",
      "category_id": "496"
    }
  }
]

If I boost the category_id field to something silly like 30, it knocks "Tai Chi" out of the top results. I do want "Tai Chi" to appear in the search results in case there is nothing else, but the category_id part of the query does not seem to carry the weight I expect. Does anyone know why this is happening?

asked Mar 01 '13 10:03 by unflores


1 Answer

I was trying to control the ranking through the boost I added to the query. However, the score takes several things into account, not just the boost, so a large boost on one clause does not guarantee that the documents matching it come first. To enforce a boost for category and brand, I ended up using "custom_boost_factor" queries, added as extra subqueries alongside the regular should clauses:

curl -XGET 'http://localhost:9200/austin_items/_search?pretty=true' -d '
{
  "query" : {
    "filtered" : {
      "query" : {
        "bool" : {
          "should" : [
            { "match" : { "title" : { "boost" : 2, "query" : "chi", "type" : "phrase" } } },
            { "match" : { "title.partial_title" : { "boost" : 1, "query" : "chi" } } },
            { "match" : { "description" : { "boost" : 0.2, "query" : "chic" } } },
            { "custom_boost_factor": { 
                "query":{
                  "bool": {
                    "must" : [
                      { "multi_match": { "query" : "chi", "fields" : ["title", "description"] }},
                      { "in": { "category_id": [496] } }
                    ]
                  }
                },
                "boost_factor": 2 
              }
            },
            { "custom_boost_factor": { 
                "query":{
                  "bool": {
                    "must" : [
                      { "multi_match": { "query" : "chi", "fields" : ["title", "description"] }},
                      { "in": { "brand_id": [999] } }
                    ]
                  }
                },
                "boost_factor": 3
              }
            }
          ]
        }
      }
    }
  }
}'
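
To illustrate why a boost of 10 can still lose, here is a rough Python sketch of a single clause's contribution under the classic Lucene TF/IDF formula (boost × tf × idf² × field norm). It assumes the index uses that similarity and that the category_id clause is scored like an ordinary term clause; every number in it (index size, document frequencies, field lengths) is hypothetical and not taken from the index in the question. The query-wide factors (query norm and coord) are left out to keep the sketch short, so it is a simplification rather than the exact score.

```python
import math


def idf(doc_freq, num_docs):
    """Classic Lucene inverse document frequency: rarer terms score higher."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def clause_score(boost, term_freq, doc_freq, num_docs, field_length):
    """Contribution of one matching clause: boost * tf * idf^2 * field norm."""
    tf = math.sqrt(term_freq)              # term frequency factor
    norm = 1.0 / math.sqrt(field_length)   # shorter fields score higher
    return boost * tf * idf(doc_freq, num_docs) ** 2 * norm


NUM_DOCS = 10_000  # hypothetical index size


def title_score():
    # "chi" is assumed to be rare: it appears in 3 titles, in a 4-term title.
    return clause_score(boost=2, term_freq=1, doc_freq=3,
                        num_docs=NUM_DOCS, field_length=4)


def category_score(boost):
    # Category 496 is assumed to be common: half of all documents have it.
    return clause_score(boost=boost, term_freq=1, doc_freq=5000,
                        num_docs=NUM_DOCS, field_length=1)


print("title clause, boost 2:     %.2f" % title_score())
print("category clause, boost 10: %.2f" % category_score(10))
print("category clause, boost 30: %.2f" % category_score(30))
```

With these made-up numbers the rare title term outweighs the common category term at boost 10, and the order only flips at boost 30, which mirrors the behaviour described in the question. To see the real breakdown for your own index, add "explain": true to the search request body and compare the per-clause values in the response.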
answered Oct 19 '22 12:10 by unflores