
Elasticsearch completion suggester on multifield with different weighting

I'm using the Completion Suggester in Elasticsearch to allow partial word matching queries. In my index (product_index), I'd like to be able to query both the product_name field and the brand field. Here are my mappings:

POST /product_index
{
  "mappings": {
    "products": {
      "properties": {
        "brand": {
          "type": "string",
          "analyzer": "english"
        },
        "product_name": {
          "type": "string",
          "analyzer": "english"
        },
        "id": {
          "type": "long"
        },
        "lookup_count": {
          "type": "long"
        },
        "suggest": {
          "type": "completion",
          "analyzer": "simple",
          "payloads": true,
          "preserve_separators": true,
          "preserve_position_increments": true,
          "max_input_length": 50
        },
        "upc": {
          "type": "string"
        }
      }
    }
  }
}

Here is my data:

POST /product_index/products/2
{
  "id": 2,
  "brand": "Coca-Cola",
  "product_name": "Classic Coke",
  "suggest": {
    "input": [
      "Classic Coke",
      "Coca-Cola"
    ],
    "output": "Classic Coke - Coca-Cola",
    "payload": {
      "id": 2,
      "product_name": "Classic Coke",
      "brand": "Coca-Cola",
      "popularity": 10
    },
    "weight": 0
  }
}

And here is my query:

POST /product_index/_search
{
  "suggest": {
    "product_suggest": {
      "text": "coca-co",
      "completion": {
        "field": "suggest"
      }
    }
  }
}

This works great except that I'd like to give the product_name field a higher weighting than the brand field. Is there any way I can achieve this? I have looked into this article on using bool queries but I'm quite new to Elasticsearch and unsure how I can apply that in the case of completion suggester.

Thanks a lot!

Harry Wang asked Feb 01 '15 21:02


2 Answers

The Completion Suggester is actually pretty limited in terms of scoring: you cannot do that. The only thing you can do is boost whole entries, not the individual attributes inside an entry (see the weight option: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-suggesters-completion.html#indexing).

That's because the Completion Suggester doesn't do a "real search": it doesn't use the index. It's a simple "dictionary" designed to do "prefix" expansions faster than an index with inverted lists can.
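To make the "dictionary" point concrete, here is a toy Python sketch of a weighted prefix lookup. It is only an analogy under stated assumptions and not how Elasticsearch is implemented internally; the class PrefixDictionary, its methods, and the sample weight are hypothetical. The thing to notice is that the weight belongs to the whole entry, so a match through the brand input ranks exactly like a match through the product name input.

```python
import bisect


class PrefixDictionary:
    """Toy weighted prefix lookup (hypothetical, for illustration only)."""

    def __init__(self):
        # Sorted list of (lowercased input, entry weight, output) tuples.
        self._entries = []

    def add(self, inputs, output, weight=0):
        # One weight per entry: every input of the entry shares it.
        for text in inputs:
            bisect.insort(self._entries, (text.lower(), weight, output))

    def suggest(self, prefix, size=5):
        prefix = prefix.lower()
        # Jump to the first stored input that is >= the prefix.
        start = bisect.bisect_left(self._entries, (prefix,))
        matches = []
        for text, weight, output in self._entries[start:]:
            if not text.startswith(prefix):
                break  # sorted order: no later input can match
            matches.append((weight, output))
        # Highest entry weight first; the sort is stable for ties.
        matches.sort(key=lambda match: -match[0])
        results = []
        for _, output in matches:
            if output not in results:
                results.append(output)
        return results[:size]


# Sample entry taken from the question; the weight of 10 is made up.
dictionary = PrefixDictionary()
dictionary.add(["Classic Coke", "Coca-Cola"], "Classic Coke - Coca-Cola", weight=10)
print(dictionary.suggest("coca-co"))
print(dictionary.suggest("classic"))
```

Both lookups return the same output with the same weight, which is why per-field weighting cannot be expressed inside a single suggester field.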

You could give Algolia a try: the engine is designed to answer prefix searches in real time and supports different "weights" per attribute. Here is a tutorial on implementing an auto-complete menu targeting several fields.

redox answered Oct 27 '22 21:10


As redox said, the completion suggester is really simple and doesn't support boosting individual fields within an entry. My solution would be to create two suggester fields, one for the brand and one for the product name:

POST /product_index
{
  "mappings": {
    "products": {
      "properties": {
        "brand": {
          "type": "string",
          "analyzer": "english"
        },
        "product_name": {
          "type": "string",
          "analyzer": "english"
        },
        "id": {
          "type": "long"
        },
        "lookup_count": {
          "type": "long"
        },
        "product-suggest": {
          "type": "completion",
          "analyzer": "simple",
          "payloads": true,
          "preserve_separators": true,
          "preserve_position_increments": true,
          "max_input_length": 50
        },
        "brand-suggest": {
          "type": "completion",
          "analyzer": "simple",
          "payloads": true,
          "preserve_separators": true,
          "preserve_position_increments": true,
          "max_input_length": 50
        },
        "upc": {
          "type": "string"
        }
      }
    }
  }
}

When indexing, fill both fields:

POST /product_index/products/2
{
  "id": 2,
  "brand": "Coca-Cola",
  "product_name": "Classic Coke",
  "brand-suggest": {
    "input": [
      "Coca-Cola"
    ],
    "output": "Classic Coke - Coca-Cola",
    "payload": {
      "id": 2,
      "product_name": "Classic Coke",
      "brand": "Coca-Cola",
      "popularity": 10
    }
  },
  "product-suggest": {
    "input": [
      "Classic Coke"
    ],
    "output": "Classic Coke - Coca-Cola",
    "payload": {
      "id": 2,
      "product_name": "Classic Coke",
      "brand": "Coca-Cola",
      "popularity": 10
    }
  }
}

When querying, send one suggest request that targets both the brand and the product suggesters:

POST /product_index/_search
{
    "suggest": {
      "product_suggestion": {
        "text": "coca-co",
        "completion": {
          "field": "product-suggest"
        }
      },
      "brand_suggestion": {
        "text": "coca-co",
        "completion": {
          "field": "brand-suggest"
        }
      }
    }
}

After removing the duplicates, you can append the brand_suggestion results to the product_suggestion results. This gives you a list with only relevant suggestions, no duplicates, and the product suggestions first.
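The merge step can be sketched client-side in Python. This is a minimal sketch, assuming the response has a top-level "suggest" key that maps each suggester name to a list of entries, each with an "options" list (check the shape your Elasticsearch version actually returns); the function name merge_suggestions is hypothetical. Duplicates are detected by the payload id when there is one, and by the option text otherwise.

```python
def merge_suggestions(response,
                      first="product_suggestion",
                      second="brand_suggestion"):
    """Return the options of `first`, then those of `second`, without duplicates."""
    suggest = response.get("suggest", {})  # assumed response shape
    merged = []
    seen = set()
    for name in (first, second):
        for entry in suggest.get(name, []):
            for option in entry.get("options", []):
                payload = option.get("payload") or {}
                # Prefer the document id from the payload as the identity.
                if "id" in payload:
                    key = ("id", payload["id"])
                else:
                    key = ("text", option.get("text"))
                if key in seen:
                    continue
                seen.add(key)
                merged.append(option)
    return merged


# Hand-written sample response, not real Elasticsearch output.
sample = {
    "suggest": {
        "product_suggestion": [{"text": "coca-co", "options": []}],
        "brand_suggestion": [{
            "text": "coca-co",
            "options": [{
                "text": "Classic Coke - Coca-Cola",
                "payload": {"id": 2},
            }],
        }],
    }
}
for option in merge_suggestions(sample):
    print(option["text"])
```

Because the product suggester is processed first, a product that matches on both its name and its brand keeps its product-name position in the merged list.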

Another solution would be to use a regular query with boosting on brand and product name, instead of using suggesters. This approach is slower, however, since it doesn't use suggesters.
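As a rough illustration of that alternative, the Python sketch below builds a search body that uses a multi_match query of type phrase_prefix with per-field boosts. The helper name build_boosted_query and the boost values are assumptions chosen for the example, and since both fields use the english analyzer, stemming may make prefix matching behave differently from the completion suggester.

```python
import json


def build_boosted_query(text, product_boost=3.0, brand_boost=1.0, size=5):
    """Build a search body that favours product_name matches over brand matches."""
    return {
        "size": size,
        "query": {
            "multi_match": {
                "query": text,
                # Treat the last term as a prefix so partial words still match.
                "type": "phrase_prefix",
                # "field^boost" multiplies the score contribution of that field.
                "fields": [
                    "product_name^{}".format(product_boost),
                    "brand^{}".format(brand_boost),
                ],
            }
        },
    }


# The printed JSON is the body to send with POST /product_index/_search.
print(json.dumps(build_boosted_query("coca-co"), indent=2))
```

The boost ratio is something to tune against real data; the values here only show where the weighting goes.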

Heschoon answered Oct 27 '22 21:10