Boosting field prefix match in Elasticsearch

Is there a way to boost the scores of prefix field matches over term matches later in the field? Most of the Elasticsearch/Lucene documentation seems to focus on terms rather than fields.

For example, when searching for femal*, I'd like Female to rank higher than Microscopic examination of specimen from female. Is there a way to do this on the query side, or would I need to do something like create a separate field consisting of the first word?

Tom Morris asked Dec 23 '13 17:12


1 Answer

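You can do this on the query side: use a bool query whose should clauses include a span_first query, which in turn wraps a span_multi query around the prefix query. Documents where the prefix matches at the start of the field then score higher than documents where it only matches later.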

Here is a runnable example you can play with: https://www.found.no/play/gist/8107157

#!/bin/bash

export ELASTICSEARCH_ENDPOINT="http://localhost:9200"

# Index documents
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_bulk?refresh=true" -d '
{"index":{"_index":"play","_type":"type"}}
{"title":"Female"}
{"index":{"_index":"play","_type":"type"}}
{"title":"Female specimen"}
{"index":{"_index":"play","_type":"type"}}
{"title":"Microscopic examination of specimen from female"}
'

# Do searches

# This will match all documents.
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_search?pretty" -d '
{
    "query": {
        "prefix": {
            "title": {
                "prefix": "femal"
            }
        }
    }
}
'

# This matches only the first two documents, where the prefix occurs in the first position of the field.
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_search?pretty" -d '
{
    "query": {
        "span_first": {
            "end": 1,
            "match": {
                "span_multi": {
                    "match": {
                        "prefix": {
                            "title": {
                                "prefix": "femal"
                            }
                        }
                    }
                }
            }
        }
    }
}
'

# Either clause is enough to match, so this returns all three documents.
# Documents that also match span_first (prefix at the start of the field) score higher.
# The second clause is a prefix query: with the default analyzer, a match query
# for "femal" would not find the indexed term "female".
curl -XPOST "$ELASTICSEARCH_ENDPOINT/_search?pretty" -d '
{
    "query": {
        "bool": {
            "should": [
                {
                    "span_first": {
                        "end": 1,
                        "match": {
                            "span_multi": {
                                "match": {
                                    "prefix": {
                                        "title": {
                                            "prefix": "femal"
                                        }
                                    }
                                }
                            }
                        }
                    }
                },
                {
                    "prefix": {
                        "title": {
                            "prefix": "femal"
                        }
                    }
                }
            ]
        }
    }
}
'
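
If the default ranking is not decisive enough, the first-position clause can be given an explicit boost. The following is a minimal sketch, not part of the original answer: the helper name build_first_prefix_query and its default boost of 2.0 are made up for illustration, and it assumes span_first accepts a top-level boost parameter, which has not been verified against the version used here. It mirrors the request syntax of the script above and only prints the JSON body, so nothing is sent until you pipe it to curl.

```shell
#!/bin/bash

# Hypothetical helper (not from the original answer): prints a bool query whose
# span_first clause carries an explicit boost.
# Arguments: field name, prefix, boost for the first-position clause (default 2.0).
# Values are interpolated verbatim, so they must not contain quotes or backslashes.
build_first_prefix_query() {
    local field="$1" prefix="$2" boost="${3:-2.0}"
    cat <<EOF
{
    "query": {
        "bool": {
            "should": [
                {
                    "span_first": {
                        "end": 1,
                        "boost": $boost,
                        "match": {
                            "span_multi": {
                                "match": {
                                    "prefix": {
                                        "$field": {
                                            "prefix": "$prefix"
                                        }
                                    }
                                }
                            }
                        }
                    }
                },
                {
                    "prefix": {
                        "$field": {
                            "prefix": "$prefix"
                        }
                    }
                }
            ]
        }
    }
}
EOF
}

# Usage (assumes the same endpoint variable as the script above):
# build_first_prefix_query title femal 5 \
#     | curl -XPOST "$ELASTICSEARCH_ENDPOINT/_search?pretty" -d @-
```

Raising the third argument widens the score gap between titles that start with the prefix and titles that only contain it later; the second should clause keeps the later matches in the result set.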
Alex Brasetvik answered Oct 16 '22 13:10