
Elasticsearch wildcard search and relevance

I am trying to implement wildcard search for a suggestion dropdown. I have been trying to figure this out for a few days already. :(

I have a list of restaurants (4000-7000). I want to run a wildcard search on the restaurant names and display first the results where the search term is at the start of the name.

I tried indexing the name field without an analyzer, with an ngram analyzer, and with many other solutions I found on the net, but without luck.

The best results so far I get with this setup:

settings:
  analysis: {
    analyzer: {
      default: {
        tokenizer: :keyword, 
        filter: [:lowercase]
      }
    }
  }

And the name field is indexed like this:

indexes :name, type: :string, analyzer: :default

Search: query: {wildcard: {name: '*le*'}}
Result: Mr. Beef on Orleans, Miller's Pub, Merlo on Maple, Le Bouchon, Les Nomades, Leonardo's Ristorante, Lem's Bar-B-Q House, Le Petit Paris, Joy Yee's Noodles - Chinatown, J. Alexander's (Lincoln Park), Indian Garden - Streeterville, Goose Island Brewpub - Wrigleyville, Tweet ... Let's Eat!, Arco de Cuchilleros, Al's #1 Italian Beef - Little Italy

I want the results that start with 'le' to come first, with a higher score, because people usually search for a restaurant by the start of its name. But I cannot search without the * in front, because I also want the results that merely contain the term, just with a lower score. In the example above, 'Le Colonial', 'Le Petit Paris' and 'Les Nomades' should be at the top.

How can I accomplish this?

My other concern is performance. I know that a wildcard on both ends is the worst case possible, but I could not find any solution with ngram or shingle that gives me acceptable results.

silviu.rosu asked Apr 21 '14 10:04


1 Answer

Use boost to rank the names that start with the search term on top.

Using two wildcard queries:

curl -XPOST "http://hostname:9200/index/type/_search" -d'
{
"size": 2000,
"query": {
    "bool": {
        "should": [
            {
                "wildcard": {
                    "name": {
                        "value": "*le*"
                    }
                }
            },
            {
                "wildcard": {
                    "name": {
                        "value": "le*",
                        "boost": 5
                    }
                }
            }
        ]
    }
}
}'
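
Why this puts the "le..." names first: a name that starts with "le" matches both should clauses and the bool query adds their scores, while a name that only contains "le" matches just the first clause. Below is a rough local simulation in Ruby, not a call to Elasticsearch. It assumes each matching clause contributes a constant score equal to its boost; real scores are normalised differently, so only the ordering is meant to carry over. The helper names (analyze, wildcard_to_regex, score) are made up for the illustration.

```ruby
# Local simulation only; nothing here talks to Elasticsearch.
# Assumption: each matching clause contributes a constant score equal to its boost.

# Mimics the keyword tokenizer + lowercase filter from the question:
# the whole name becomes one lowercase term.
def analyze(name)
  name.downcase
end

# Converts a wildcard pattern (* = any run of characters, ? = exactly one
# character) into a regex anchored at both ends of the term.
def wildcard_to_regex(pattern)
  body = pattern.chars.map do |ch|
    case ch
    when '*' then '.*'
    when '?' then '.'
    else Regexp.escape(ch)
    end
  end.join
  Regexp.new("\\A#{body}\\z", Regexp::MULTILINE)
end

# Sums the boosts of every clause whose pattern matches the analyzed name.
def score(name, clauses)
  term = analyze(name)
  clauses.sum { |c| wildcard_to_regex(c[:pattern]).match?(term) ? c[:boost] : 0 }
end

CLAUSES = [
  { pattern: '*le*', boost: 1 }, # contains "le" anywhere
  { pattern: 'le*',  boost: 5 }  # starts with "le"
].freeze

names = ["Miller's Pub", 'Le Bouchon', 'Merlo on Maple', 'Les Nomades']

# Keep only matching names, highest score first, ties broken alphabetically.
ranked = names
  .map { |n| [n, score(n, CLAUSES)] }
  .reject { |_, s| s.zero? }
  .sort_by { |n, s| [-s, n] }

ranked.each { |n, s| puts "#{s}\t#{n}" }
```

In this simulation 'Le Bouchon' and 'Les Nomades' get 6 (1 + 5) and are listed before 'Merlo on Maple' and "Miller's Pub", which get 1.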

Using one wildcard query and one prefix query:

curl -XPOST "http://hostname:9200/index/type/_search" -d'
{
"size": 2000,
"query": {
    "bool": {
        "should": [
            {
                "wildcard": {
                    "name": {
                        "value": "*le*"
                    }
                }
            },
            {
                "prefix": {
                    "name": {
                        "value": "le",
                        "boost": 2
                    }
                }
            }
        ]
    }
}
}'
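
One thing to watch when the pattern comes from user input: wildcard and prefix are term-level queries, and as far as I know the search text is not run through the field's analyzer for them, so with the lowercase filter in the index the client has to lowercase the input itself. A literal * or ? typed by the user should also be escaped in the wildcard clause. Here is a minimal Ruby sketch that builds the request body for the query above. suggestion_query and escape_wildcards are hypothetical helpers, not part of any Elasticsearch client, and the default size of 20 is my assumption for a dropdown.

```ruby
require 'json'

# Hypothetical helper: backslash-escapes the wildcard metacharacters
# (\, * and ?) so they are matched literally.
def escape_wildcards(text)
  text.gsub(/[\\*?]/) { |ch| "\\#{ch}" }
end

# Hypothetical helper: builds the bool/should body with one wildcard clause
# (contains the term) and one boosted prefix clause (starts with the term).
# prefix_boost and size are illustrative defaults, not recommended values.
def suggestion_query(input, prefix_boost: 2, size: 20)
  # Lowercase on the client, because the indexed terms are lowercase.
  term = input.strip.downcase
  {
    size: size,
    query: {
      bool: {
        should: [
          { wildcard: { name: { value: "*#{escape_wildcards(term)}*" } } },
          { prefix:   { name: { value: term, boost: prefix_boost } } }
        ]
      }
    }
  }
end

# Prints the JSON body that would be sent to the _search endpoint.
puts JSON.pretty_generate(suggestion_query('Le'))
```

The printed JSON can be passed to curl with -d exactly like the hand-written body above.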

BlackPOP answered Oct 05 '22 01:10