
ElasticSearch - cross_fields multi match with fuzzy search

I have documents that represent users. They have fields name and surname.

Let's say I have two users indexed - Michael Jackson and Michael Starr. I want these sample searches to work:

  1. Michael => { Michael Jackson, Michael Starr }
  2. Jack Mich => { Michael Jackson } (incomplete words and reversed order)
  3. Michal Star => { Michael Starr } (fuzzy search)

I tried different queries and got the best results from a multi_match query with the cross_fields type. There are two problems, though:

  1. It only finds something when at least one of the two words is complete. If I type Jackson Mich, it finds Michael Jackson, but if I type Jack Mich, it doesn't find anything (and I want it to).
  2. It cannot be set to fuzzy search. I really need fuzzy search while keeping the quality of multi_match with the cross_fields type.

In other words, I want to implement Facebook-like people searching.

I'm pretty new to ElasticSearch, so maybe I'm missing something obvious. Sorry if I am.

Michal Artazov asked May 28 '14 15:05


1 Answer

For Jack Mich type of searches:

  • Make sure the query combines the words with OR and not AND, e.g. Jack OR Mich.
  • Essentially, you want partial matching on these fields. For this you need to enable nGrams on the fields (do this in the mapping) so that the index contains terms for partial words.
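
To see why these two points help, here is a rough sketch in plain Python (no Elasticsearch involved). It assumes edge n-grams, i.e. prefixes of each word, with a made-up minimum length of 2, and it ranks by the number of matched query words, which is only a stand-in for real Elasticsearch scoring. The function names are hypothetical.

```python
def edge_ngrams(word, min_gram=2, max_gram=10):
    """Return the lowercase prefixes of `word` from min_gram to max_gram characters."""
    word = word.lower()
    return {word[:n] for n in range(min_gram, min(len(word), max_gram) + 1)}


def index_terms(doc):
    """Collect the edge n-grams of every word in every field of the document."""
    terms = set()
    for value in doc.values():
        for word in value.split():
            terms |= edge_ngrams(word)
    return terms


def search(query, docs):
    """OR semantics: keep documents matching at least one query word, best first."""
    words = [w.lower() for w in query.split()]
    scored = []
    for doc in docs:
        terms = index_terms(doc)
        hits = sum(1 for w in words if w in terms)
        if hits > 0:  # OR: a single matching word is enough
            scored.append((hits, doc))
    # Simplified ranking: more matched words first (not Elasticsearch's real scoring)
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [f"{d['name']} {d['surname']}" for _, d in scored]


users = [
    {"name": "Michael", "surname": "Jackson"},
    {"name": "Michael", "surname": "Starr"},
]

print(search("Jack Mich", users))
print(search("Michal Star", users))
```

Note that with OR, Jack Mich also returns Michael Starr (via Mich), just ranked below Michael Jackson, and Michal Star still finds Michael Starr through the partial word Star even though Michal matches nothing.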

You are using the correct query type. These two changes should solve your problems.
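
As a starting point, the request bodies could look something like the sketch below. It assumes a recent Elasticsearch version (7.x or later, with typeless mappings; older versions use a different mapping layout), and the analyzer and filter names (name_index, name_search, name_edge_ngram) and the gram sizes are made up for illustration. The code only builds and prints the JSON, it does not talk to a cluster.

```python
import json


def build_index_body(fields=("name", "surname"), min_gram=2, max_gram=10):
    """Index settings and mapping that store edge n-grams for the given text fields."""
    analysis = {
        "filter": {
            # Hypothetical filter name; emits prefixes of each token
            "name_edge_ngram": {
                "type": "edge_ngram",
                "min_gram": min_gram,
                "max_gram": max_gram,
            }
        },
        "analyzer": {
            # Index time: lowercase, then split every word into prefixes
            "name_index": {
                "type": "custom",
                "tokenizer": "standard",
                "filter": ["lowercase", "name_edge_ngram"],
            },
            # Search time: lowercase only, so "Jack" is looked up as "jack"
            "name_search": {
                "type": "custom",
                "tokenizer": "standard",
                "filter": ["lowercase"],
            },
        },
    }
    properties = {
        field: {
            "type": "text",
            "analyzer": "name_index",
            "search_analyzer": "name_search",
        }
        for field in fields
    }
    return {"settings": {"analysis": analysis}, "mappings": {"properties": properties}}


def build_query(text, fields=("name", "surname")):
    """multi_match query of type cross_fields that ORs the query words."""
    return {
        "query": {
            "multi_match": {
                "query": text,
                "type": "cross_fields",
                "fields": list(fields),
                "operator": "or",
            }
        }
    }


index_body = build_index_body()
query_body = build_query("Jack Mich")
print(json.dumps(index_body, indent=2))
print(json.dumps(query_body, indent=2))
```

The first body would be sent when creating the index and the second to the index's _search endpoint. This only covers the partial-word case; it does not add fuzziness, which the cross_fields type does not accept.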

PS: We are all learning here; doing that together is fun :)

SarZ answered Sep 23 '22 07:09