
Elasticsearch Suggest+Synonyms+fuzziness

I am looking for a way to implement auto-suggest with synonyms and fuzziness.

For example, suppose the user searches for "replce ar" and my synonym list contains ar => audio record.

The results should then include items such as "changing audio record", "replacing audio record", etc.

Here we need:

1. Fuzziness, because there is a typo in "replace" (in the user's search text)
2. Synonyms, to match ar => audio record
3. Auto-suggest with a regex pattern

Is it possible to implement all three features in a single field?

Edit: combining regex and fuzzy just throws an error. I didn't explain my need for a regex pattern well: I needed a regex to do a partial-word lookup ('encyclopedic' contains 'cyclo').

After investigating the options I have for this purpose, which pointed me to the NGram Tokenizer, and looking into the other suggesters, I found that the Phrase suggester may really be what I'm looking for, so I'll try it and report back.

shmuel friedman asked Nov 07 '22 20:11


1 Answer

Yes, you can use synonyms as well as fuzziness for suggestions. Synonyms are handled at analysis time: define a synonym token filter, add that filter to a custom analyzer, and then assign that analyzer to the field(s) you want to use for suggestions when you create the field mapping.

As for fuzziness, that happens at query time. Most text-based queries support a fuzziness option which lets you specify how many single-character edits to allow per term. The AUTO value scales the number of allowed edits with the length of the term, so that's usually the best choice.
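To make the length-based behaviour concrete, here is a minimal Python sketch; it is not Elasticsearch code. It assumes the documented AUTO thresholds (terms of 1 to 2 characters must match exactly, terms of 3 to 5 characters allow one edit, longer terms allow two) and uses a plain Levenshtein distance, whereas Elasticsearch by default also counts a swap of two adjacent characters as a single edit. The function names are made up for illustration.

```python
def auto_fuzziness(term: str) -> int:
    """Maximum number of edits allowed for a term under the AUTO rule."""
    n = len(term)
    if n <= 2:
        return 0  # very short terms must match exactly
    if n <= 5:
        return 1
    return 2


def edit_distance(a: str, b: str) -> int:
    """Plain Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def fuzzy_match(query_term: str, indexed_term: str) -> bool:
    """True if indexed_term is within the AUTO edit budget of query_term."""
    return edit_distance(query_term, indexed_term) <= auto_fuzziness(query_term)
```

Under these assumptions, "replce" matches "replace" (one insertion, and a six-character term allows two edits), while a two-character term such as "ar" only matches exactly.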

Notional analysis setup (synonym_graph reference)

{
  "analysis": {
    "filter": {
      "synonyms": {
        "type": "synonym_graph",
        "expand": false,
        "synonyms": [
          "ar => audio record"
        ]
      }
    },
    "analyzer": {
      "synonyms": {
        "tokenizer": "standard",
        "type": "custom",
        "filter": [
          "lowercase",
          "synonyms"
        ]
      }
    }
  }
}
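
As a rough illustration of what the analyzer above does to the search text, the following Python sketch mimics the chain with a whitespace split, a lowercase step, and an explicit synonym replacement. It is a toy stand-in, not how Elasticsearch implements analysis; the `SYNONYMS` table and the `lowercase_first` flag are assumptions added for the example.

```python
SYNONYMS = {"ar": ["audio", "record"]}  # mirrors the rule "ar => audio record"


def analyze(text, lowercase_first=True):
    """Toy analyzer: split, lowercase, and replace synonyms.

    lowercase_first=False runs the synonym step before lowercasing,
    to show why the order of filters matters.
    """
    tokens = text.split()  # crude substitute for the standard tokenizer
    if lowercase_first:
        tokens = [t.lower() for t in tokens]
    out = []
    for t in tokens:
        # An explicit "=>" mapping replaces the token with its expansion.
        out.extend(SYNONYMS.get(t, [t]))
    if not lowercase_first:
        out = [t.lower() for t in out]
    return out
```

With the lowercase step first, "Replce AR" becomes the tokens replce, audio, record. With the order reversed, the uppercase "AR" never matches the synonym rule and the result is just replce, ar.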

Notional Field Mapping (Analyzer + Mapping reference)

(Note that the analyzer value matches the name of the analyzer defined above.)

{
  "properties": {
    "suggestion": {
      "type": "text",
      "analyzer": "synonyms"
    }
  }
}

Notional Query

{
  "query": {
    "match": {
      "suggestion": {
        "query": "replce ar",
        "fuzziness": "auto",
        "operator": "and"
      }
    }
  }
}
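
The fragments above are shown separately. As a sketch of how they might be assembled in a client, the Python below builds the body for index creation (analysis under `settings`, the field mapping under `mappings`, as in Elasticsearch 7 and later) and the search body. Nothing is sent to a server; the helper names are hypothetical, and the `standard` token filter is left out on the assumption of a 7.x or later cluster, where that filter no longer exists.

```python
import json


def build_index_body() -> dict:
    """Body for creating the index: analysis settings plus field mapping."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "synonyms": {
                        "type": "synonym_graph",
                        "synonyms": ["ar => audio record"],
                    }
                },
                "analyzer": {
                    "synonyms": {
                        "type": "custom",
                        "tokenizer": "standard",
                        # Lowercase before synonyms so "AR" matches the rule.
                        "filter": ["lowercase", "synonyms"],
                    }
                },
            }
        },
        "mappings": {
            "properties": {
                "suggestion": {"type": "text", "analyzer": "synonyms"}
            }
        },
    }


def build_search_body(text: str) -> dict:
    """Match query with AUTO fuzziness; every term must match."""
    return {
        "query": {
            "match": {
                "suggestion": {
                    "query": text,
                    "fuzziness": "AUTO",
                    "operator": "and",
                }
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_search_body("replce ar"), indent=2))
```

Keeping the bodies as plain dictionaries makes it easy to serialize them with `json.dumps` and pass them to whichever HTTP or Elasticsearch client you already use.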

Keep in mind that there are several different options for suggestions, so depending on which option you use, you may need to adjust the way the field is mapped, or even add another token filter to the analyzer. An analyzer is just a tokenizer followed by a chain of token filters (optionally preceded by character filters), so you can usually combine whatever token filters you need to achieve your goal. Just make sure you understand what each filter does, because the filters run in the order you list them.
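
For the partial-word lookup mentioned in the question ('encyclopedic' contains 'cyclo'), one filter you might add is an n-gram filter, which Elasticsearch configures with `min_gram` and `max_gram`. The Python sketch below only illustrates the idea of splitting a token into n-grams; it is an assumption-laden toy, not Elasticsearch's implementation, and it makes no claim about the order in which Elasticsearch emits the grams.

```python
def ngrams(token, min_gram, max_gram):
    """Return every substring of token with length in [min_gram, max_gram]."""
    grams = []
    for size in range(min_gram, max_gram + 1):
        for start in range(len(token) - size + 1):
            grams.append(token[start:start + size])
    return grams


def contains_partial(token, fragment, min_gram=3, max_gram=5):
    """True if fragment is one of the n-grams generated for token."""
    return fragment in ngrams(token, min_gram, max_gram)
```

Because 'cyclo' is one of the 5-grams of 'encyclopedic', a search for 'cyclo' can match a field indexed this way without any regex. The trade-off is a larger index, since every token is stored as many overlapping fragments.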

If you get stuck in part of this process, just submit another question with the specific issue you're running into. Good luck!

dmbaughman answered Nov 15 '22 08:11