 

Searchkick - trailing special characters

I'm using Searchkick in a Rails 5 app.

In my search_data for the model Part I have string fields that contain dots (.) and hyphens (-). I would like to search those fields literally, with dots and hyphens in the query string. I am using a word_start match.

When my query string looks like this: 66.6 it works OK (it finds all records with queried field starting with 66.6).

However, if a dot (or other special character) is trailing (i.e. 66. or 66- or even 66.---.-.---), it behaves as if the query string were just 66. It seems like anything after the "normal" characters (letters and digits) is being trimmed.

My search looks like this:

Part.search "66.", fields: [:catalogue_number], misspellings: false, match: :word_start

What is the possible solution to this?

EDIT:

Ok, I broke it down and it seems that dots and hyphens are two separate problems.

  1. Dots in the query string seem to behave as described above - if the dot is followed by any "normal" character, search works as expected. However, trailing dots seem to be ignored.
  2. Hyphens in the middle of the query string behave like whitespace - they split the query string into separate terms (which are then combined with the and operator). Trailing hyphens seem to be ignored (like dots).

What I need is for both dots and hyphens to behave literally wherever they are in the query string.

glizda101 asked Jan 17 '17

1 Answer

Searchkick's word_start analyzer uses this Elasticsearch configuration (source here):

searchkick_word_start_index: {
    type: "custom",
    tokenizer: "standard",
    filter: ["lowercase", "asciifolding", "searchkick_edge_ngram"]
}

It uses the standard tokenizer, which splits strings on hyphens and dots (the standard tokenizer has other rules, but they are not relevant to your case) (doc here).
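The splitting behavior described in the question can be approximated in plain Ruby. This is a rough simulation (an assumption for illustration, not the actual Lucene tokenizer code) of how the question's query strings come out of the analysis step: hyphens split tokens, and trailing punctuation is dropped.

```ruby
# Rough approximation of the tokenization behavior observed in the question
# (hypothetical helper, not part of Searchkick or Elasticsearch):
# keep alphanumeric runs, allow inner dots between them, discard the rest.
def approx_standard_tokenize(text)
  text.downcase.scan(/[[:alnum:]]+(?:\.[[:alnum:]]+)*/)
end

approx_standard_tokenize("66.6")         # => ["66.6"]  inner dot survives
approx_standard_tokenize("66.")          # => ["66"]    trailing dot dropped
approx_standard_tokenize("66-6")         # => ["66", "6"]  hyphen splits
approx_standard_tokenize("66.---.-.---") # => ["66"]    trailing run dropped
```

This matches the symptoms in the question: each produced token is indexed (and queried) separately, so a trailing "." or "-" simply vanishes before matching happens.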

You should try Searchkick's text_start match, which uses this configuration:

searchkick_text_start_index: {
    type: "custom",
    tokenizer: "keyword",
    filter: ["lowercase", "asciifolding", "searchkick_edge_ngram"]
}

The Elasticsearch keyword tokenizer preserves the "." and "-" characters and should work for your use case.
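Putting that together, a minimal sketch (assuming a single catalogue_number field; your real search_data likely has more) would index the field for text_start matching and switch the match type in the query:

```ruby
# Sketch only - assumes the Part model from the question.
class Part < ApplicationRecord
  # Index catalogue_number with the text_start analyzer
  # (keyword tokenizer + edge n-grams), so "." and "-" survive.
  searchkick text_start: [:catalogue_number]

  def search_data
    { catalogue_number: catalogue_number }
  end
end

# Query with the matching match type; "66." should now match literally.
Part.search "66.",
  fields: [:catalogue_number],
  misspellings: false,
  match: :text_start
```

Note that after changing the indexing options you need to reindex (Part.reindex) for the new analyzer to take effect.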

NB: I think the fact that 66.6 matches is a fluke, since the standard analyzer also strips the ".".

Pierre Mallet answered Oct 17 '22