We have an analyzer that includes a synonym filter, defined as follows:
synonym_filter:
  type: synonym
  synonyms_path: synonyms.txt
  ignore_case: true
  expand: true
  format: solr
In the synonym file we have a synonym defined as follows:
dawdle,waste time
Then in our data we have an entity whose name field is "dawdle company".
Because of the synonym filter, this gets analyzed to something like:
1 -dawdle- 2 -company- 3
1 -waste- 2 -time- 3
with "time" and "company" in the same position. When we then search for "waste time", we get a hit on this entity. We would like the highlight to be just "dawdle", since that is the equivalent synonym, but Elasticsearch seems to treat this as two hits, because it matched both "waste" and "time", and it returns two highlights: "dawdle" and "company".
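To make the overlap concrete, here is a toy Python model of the behavior described above. It is only a sketch of the position stacking, not how Elasticsearch or Lucene implement synonyms or highlighting. The names `SYNONYMS`, `index_positions` and `highlighted_words` are made up for illustration, and only the "dawdle" to "waste time" direction of the rule is modeled.

```python
from collections import defaultdict

# Hypothetical rule table mirroring "dawdle,waste time" with expand = true
# (only the dawdle -> waste time direction is modeled here).
SYNONYMS = {"dawdle": ["waste", "time"]}


def index_positions(text):
    """Return {position: set of terms} for the flattened token stream."""
    positions = defaultdict(set)
    for pos, word in enumerate(text.lower().split()):
        positions[pos].add(word)
        # The words of a multi-word synonym are stacked on consecutive
        # positions, starting at the position of the original word.
        for offset, syn_word in enumerate(SYNONYMS.get(word, [])):
            positions[pos + offset].add(syn_word)
    return positions


def highlighted_words(text, query):
    """Original words whose position holds at least one query term."""
    words = text.split()
    positions = index_positions(text)
    query_terms = set(query.lower().split())
    return [
        words[pos]
        for pos in sorted(positions)
        if pos < len(words) and positions[pos] & query_terms
    ]


print(highlighted_words("dawdle company", "waste time"))
```

In this model "time" lands on position 1 next to "company", so a query for "waste time" marks both original words, which is the same symptom as in the question.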
Is there a recommended way to solve this kind of issue, where an unexpected word is returned in the highlights because it occupies the same position as a search term that was inserted by a synonym?
@SergeyS the situation that both you and @user2430530 have is described exactly in this section of the documentation.
The suggestion there is to define a single term for each series of synonyms, so that you don't get that mix-up of highlighted terms in the result.
Something like this:
"analysis": {
  "analyzer": {
    "synonym": {
      "tokenizer": "whitespace",
      "filter": [
        "synonym"
      ]
    }
  },
  "filter": {
    "synonym": {
      "type": "synonym",
      "synonyms": [
        "dawdle, waste time=>waste_time"
      ]
    }
  }
}
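To try these settings, the "analysis" block has to sit inside an index definition and the field has to use the analyzer. The following is a minimal sketch of the two request bodies, written as Python dicts so they can be dumped to JSON. The field name "text" is taken from the highlight output in this answer. The mapping syntax assumes a version with typeless mappings (Elasticsearch 7 or later), which is an assumption, and how you send the bodies (client library, curl, Kibana console) is left open.

```python
import json

# Body for index creation: the "analysis" block from above wrapped in
# "settings", plus a mapping that assigns the analyzer to a "text" field.
index_body = {
    "settings": {
        "analysis": {
            "analyzer": {
                "synonym": {"tokenizer": "whitespace", "filter": ["synonym"]}
            },
            "filter": {
                "synonym": {
                    "type": "synonym",
                    "synonyms": ["dawdle, waste time=>waste_time"],
                }
            },
        }
    },
    # Assumption: typeless mapping syntax (Elasticsearch 7+).
    "mappings": {"properties": {"text": {"type": "text", "analyzer": "synonym"}}},
}

# Body for the search: a match query on the synonym side of the rule,
# with highlighting requested for the same field.
search_body = {
    "query": {"match": {"text": "waste time"}},
    "highlight": {"fields": {"text": {}}},
}

print(json.dumps(index_body, indent=2))
print(json.dumps(search_body, indent=2))
```

Index a document such as {"text": "some dawdle company"} between the two requests. Because the same analyzer runs at index and search time, both "dawdle" and "waste time" should end up as the single term waste_time.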
Then you'll get the desired result from ES:
"highlight": {
  "text": [
    "some <em>dawdle</em> company"
  ]
}
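Why the single term helps can be seen in a small Python toy model. Again, this is only a sketch of the idea and not Elasticsearch's actual analysis or highlighting code, and `CONTRACTIONS`, `analyze` and `highlight` are hypothetical names. Every variant in the rule collapses into one token that covers exactly the words it replaced, so nothing spills onto the neighbouring position.

```python
# Hypothetical rule table mirroring "dawdle, waste time=>waste_time".
CONTRACTIONS = {("dawdle",): "waste_time", ("waste", "time"): "waste_time"}


def analyze(text):
    """Return (term, first_word_index, last_word_index) tuples."""
    words = text.lower().split()
    tokens, i = [], 0
    while i < len(words):
        for phrase, term in CONTRACTIONS.items():
            n = len(phrase)
            if tuple(words[i:i + n]) == phrase:
                # The whole phrase becomes one token spanning n words.
                tokens.append((term, i, i + n - 1))
                i += n
                break
        else:
            # No rule matched: the word is kept as its own token.
            tokens.append((words[i], i, i))
            i += 1
    return tokens


def highlight(text, query):
    """Wrap the original words of every token matching the query in <em>."""
    words = text.split()
    query_terms = {term for term, _, _ in analyze(query)}
    parts = []
    for term, start, end in analyze(text):
        span = " ".join(words[start:end + 1])
        parts.append(f"<em>{span}</em>" if term in query_terms else span)
    return " ".join(parts)


print(highlight("some dawdle company", "waste time"))
```

In this model the query "waste time" becomes the single term waste_time, which matches only the token produced from "dawdle", so "company" is left alone.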