I have simple documents with a field called "keywords", which is indexed for text search. The keywords are an array of words and short phrases, like this:
{"keywords": ["restaurant manager", "chef", "bus boy"]}
The query must contain all of the words in at least one item in a doc's keywords for that doc to be returned.
Examples:
"manager" should not return this doc.
"bus" and "manager" should not return this doc.
"restaurant manager" should return this doc.
"chef" should return this doc.
"restaurant manager chef" should return this doc and have a higher score.
"restaurant manager unrelated words" should return this doc.
"restaurant manager bus" should return this doc but, ideally, should not have a higher score than "restaurant manager".
The scoring is important, so I need to make it a query and not a filter.
I'm using Elasticsearch 1.7.
This can be achieved with the following setup.
POST your_index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "keyword_analyzer": {
          "type": "custom",
          "filter": [
            "lowercase"
          ],
          "tokenizer": "keyword"
        },
        "shingle_analyzer": {
          "type": "custom",
          "filter": ["lowercase", "shingle_filter"],
          "tokenizer": "standard"
        }
      },
      "filter": {
        "shingle_filter": {
          "type": "shingle",
          "min_shingle_size": 2,
          "max_shingle_size": 5
        }
      }
    }
  },
  "mappings": {
    "your_type": {
      "properties": {
        "keywords": {
          "type": "string",
          "index_analyzer": "keyword_analyzer",
          "search_analyzer": "shingle_analyzer"
        }
      }
    }
  }
}
Here I am using two different analyzers, one for indexing and one for searching, because the requirements call for it. keyword_analyzer indexes each keyword as a single lowercased term, so a query for manager does not return a document that only contains restaurant manager. At search time, shingle_analyzer uses a shingle filter to generate phrases of two to five words from the input text. Text like This restaurant manager is kind is split into this restaurant, restaurant manager, manager is and so on, and one of those phrases matches the indexed term exactly. The shingle filter also emits the single words by default (output_unigrams), which is how a one-word keyword like chef still matches. You can use the analyze API to see how each analyzer works.
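To check this yourself, the two requests below are a minimal sketch of running both analyzers through the analyze API. They assume the index was created as your_index with the settings above, and that your cluster accepts the 1.x query-string form of _analyze (the analyzer and text parameters); the sample texts are only illustrations.

```
GET your_index/_analyze?analyzer=keyword_analyzer&text=Restaurant+Manager

GET your_index/_analyze?analyzer=shingle_analyzer&text=This+restaurant+manager+is+kind
```

The first request should come back with a single token, restaurant manager. The second should list the lowercased single words together with shingles such as this restaurant and restaurant manager, which are the terms the match query compares against the indexed keywords.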
Index the document like this:
PUT your_index/your_type/1
{
  "keywords": ["restaurant manager", "chef", "bus boy"]
}
A match query like this one will then return the document:
GET your_index/_search
{
  "query": {
    "match": {
      "keywords": "This restaurant manager is also a good chef"
    }
  }
}
Hope this helps!