I rephrased my problem as a full curl recreation script, which should make it easier to reproduce (search fails with a custom analyzer). I am using the latest ES version for this.
curl -XDELETE "http://localhost:9200/test_shingling"
curl -XPOST "http://localhost:9200/test_shingling/" -d '{
"settings": {
"index": {
"number_of_shards": 10,
"number_of_replicas": 1
},
"analysis": {
"analyzer": {
"ShingleAnalyzer": {
"tokenizer": "BreadcrumbPatternAnalyzer",
"filter": [
"standard",
"lowercase",
"filter_stemmer",
"filter_shingle"
]
}
},
"filter": {
"filter_shingle": {
"type": "shingle",
"max_shingle_size": 2,
"min_shingle_size": 2,
"output_unigrams": false
},
"filter_stemmer": {
"type": "porter_stem",
"language": "English"
}
},
"tokenizer": {
"BreadcrumbPatternAnalyzer": {
"type": "pattern",
"pattern": " |\\$\\$\\$"
}
}
}
}
}'
curl -XPOST "http://localhost:9200/test_shingling/item/_mapping" -d '{
"item": {
"properties": {
"Title": {
"type": "string",
"search_analyzer": "ShingleAnalyzer",
"index_analyzer": "ShingleAnalyzer"
}
}
}
}'
curl -XPOST "http://localhost:9200/test_shingling/item/" -d '{
"Title":"Kyocera Solar Panel Test"
}'
curl 'localhost:9200/test_shingling/_analyze?pretty=1&analyzer=ShingleAnalyzer' -d 'Kyocera Solar Panel Test'
curl -XPOST "http://localhost:9200/test_shingling/_refresh"
curl -XPOST "http://localhost:9200/test_shingling/item/_search?pretty=true" -d '{
"query": {
"term": {
"Title": "Kyocera Solar Panel Test"
}
}
}'
curl -XPOST "http://localhost:9200/test_shingling/item/_search?pretty=true" -d '{
"query": {
"query_string": {
"default_field": "Title",
"query": "Kyocera Solar Panel Test"
}
}
}'
curl -XPOST "http://localhost:9200/test_shingling/item/_search?pretty=true" -d '{
"query": {
"query_string": {
"default_field": "Title",
"query": "solar panel"
}
}
}'
The term query searches for an exact match and does not apply ShingleAnalyzer to your query.
Use the match query instead; it applies the analyzer to your query string when searching.
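One way to see the difference is to run the query text through the analyzer, reusing the _analyze call from the question. This is only a sketch: it assumes the test_shingling index from the question exists on localhost:9200, and the TEXT and ANALYZE_URL variable names are illustrative.

```shell
# Text to inspect; swap in whatever string you intend to search for.
TEXT='Solar Panel Test'

# Same _analyze endpoint used in the question, pointed at ShingleAnalyzer.
ANALYZE_URL='http://localhost:9200/test_shingling/_analyze?pretty=1&analyzer=ShingleAnalyzer'

# Print the tokens the analyzer produces for TEXT.
curl -s "$ANALYZE_URL" -d "$TEXT" || echo "request failed: is Elasticsearch running on localhost:9200?"
```

The tokens in the response are what a match query looks up in the index, whereas a term query skips this step and looks up the raw string as a single term.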
Whole word search
curl -XPOST "http://localhost:9200/test_shingling/item/_search" -d'{
"query": {
"match": {
"Title": "Kyocera Solar Panel Test"
}
}
}'
Partial word search
curl -XPOST "http://localhost:9200/test_shingling/item/_search" -d'{
"query": {
"match": {
"Title": "Panel Test"
}
}
}'
Another partial word search
curl -XPOST "http://localhost:9200/test_shingling/item/_search" -d'{
"query": {
"match": {
"Title": "Solar Panel Test"
}
}
}'
Hope it helps..!
I think that the query_string search treats solar panel as solar OR panel by default, and that you have to set the operator explicitly in the query_string. This is what's written in the reference guide:
default_operator :
The default operator used if no explicit operator is specified. For example, with a default operator of OR, the query capital of Hungary is translated to capital OR of OR Hungary, and with default operator of AND, the same query is translated to capital AND of AND Hungary. The default value is OR.
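Applied to the last query in the question, that setting might look like the following. This is an untested sketch: default_operator is the query_string parameter quoted above, the QUERY variable name is only illustrative, and whether the request returns a hit with ShingleAnalyzer would still need to be checked against the index.

```shell
# Request body: the "solar panel" query_string search from the question,
# with the operator set explicitly so that every term is required.
QUERY='{
  "query": {
    "query_string": {
      "default_field": "Title",
      "query": "solar panel",
      "default_operator": "AND"
    }
  }
}'

# Send the search to the test index created in the question.
curl -s -XPOST "http://localhost:9200/test_shingling/item/_search?pretty=true" -d "$QUERY" \
  || echo "request failed: is Elasticsearch running on localhost:9200?"
```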