I need to use the prefix filter, but allow multiple different prefixes, e.g.:
{"prefix": {"myColumn": ["This", "orThis", "orEvenThis"]}}
This does not work. And if I add each one as a separate prefix clause, that obviously doesn't work either.
Help is appreciated.
Update
I tried should, but without any luck:
$this->dsl['body']['query']['bool']['should'] = [
["prefix" => ["myColumn" => "This"]],
["prefix" => ["myColumn" => "orThis"]]
];
When I add those two constraints, I get ALL documents back (as though the filter is not working). But if I use must with either one of those clauses, then I do get a response back with the correct prefix.
If the index_prefixes mapping parameter is enabled on a text field, Elasticsearch indexes prefixes between 2 and 5 characters (the defaults) in a separate field. This lets Elasticsearch run prefix queries more efficiently at the cost of a larger index. Prefix queries will not be executed if search.allow_expensive_queries is set to false, unless index_prefixes is enabled for the field.
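For reference, the setting described here is the index_prefixes mapping parameter. A minimal sketch of such a mapping, written as a Python dict that could be serialized and sent as the body of an index-creation request. It assumes myColumn (the field name from the question) is mapped as a text field, which the thread does not confirm; min_chars and max_chars are simply the documented defaults spelled out.

```python
import json

# Sketch of an index-creation body that enables index_prefixes.
# Assumption: "myColumn" is a text field; the thread does not say how it is mapped.
mapping_body = {
    "mappings": {
        "properties": {
            "myColumn": {
                "type": "text",
                # Documented defaults, written out explicitly.
                "index_prefixes": {"min_chars": 2, "max_chars": 5},
            }
        }
    }
}

print(json.dumps(mapping_body, indent=2))
```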
Elasticsearch won't analyze keyword fields, which means the string that you index stays exactly as it is. So what would the string look like in the inverted index? Exactly as you wrote it.
The Match Phrase Prefix Query is a full-text query. If you query a full-text (analyzed) field, Elasticsearch first passes the query string through the defined analyzer to produce the list of terms to be queried.
Based on your comments, it sounds like it may just be an issue with the syntax. With all ES queries (just like SQL ones), I suggest starting simple and submitting them to ES as raw DSL outside of code (although in your case this wasn't easily doable). The request itself is pretty straightforward:
{
  "query" : {
    "bool" : {
      "must" : [ ... ],
      "filter" : [
        {
          "bool" : {
            "should" : [
              {
                "prefix" : {
                  "myColumn" : "This"
                }
              },
              {
                "prefix" : {
                  "myColumn" : "orThis"
                }
              },
              {
                "prefix" : {
                  "myColumn" : "orEvenThis"
                }
              }
            ]
          }
        }
      ]
    }
  }
}
I added it as a filter because the optional nature of your prefixing is not improving relevancy: it's literally asking that one of them must match. In cases where the question is "does this match? yes / no", you should use a filter (with the added bonus that it's cacheable!). If you're asking "does this match, and which matches better?" then you want a query (because that's relevancy / scoring).
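If the list of prefixes is dynamic, the same request body can be generated rather than written by hand. The following is only a sketch: the helper names build_prefix_filter and build_query are made up for illustration, and the match_all clause is a placeholder standing in for whatever the real must clauses are.

```python
import json

def build_prefix_filter(field, prefixes):
    """Return a bool/should clause with one prefix query per value (logical OR)."""
    return {
        "bool": {
            "should": [{"prefix": {field: p}} for p in prefixes]
        }
    }

def build_query(must_clauses, field, prefixes):
    """Put the OR-of-prefixes into the filter context of an outer bool query."""
    return {
        "query": {
            "bool": {
                "must": must_clauses,
                "filter": [build_prefix_filter(field, prefixes)],
            }
        }
    }

body = build_query(
    must_clauses=[{"match_all": {}}],  # placeholder; substitute the real must clauses
    field="myColumn",
    prefixes=["This", "orThis", "orEvenThis"],
)
print(json.dumps(body, indent=2))
```

Because the inner bool contains only should clauses, at least one of the prefixes has to match for a document to pass the filter.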
Note: The initial issue appeared to be that the bool / must was unmentioned, and the suggestion was to just use a bool / should.
{
  "bool" : {
    "should" : [
      {
        "prefix" : {
          "myColumn" : "This"
        }
      },
      {
        "prefix" : {
          "myColumn" : "orThis"
        }
      },
      {
        "prefix" : {
          "myColumn" : "orEvenThis"
        }
      }
    ]
  }
}
behaves differently than
{
  "bool" : {
    "must" : [ ... ],
    "should" : [
      {
        "prefix" : {
          "myColumn" : "This"
        }
      },
      {
        "prefix" : {
          "myColumn" : "orThis"
        }
      },
      {
        "prefix" : {
          "myColumn" : "orEvenThis"
        }
      }
    ]
  }
}
because the must impacts the required nature of should. Without must (or filter), should behaves like a boolean OR: at least one clause has to match. With must, however, should becomes completely optional and only improves relevancy (score). To get the boolean OR behavior back alongside must, you must add minimum_should_match to the bool compound query.
{
  "bool" : {
    "must" : [ ... ],
    "should" : [
      {
        "prefix" : {
          "myColumn" : "This"
        }
      },
      {
        "prefix" : {
          "myColumn" : "orThis"
        }
      },
      {
        "prefix" : {
          "myColumn" : "orEvenThis"
        }
      }
    ],
    "minimum_should_match" : 1
  }
}
Notice that it's a component of the bool query, and not of either should or must!
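To make the difference between the three variants concrete, here is a toy model of the matching rules in plain Python. It is not Elasticsearch code and ignores scoring, must_not, and the non-integer forms of minimum_should_match; it only mimics the documented default (1 when should is the only clause type, 0 when must or filter is present). The status field and the sample documents are invented for the illustration.

```python
def matches(doc, clause):
    """Toy evaluator for a tiny subset of the query DSL: prefix and bool only."""
    if "prefix" in clause:
        field, prefix = next(iter(clause["prefix"].items()))
        return str(doc.get(field, "")).startswith(prefix)
    if "bool" in clause:
        b = clause["bool"]
        must = b.get("must", [])
        filt = b.get("filter", [])
        should = b.get("should", [])
        # Default: 1 if there are should clauses and no must/filter, else 0.
        default_msm = 1 if should and not must and not filt else 0
        msm = b.get("minimum_should_match", default_msm)
        required_ok = all(matches(doc, c) for c in must + filt)
        should_hits = sum(matches(doc, c) for c in should)
        return required_ok and should_hits >= msm
    raise ValueError("unsupported clause in this toy model")

should = [{"prefix": {"myColumn": p}} for p in ["This", "orThis", "orEvenThis"]]
must = [{"prefix": {"status": "act"}}]  # hypothetical stand-in for the real must clauses

only_should = {"bool": {"should": should}}
with_must = {"bool": {"must": must, "should": should}}
with_msm = {"bool": {"must": must, "should": should, "minimum_should_match": 1}}

docs = [
    {"myColumn": "This is it", "status": "active"},
    {"myColumn": "Other", "status": "active"},
]

for name, query in [("only_should", only_should),
                    ("with_must", with_must),
                    ("with_msm", with_msm)]:
    print(name, [matches(d, query) for d in docs])
```

In this model the second document, which has none of the prefixes, is rejected by only_should and with_msm but accepted by with_must, which is the "I get ALL responses" symptom from the question.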