I have a field that stores a value in the format number/year, for example 23/2014, 24/2014, 12/2015, and so on.
If this field is mapped as not_analyzed, I can run exact-value searches with a term filter. Searching for a value in that exact structure (such as 1/2014 or 15/2014) works like the SQL equals (=) operator.
{
  "query": {
    "filtered": {
      "filter": {
        "term": {
          "processNumber": "11/2014"
        }
      }
    }
  }
}
So, searching with something different like 11/, or /2014 wouldn't return hits. This is fine.
But if I define the field as not_analyzed, I can't run SQL LIKE-style searches with the match_phrase query.
{
  "query": {
    "match_phrase": {
      "processNumber": "11/201"
    }
  }
}
In this case, searching for 11, 11/, /2014 or 2014 should return hits, but it doesn't.
The thing is, this query works if the field is not mapped as not_analyzed. So it seems I have to use one or the other. The problem is that the field should support both options for different queries. Am I missing something here?
You can analyze the same field, processNumber, in different ways using the fields property in the mapping.
For example, if you want both the analyzed and the not-analyzed version of processNumber, the mapping would be:
{
  "type_name": {
    "properties": {
      "processNumber": {
        "type": "string",
        "index": "not_analyzed",
        "fields": {
          "analyzed": {
            "type": "string",
            "index": "analyzed"
          }
        }
      }
    }
  }
}
Here the not-analyzed field is referred to in queries as processNumber.
To refer to the analyzed view of the field, use processNumber.analyzed.
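As a rough illustration of how client code might choose between the two field names, here is a minimal Python sketch. The helper build_query and its exact flag are hypothetical names made up for this example, not part of any Elasticsearch client. It only builds the request bodies and does not send them anywhere.

```python
def build_query(value, exact=True):
    """Build a search body for processNumber (hypothetical helper)."""
    if exact:
        # Exact lookup: term filter on the not_analyzed main field.
        return {
            "query": {
                "filtered": {
                    "filter": {"term": {"processNumber": value}}
                }
            }
        }
    # Partial lookup: match query on the analyzed sub-field.
    return {"query": {"match": {"processNumber.analyzed": value}}}


exact_body = build_query("11/2014")
partial_body = build_query("2014", exact=False)
```

The resulting dictionaries can be serialized with json.dumps and posted to the index's _search endpoint.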
The queries for terms such as 11/201 and 11 would be:
Example Filter:
{ "query" : { "filtered" : { "filter" : { "term" : { "processNumber" : "11/2014" } } } } }
The term filter does not analyze the search string, so the input is matched as-is against the field's inverted index. In this case, 11/2014 is compared against the field exactly as written.
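To make the exact-match behaviour concrete, here is a toy Python model of a not_analyzed field. It is a simplification, not how Elasticsearch is implemented: the sample documents and the function names build_keyword_index and term_filter are assumptions made for this sketch. Each stored value is indexed as one whole term, and the search string is looked up unchanged.

```python
# Toy model of a not_analyzed field: each stored value is one index term.
documents = {1: "11/2014", 2: "12/2015", 3: "23/2014"}  # sample data (assumed)


def build_keyword_index(docs):
    """Map each whole field value to the set of document ids holding it."""
    index = {}
    for doc_id, value in docs.items():
        index.setdefault(value, set()).add(doc_id)
    return index


def term_filter(index, term):
    """Look the search string up unchanged, as a term filter would."""
    return sorted(index.get(term, set()))


keyword_index = build_keyword_index(documents)
exact_hits = term_filter(keyword_index, "11/2014")  # full value: one hit
partial_hits = term_filter(keyword_index, "11/")    # fragment: no hits
```

Because the index contains only whole values, a fragment such as 11/ has no entry to match.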
Example match_phrase_prefix:
{ "query": { "match_phrase_prefix": { "processNumber": "11/201" } } }
match_phrase_prefix checks whether the last term in the phrase is a prefix of terms in the index. It analyzes the search string if an analyzer is specified. This is why you need to use the not-analyzed version of the field here. If we used processNumber.analyzed, search strings such as 11-201 or 11|201 would also match.
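The prefix check on the not-analyzed field can be pictured with a small Python sketch. This is only an approximation of the idea under the assumption that the whole value is stored as a single term. The sample data and the name prefix_hits are made up for illustration.

```python
# Toy model: the not_analyzed field stores each value as a single term.
stored_terms = {1: "11/2014", 2: "12/2015", 3: "11/2015"}  # sample data (assumed)


def prefix_hits(terms, query):
    """Return ids of documents whose stored term starts with the query."""
    return sorted(
        doc_id for doc_id, term in terms.items() if term.startswith(query)
    )


leading_hits = prefix_hits(stored_terms, "11/201")  # matches 11/2014 and 11/2015
trailing_hits = prefix_hits(stored_terms, "/2014")  # not a prefix: no hits
```

Note that in this model a prefix only matches from the start of the value, so a trailing fragment such as /2014 is not found. Searches by year alone are better served by the analyzed sub-field.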
Example match:
{ "query": { "match": { "processNumber.analyzed": "11" } } }
This is a straightforward match, since the default analyzer (usually the standard analyzer) tokenizes 11/2014 into the terms 11 and 2014.
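The effect of tokenization can be sketched in Python as follows. The function toy_analyze is a deliberately crude stand-in that lowercases and splits on non-alphanumeric characters. It is not the real standard analyzer, although for inputs like these it yields the same terms. The sample documents and helper names are assumptions for this example.

```python
import re


def toy_analyze(text):
    """Crude stand-in for an analyzer: lowercase, split on non-alphanumerics."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def build_analyzed_index(docs):
    """Map each token to the set of document ids containing it."""
    index = {}
    for doc_id, value in docs.items():
        for tok in toy_analyze(value):
            index.setdefault(tok, set()).add(doc_id)
    return index


def match(index, query):
    """Return ids of documents containing any token of the analyzed query."""
    hits = set()
    for tok in toy_analyze(query):
        hits |= index.get(tok, set())
    return sorted(hits)


documents = {1: "11/2014", 2: "12/2015", 3: "23/2014"}  # sample data (assumed)
analyzed_index = build_analyzed_index(documents)

tokens = toy_analyze("11/2014")              # ["11", "2014"]
number_hits = match(analyzed_index, "11")    # [1]
year_hits = match(analyzed_index, "2014")    # [1, 3]
```

The same splitting explains why 11-201 and 11|201 behave like 11/201 on the analyzed field: the separator is discarded and only the tokens remain.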
You can use the analyze API to see how a particular text gets analyzed by the default analyzer:
curl -XPOST "http://<machine>/_analyze?text=11/2014"