ElasticSearch not returning results for terms query against string property

Tags:

elasticsearch

I have the following indexed document:

{
    "visitor": {
        "id": <SOME STRING VALUE>
    }
}

The mapping for the document is:

"visitor": {
    "properties": {
        "id": {
            "type": "string"
        }
    }
}

When I run the following query I get results:

{
    "query": {
        "filtered": {
            "query": {
                "match_all": {}
            }
        },
        "filter": {
            "term": { "visitor.id": "123" }
        }
    }
}

However this does not:

{
    "query": {
        "filtered": {
            "query": {
                "match_all": {}
            }
        },
        "filter": {
            "term": { "visitor.id": "ABC" }
        }
    }
}

I've been thinking this is related to analyzers and have been chasing that down. I've also been wondering if I was wrong to use dot notation to get to the nested visitor property.

Can anyone tell me why I can't filter for the visitor with the id of "ABC" but can for the visitor with the id of "123"?

asked Feb 21 '14 11:02

1 Answer

You need to understand how Elasticsearch's analyzers work. An analyzer runs a tokenizer (which splits the input into a bunch of tokens, such as on whitespace) followed by a set of token filters (which filter out tokens you don't want, like stop words, or modify tokens, like the lowercase token filter, which converts everything to lower case).

Analysis is performed at two very specific times - during indexing (when you put stuff into elasticsearch) and, depending on your query, during searching (on the string you're searching for).

That said, the default analyzer is the standard analyzer, which consists of a standard tokenizer, a standard token filter (to clean up tokens from the standard tokenizer), a lowercase token filter, and a stop words token filter.
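
The tokenizer-then-filters pipeline can be imitated in a few lines of Python. This is only a rough sketch: `toy_tokenize`, `lowercase_filter`, and `toy_analyze` are hypothetical names made up for illustration, and the regular expression is a crude stand-in for the standard tokenizer, not how Elasticsearch actually implements it.

```python
import re

def toy_tokenize(text):
    """Split text into runs of letters and digits (crude stand-in for a real tokenizer)."""
    return re.findall(r"[A-Za-z0-9]+", text)

def lowercase_filter(tokens):
    """Token filter: convert every token to lower case."""
    return [t.lower() for t in tokens]

def toy_analyze(text):
    """Run the tokenizer first, then the token filters, in order."""
    return lowercase_filter(toy_tokenize(text))

print(toy_analyze("ABC"))         # ['abc']
print(toy_analyze("123"))         # ['123']
print(toy_analyze("I love pie"))  # ['i', 'love', 'pie']
```

Note how "ABC" comes out changed while "123" does not, which is the difference between the two ids in the question.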

To put this to an example, when you save the string "I love Vincent's pie!" into Elasticsearch using the default standard analyzer, you're actually storing the tokens "i", "love", "vincent's", "pie". Then, when you attempt to search for "Vincent's" with a term query (which is not analyzed), you will not find anything, because "Vincent's" with a capital V is not one of those tokens! However, if you search for "Vincent's" using a match query (which is analyzed), you will find "I love Vincent's pie!", because the search string is lowercased to "vincent's" before it is looked up. The same thing is happening with your ids: "ABC" is indexed as "abc", so a term filter for "ABC" finds nothing, while "123" has nothing to lowercase and matches as-is.
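
Here is a small Python simulation of the difference between a query that is not analyzed and one that is, using the two ids from the question. It is a sketch under loud assumptions: `toy_analyze`, `build_index`, `term_lookup`, and `match_lookup` are hypothetical helpers, the index is a plain dictionary, and none of this is Elasticsearch's real code.

```python
import re
from collections import defaultdict

def toy_analyze(text):
    """Crude stand-in for an analyzer: split into word-like tokens, then lowercase."""
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]

def build_index(docs):
    """Map each analyzed token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, value in docs.items():
        for token in toy_analyze(value):
            index[token].add(doc_id)
    return index

def term_lookup(index, value):
    """Like a term query: the search string is looked up as-is, NOT analyzed."""
    return index.get(value, set())

def match_lookup(index, value):
    """Like a match query: the search string is analyzed before the lookup."""
    hits = set()
    for token in toy_analyze(value):
        hits |= index.get(token, set())
    return hits

docs = {1: "ABC", 2: "123"}  # hypothetical visitor ids keyed by document id
index = build_index(docs)

print(term_lookup(index, "ABC"))   # set()  -> only "abc" was indexed
print(term_lookup(index, "123"))   # {2}
print(match_lookup(index, "ABC"))  # {1}
```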

The bottom line, either:

Use an analyzed query, such as match, when searching natural language strings.
Set up the analyzers to match your needs. You could set up a custom analyzer that uses a whitespace tokenizer, a letter tokenizer, or a pattern tokenizer if you want to get complicated, as well as whatever filters your heart desires. It depends on your use case, but if you're dealing with natural language sentences I don't recommend this, because the standard tokenizer was built for natural language searching.

You can set the field up to not use an analyzer with the following mapping, which should suit your needs:

"visitor": {
    "properties": {
        "id": {
            "type": "string",
            "index": "not_analyzed"
        }
    }
}

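To see why a field that is not analyzed works with a term lookup, here is a minimal Python sketch. The helpers `index_not_analyzed` and `term_lookup` are hypothetical and only imitate the idea that the whole value is stored as one exact term; they are not Elasticsearch's implementation.

```python
from collections import defaultdict

def index_not_analyzed(docs):
    """Index each value as one exact, unmodified term (no tokenizing, no lowercasing)."""
    index = defaultdict(set)
    for doc_id, value in docs.items():
        index[value].add(doc_id)
    return index

def term_lookup(index, value):
    """Like a term query: look the search string up exactly as given."""
    return index.get(value, set())

docs = {1: "ABC", 2: "123"}  # hypothetical visitor ids keyed by document id
index = index_not_analyzed(docs)

print(term_lookup(index, "ABC"))  # {1}
print(term_lookup(index, "abc"))  # set() -> matching is now case-sensitive
```

The trade-off is that matching becomes exact and case-sensitive, which is usually what you want for an id.
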
See http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis.html for further reading.

answered Sep 24 '22 01:09

Andrew Macheret

goatshepard
