
Is it possible to set a custom analyzer to not tokenize in elasticsearch?

I want to treat a field of one of the indexed items as one big string, even though it might contain whitespace. I know how to do this by setting a non-custom field to not_analyzed, but which tokenizer can you use via a custom analyzer?

The only tokenizer items I see on elasticsearch.org are:

  • Edge NGram
  • Keyword
  • Letter
  • Lowercase
  • NGram
  • Standard
  • Whitespace
  • Pattern
  • UAX URL Email
  • Path Hierarchy

None of these do what I want.
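
For context, here is roughly how I do it today with a non-custom field, via the not_analyzed setting (the index, type, and field names below are placeholders, and this is the pre-5.x mapping syntax current when this was asked):

    PUT /my_index
    {
      "mappings": {
        "my_type": {
          "properties": {
            "my_field": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      }
    }

That stores the whole value as a single term, but it is not a custom analyzer.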

perseverance asked Nov 05 '12 at 22:11


People also ask

What is difference between analyzer and tokenizer in Elasticsearch?

Elasticsearch analyzers and normalizers are used to convert text into tokens that can be searched. Analyzers use a tokenizer to produce one or more tokens per text field. Normalizers use only character filters and token filters to produce a single token.
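
A quick way to see the difference on a recent Elasticsearch is the _analyze API (a sketch; the expected terms are shown after each request). A bare tokenizer only splits the text, while a full analyzer also applies token filters such as lowercasing:

    POST /_analyze
    {
      "tokenizer": "standard",
      "text": "Quick Brown Fox"
    }

    → Quick, Brown, Fox

    POST /_analyze
    {
      "analyzer": "standard",
      "text": "Quick Brown Fox"
    }

    → quick, brown, fox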

What is the default analyzer Elasticsearch?

By default, Elasticsearch uses the standard analyzer for all text analysis. The standard analyzer gives you out-of-the-box support for most natural languages and use cases. If you choose to use the standard analyzer as-is, no further configuration is needed.
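
For instance, calling _analyze without naming an analyzer falls back to the standard analyzer (a sketch; the sample text is arbitrary):

    POST /_analyze
    {
      "text": "The 2 QUICK Brown-Foxes"
    }

    → the, 2, quick, brown, foxes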

What is whitespace tokenizer in Elasticsearch?

The whitespace tokenizer breaks text into terms whenever it encounters a whitespace character.
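
For example (a sketch; the sample text is arbitrary):

    POST /_analyze
    {
      "tokenizer": "whitespace",
      "text": "The quick brown_fox."
    }

    → The, quick, brown_fox.

Note that case and punctuation are preserved: the tokenizer splits only on whitespace, and no token filters run.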


1 Answer

The Keyword tokenizer is what you are looking for: it emits the entire input as a single token, whitespace included.
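
A minimal sketch of such a mapping, using made-up names (my_index, my_field, keyword_analyzer) and the string-type mapping syntax of the question's era. The lowercase filter is optional, and is the usual reason to prefer this over not_analyzed: you still get a single token, but you can normalize it.

    PUT /my_index
    {
      "settings": {
        "analysis": {
          "analyzer": {
            "keyword_analyzer": {
              "type": "custom",
              "tokenizer": "keyword",
              "filter": ["lowercase"]
            }
          }
        }
      },
      "mappings": {
        "my_type": {
          "properties": {
            "my_field": {
              "type": "string",
              "analyzer": "keyword_analyzer"
            }
          }
        }
      }
    }

(On Elasticsearch 5+, a plain keyword field type covers the un-normalized case, and normalizers cover the keyword-plus-filters case.)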

imotov answered Oct 14 '22 at 12:10