What characters does the standard tokenizer delimit on?

Question

I was wondering which characters Elasticsearch's standard tokenizer uses to delimit a string.

Andrei Stefan · Accepted Answer

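As per the documentation, I believe the standard tokenizer splits text on the default word boundaries defined in Unicode Standard Annex #29 (Unicode Text Segmentation), which specifies the symbols/characters used for defining tokens: http://unicode.org/reports/tr29/#Default_Word_Boundaries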
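One way to check how a particular character is treated is to run sample text through the `_analyze` API with `"tokenizer": "standard"` and look at the tokens that come back. The following is a minimal Python sketch of that idea, not an official client. The `http://localhost:9200` address (a local node without authentication) and the helper names `build_analyze_request`, `extract_tokens` and `analyze` are assumptions made for illustration.

```python
import json
import urllib.request

# Assumption: a local Elasticsearch node without authentication.
# Change this to match your own cluster.
ES_URL = "http://localhost:9200"


def build_analyze_request(text, base_url=ES_URL):
    """Build (but do not send) a POST to _analyze using the standard tokenizer."""
    body = json.dumps({"tokenizer": "standard", "text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_tokens(analyze_response):
    """Return just the token strings from a parsed _analyze response."""
    return [entry["token"] for entry in analyze_response["tokens"]]


def analyze(text, base_url=ES_URL):
    """Send the request and return the tokens; requires a reachable cluster."""
    with urllib.request.urlopen(build_analyze_request(text, base_url)) as resp:
        return extract_tokens(json.load(resp))
```

Calling `analyze("some text with-hyphens, commas and under_scores")` against your own cluster shows exactly where the tokenizer splits, which is usually quicker than reading through the boundary rules character by character.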

Tags:

delimiter

elasticsearch

tokenize

David Carek
