What's the difference between with_positions and with_positions_offsets with regard to the term_vector in Elasticsearch?
with_positions instructs Elasticsearch (Lucene) to also index the position of each term in the original text. This preserves the order of terms inside a field, for example for the other question you had here.
with_positions_offsets instructs Elasticsearch (Lucene) to also index the position of each term in the original text and the character offsets of each term (its start and end position at the character level). This helps when the original text was "changed" during the analysis phase, for example by replacing terms with their synonyms. Synonyms can consist of multiple words, thus changing the whole "structure" (positions and offsets) of the original text.
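To make the distinction between positions and offsets concrete, here is a minimal Python sketch. It does not call Elasticsearch or Lucene; the function term_vector_entries is a hypothetical helper that imitates a simple lowercase word tokenizer, and the sample sentence is made up for illustration. A real analyzer (stemming, stop words, synonyms) may produce different terms and positions.

```python
import re


def term_vector_entries(text):
    """Return (term, position, start_offset, end_offset) for each word.

    Hypothetical helper: a rough imitation of a lowercase word tokenizer,
    not the actual Lucene analysis chain.
    """
    entries = []
    for position, match in enumerate(re.finditer(r"\w+", text)):
        # position: ordinal index of the term in the token stream
        # start/end: character offsets into the original text (end is exclusive)
        entries.append((match.group().lower(), position, match.start(), match.end()))
    return entries


if __name__ == "__main__":
    for term, position, start, end in term_vector_entries("The quick brown fox"):
        print(f"{term}: position={position}, start_offset={start}, end_offset={end}")
```

With with_positions only the position column would be kept; with_positions_offsets keeps the start and end offsets as well.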
In both cases the size of the index on disk will increase. The increase will be smaller for with_positions than for with_positions_offsets.
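As a rough sketch of where the option is set, the following Python snippet builds an index mapping body as a plain dictionary. The field name body and the helper build_mapping are assumptions made for illustration, and the typeless mapping layout assumes a recent Elasticsearch version; older versions nest the properties under a type name.

```python
import json

# term_vector values accepted by Elasticsearch text fields
ALLOWED_TERM_VECTORS = {
    "no",
    "yes",
    "with_positions",
    "with_offsets",
    "with_positions_offsets",
    "with_positions_payloads",
    "with_positions_offsets_payloads",
}


def build_mapping(term_vector="with_positions_offsets"):
    """Build a mapping body for a hypothetical text field named 'body'."""
    if term_vector not in ALLOWED_TERM_VECTORS:
        raise ValueError(f"unsupported term_vector value: {term_vector}")
    return {
        "mappings": {
            "properties": {
                # 'body' is an illustrative field name, not from the original question
                "body": {"type": "text", "term_vector": term_vector}
            }
        }
    }


if __name__ == "__main__":
    # The printed JSON could be sent as the request body when creating an index.
    print(json.dumps(build_mapping("with_positions"), indent=2))
```

Because the extra data is stored per field, it can be enabled only on the fields that need it, which limits the growth of the index.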
term_vector options matter for highlighting: the fast vector highlighter requires with_positions_offsets!
A great example demonstrating positions and offsets can be found on this Lucene and Elasticsearch committer blog post:
For this sample text, this is what the list of positions for each term and the character offsets looks like: