
Elastic - with_positions and with_positions_offsets

What's the difference between with_positions and with_positions_offsets with regard to the term_vector in Elasticsearch?

Imran Azad asked Dec 25 '22 12:12


1 Answer

  • with_positions instructs Elasticsearch (Lucene) to also index the position of each term in the original text.

This helps when the order of terms inside a field matters, as in the other question you asked here.

  • with_positions_offsets instructs Elasticsearch (Lucene) to also index both the position of each term in the original text and its character offsets (the start and end of each term, at the character level).

This helps when the original text is "changed" during the analysis phase, for example when terms are replaced by their synonyms. A synonym can span multiple words, which changes the whole "structure" (positions and offsets) of the original text.
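To make the two options concrete, here is a minimal sketch of an index mapping that uses one option per field. The field names title and body are hypothetical, and the layout assumes the typeless mapping format of recent Elasticsearch versions; only the term_vector values come from the discussion above.

```python
import json

# Hypothetical field names; the term_vector values are the two options
# discussed above. Assumes the typeless mapping layout (Elasticsearch 7+).
mapping = {
    "mappings": {
        "properties": {
            # Stores term positions only.
            "title": {"type": "text", "term_vector": "with_positions"},
            # Stores term positions and character offsets.
            "body": {"type": "text", "term_vector": "with_positions_offsets"},
        }
    }
}


def term_vector_settings(mapping):
    """Return {field name: term_vector option}; unset fields default to "no"."""
    props = mapping["mappings"]["properties"]
    return {name: spec.get("term_vector", "no") for name, spec in props.items()}


# This JSON could be sent as the body of an index-creation request.
print(json.dumps(mapping, indent=2))
```

The helper term_vector_settings is only there to inspect the dictionary; it is not part of any Elasticsearch client.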


In both cases the size of the index on disk will increase. The increase will be smaller for with_positions than for with_positions_offsets.

term_vector options matter for highlighting: the fast vector highlighter requires with_positions_offsets!
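As a rough illustration of where term vectors come into play for highlighting, the sketch below builds a search request that asks for the fast vector highlighter (type fvh), which needs the highlighted field to be mapped with term_vector set to with_positions_offsets. The field name body and the query text are assumptions made for the example.

```python
import json


def fvh_highlight_request(field, text):
    """Build a search body that highlights `field` with the fast vector highlighter.

    `field` is assumed to be mapped with term_vector: with_positions_offsets.
    """
    return {
        "query": {"match": {field: text}},
        "highlight": {"fields": {field: {"type": "fvh"}}},
    }


# Hypothetical field name and query text.
request_body = fvh_highlight_request("body", "quick fox")
print(json.dumps(request_body, indent=2))
```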


A great example demonstrating positions and offsets can be found in this blog post by a Lucene and Elasticsearch committer:

[Image: sample text]

For this sample text, this is what the list of positions and the character offsets for each term look like:

[Image: positions and character offsets of each term in the sample text]
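The difference between a position and an offset can also be sketched in a few lines of Python. This is a toy tokenizer, not Lucene's analyzer: it splits on word characters and lowercases, and the sentence used is a made-up example rather than the sample text from the blog post. Positions count tokens starting at 0, while offsets are character indexes with an exclusive end.

```python
import re
from typing import List, NamedTuple


class Token(NamedTuple):
    term: str
    position: int      # index of the token in the token stream
    start_offset: int  # index of the token's first character in the text
    end_offset: int    # index just past the token's last character


def toy_analyze(text: str) -> List[Token]:
    """Toy stand-in for an analyzer: lowercase word tokens with positions and offsets."""
    tokens = []
    for position, match in enumerate(re.finditer(r"\w+", text)):
        tokens.append(
            Token(match.group().lower(), position, match.start(), match.end())
        )
    return tokens


# Made-up sentence, used only for illustration.
for token in toy_analyze("The quick brown fox"):
    print(token)
```

With with_positions only the position column would be kept in the term vector; with_positions_offsets keeps the two offset columns as well.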

Andrei Stefan answered Jan 07 '23 07:01