Let's say I have a field f1 with value str (String).
If index="not_analyzed" is set for field f1, then the value str is stored (as a single entry) in the inverted index, because it is not split into tokens.
If I set store="yes" too, then (IMHO) the value str is going to be recorded a second time, so twice in the end.
So, first question: if index="not_analyzed", what's the added value of store="yes"?
To put it differently: since this second storage (due to store="yes") has a cost, what new capabilities does store="yes" bring that are not possible otherwise, even though str is already stored in the inverted index?
IMHO, the same situation could arise if the field is not a String but a date (or a long or an integer).
Let's say we have a field f2 with "type": "date" and "precision_step" : "0".
If I set store="yes" too, then (IMHO) the value of f2 is going to be recorded twice in the end.
So, second question: if "precision_step"="0", what's the added value of store="yes"?
To put it differently: since this second storage induced by store="yes" has a cost, what new capabilities does store="yes" bring?
PS: this is related to this question. In fact, this new question is all about what was NOT answered in that previous one.
I think you are, just like the SO post referenced, mixing things up. Indexing and storing are two different things. You index tokens (either analyzed or not) to be able to search very quickly afterwards.
That has nothing to do with storing the actual field value, even if the value sits in the inverted index as a token. These are separate operations and separate things. If you don't store a field and you disable the "_source" field, then you are using ES just for searching, even though the inverted index has the field value in it. If you actually need to retrieve the original document (or the original value of a field that matched), you can't. You need either _source or "store": "yes".
Basically, if _source is disabled and "store" is "no", then you can get back only the _id of the document. If you need the content, you go to your secondary storage (a database, for example) and fetch the actual content based on the _id. It doesn't matter whether your field is "analyzed" or "not_analyzed": as long as you have no other way of storing the original document or the original field value, ES is used for searching only (as opposed to storing and searching).
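To make the distinction concrete, here is a toy model in Python. It is only a sketch and not how Lucene or ES are actually implemented; the class ToyIndex and its methods are made up for illustration. The inverted structure answers "which documents contain this term?", while returning a value for a given document goes through a separate stored map that exists only when storing is enabled.

```python
from collections import defaultdict


class ToyIndex:
    """Hypothetical toy model: an inverted index plus optional stored values."""

    def __init__(self, store=False):
        self.store = store
        self.inverted = defaultdict(set)  # term -> set of doc ids (used for search)
        self.stored = {}                  # doc id -> original value (used for retrieval)

    def add(self, doc_id, value):
        # "not_analyzed" behaviour: the whole value is indexed as one term.
        self.inverted[value].add(doc_id)
        if self.store:
            self.stored[doc_id] = value

    def search(self, term):
        # Searching only ever yields document ids.
        return sorted(self.inverted.get(term, set()))

    def fetch(self, doc_id):
        # Retrieval reads the stored map, never the inverted index.
        return self.stored.get(doc_id)


search_only = ToyIndex(store=False)
search_only.add(1, "str")

search_and_store = ToyIndex(store=True)
search_and_store.add(1, "str")

print(search_only.search("str"), search_only.fetch(1))
print(search_and_store.search("str"), search_and_store.fetch(1))
```

In this model both indexes find document 1 when searching for "str", but only the one created with store=True can hand the value back for a given id. The other one leaves you with the id and a trip to your secondary storage.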
Everything below assumes "_source" is disabled. As for what store="yes" brings: the first capability is the one above, being able to retrieve the content and not only search it. The second is highlighting. Without the stored value, ES has no original text in which to highlight the matching tokens.
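As a rough illustration of that setup, the mapping and search request could look like the following, written as Python dicts and printed as JSON. This is a sketch using the legacy syntax from the question (string type, "not_analyzed", store "yes", and "fields" in the search body); the type name my_type is hypothetical, and newer ES versions spell these options differently (for example keyword fields, "store": true and "stored_fields"), so check the docs for your version.

```python
import json

# Legacy-style mapping: _source disabled, f1 indexed as a single token and stored.
# "my_type" is a made-up type name for this example.
mapping = {
    "my_type": {
        "_source": {"enabled": False},
        "properties": {
            "f1": {"type": "string", "index": "not_analyzed", "store": "yes"},
        },
    }
}

# Search request asking for the stored value of f1 and for highlighting on it.
search_body = {
    "query": {"term": {"f1": "str"}},
    "fields": ["f1"],
    "highlight": {"fields": {"f1": {}}},
}

print(json.dumps(mapping, indent=2))
print(json.dumps(search_body, indent=2))
```

With "store" left at "no" in the same mapping, the query would still match, but there would be no field value to return or highlight.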