
Elasticsearch store field vs _source

Using Elasticsearch 1.4.3

I'm building a sort of "reporting" system in which the client can pick and choose which fields they want returned in their results.

In 90% of cases the client will never pick all the fields, so I figured I could disable the _source field in my mapping to save space. But then I learned that

GET myIndex/myType/_search/
{
    "fields": ["field1", "field2"]
    ...
}

does not return the fields.

So I assume I then have to set "store": true on each field. From what I've read, this will be faster for searches. But space-wise, will it take the same as _source, or do we still save space?

asked Feb 23 '15 16:02 by user432024





2 Answers

The _source field stores the JSON you send to Elasticsearch, and you can choose to return only certain fields from it, which is perfect for your use case. I have never heard that stored fields are faster for searches. The _source field may take more disk space, but if you would have to store every field anyway, there is no reason to use stored fields over _source. If you do disable the _source field, it means:

  • You won’t be able to do partial updates
  • You won’t be able to re-index your data from the JSON held in your Elasticsearch cluster; you’ll have to re-index from the original data source (which is usually a lot slower).
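
As a rough sketch of the "only return certain fields" option, assuming the two field names from the question (field1, field2) and a match_all query as a stand-in for the real report query, the search body with _source filtering could be built like this. The helper name build_source_filtered_search is hypothetical; the resulting JSON would be sent to myIndex/myType/_search.

```python
import json


def build_source_filtered_search(wanted_fields, query=None):
    """Build a search body that returns only the wanted keys from _source.

    Hypothetical helper for illustration; it only builds the request body
    and does not talk to a cluster.
    """
    return {
        # match_all is a placeholder; substitute the real report query here.
        "query": query if query is not None else {"match_all": {}},
        # Source filtering: _source stays enabled in the mapping, but only
        # the listed keys are returned for each hit.
        "_source": list(wanted_fields),
    }


body = build_source_filtered_search(["field1", "field2"])
print(json.dumps(body, indent=2))
```

This keeps _source enabled, so partial updates and re-indexing from the cluster remain possible while each response carries only the fields the client picked.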
answered Sep 18 '22 09:09 by Dan Tuffery



By default in Elasticsearch, the _source (the document you indexed) is stored. This means that when you search, you can get the actual document source back. Moreover, Elasticsearch will automatically extract fields/objects from the _source and return them if you explicitly ask for them (and may also use it in other components, such as highlighting).

You can specify that a specific field is also stored. This means that the data for that field will be stored on its own. So if you ask for field1 (which is stored), Elasticsearch will identify that it's stored and load it from the index instead of getting it from the _source (assuming _source is enabled).

When do you want to enable storing specific fields? Most times, you don't. Fetching the _source is fast and extracting it is fast as well. If you have very large documents, where the cost of storing the _source, or the cost of parsing the _source is high, you can explicitly map some fields to be stored instead.
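
For illustration only, here is a minimal sketch of what "explicitly map some fields to be stored" might look like as a 1.x-style mapping body. It assumes every field is of type string, reuses myType, field1 and field2 from the question, and adds a made-up field3 that is not stored; the helper build_mapping is hypothetical and only builds the JSON.

```python
import json


def build_mapping(type_name, stored_fields, other_fields, keep_source=True):
    """Build a mapping body in which only some fields are stored on their own.

    Hypothetical helper; assumes all fields are strings (an assumption,
    since the question does not give the field types).
    """
    properties = {}
    for name in stored_fields:
        # "store": true keeps this field's value separately from _source.
        properties[name] = {"type": "string", "store": True}
    for name in other_fields:
        # Not stored: retrievable only through _source, if that is enabled.
        properties[name] = {"type": "string"}
    return {
        type_name: {
            "_source": {"enabled": keep_source},
            "properties": properties,
        }
    }


mapping = build_mapping("myType", ["field1", "field2"], ["field3"],
                        keep_source=False)
print(json.dumps(mapping, indent=2))
```

With keep_source=False, only field1 and field2 could be returned by a "fields" request, which is the trade-off the question is weighing.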

Note that there is a cost to retrieving each stored field. For example, if you have a JSON document with 10 fields of reasonable size, map all of them as stored, and ask for all of them, each one has to be loaded (more disk seeks), compared to just loading the _source (which is one field, possibly compressed).

I took this answer from a thread reply by shay.banon; the whole thread is worth reading for a good understanding of the topic.

answered Sep 18 '22 09:09 by Sudhanshu Gaur
