
Elasticsearch - Maintaining Document History

I am new to Elasticsearch and have a very basic question.

I am planning to use Elasticsearch as a document store, and one of my requirements is to maintain historical data as well.

I can post documents to Elasticsearch successfully, but when I post an updated version of the same document, the original copy is overwritten. What I need is for Elasticsearch to keep the older copies as well, so that I can access them by specifying a version number.

I have looked at its native support for document versioning, which works great for concurrency control, but it doesn't appear to keep a history of previous versions; only the latest version is available.

Could someone point me in the right direction, please?

Jay asked May 01 '14 16:05


1 Answer

As stated here, ES does not store older versions:

Note that Elasticsearch does not store older versions of documents. Only the current version can be retrieved.

You should store the history in a separate index, and insert a new entry into that history index on every update to the document in the original index.
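As a rough illustration of that idea, here is a minimal sketch that builds the NDJSON body for a single `_bulk` request writing a document to both indices. The index names (`documents`, `documents_history`), the `doc_id` and `version` fields, the `<id>_v<version>` ID scheme, and the helper `build_bulk_body` are all assumptions made for this example; the version number is managed by your application, not by Elasticsearch. It also assumes a recent Elasticsearch release without mapping types.

```python
import json


def build_bulk_body(doc_id, version, source,
                    index="documents", history_index="documents_history"):
    """Build an NDJSON body for the _bulk endpoint (hypothetical helper).

    The main index keeps only the latest copy; the history index gets one
    entry per version, addressed by a composite ID.
    """
    # Snapshot carries the original ID and the application-managed version.
    history_source = dict(source, doc_id=doc_id, version=version)
    lines = [
        # "index" overwrites the current copy in the main index.
        {"index": {"_index": index, "_id": doc_id}},
        source,
        # "create" fails if this version already exists, so old
        # snapshots are never silently replaced.
        {"create": {"_index": history_index, "_id": f"{doc_id}_v{version}"}},
        history_source,
    ]
    # The bulk API expects one JSON object per line and a trailing newline.
    return "\n".join(json.dumps(line) for line in lines) + "\n"


if __name__ == "__main__":
    print(build_bulk_body("42", 2, {"title": "Hello"}), end="")
```

With this scheme, an older copy could be fetched by its composite ID (for example `42_v2` in the history index), or all versions of a document could be found by filtering on `doc_id`. Note that the actions in a bulk request are not applied atomically, so the application may need to handle the case where one write succeeds and the other fails.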

Volkan Vardar answered Sep 19 '22 01:09