I thought that regardless of whether a NoSQL aggregate store is a key-value, column-family or document database, it would support versioning of values. After a bit of Googling, I'm concluding that this assumption is wrong and that it just depends on the DBMS implementation. Is this true?
I know that Cassandra and BigTable support it (both column-family stores). It SEEMS that Hbase (column family) and Riak (Key-Value) do but Redis and Hadoop (Key-Value) do not. Mongo DB (document) doesCouchbase does but MongoDB does not (document stores). I don't see any pattern here. Is there a rule of thumb? (for example, "key value stores generally do not have versioning, while column-family and document databases do")
What I'm trying to do: I want to create a database of website screenshots from URL to PNG image. I'd rather use a key-value store since, versioning aside, it is the simplest solution that satisfies the problem. But when website changes or is decomissioned and I update my database I don't want to lose old images. Even if I select a key-value database that has versioning, I want to have the luxury to switch to a different key-value database without the constraint that many key-value DBs do not support versioning. So I'm trying to understand at what level of sophistication in the continuum of aggregate NoSQL databases does versioning become a feature implicit to the data model.
Types of NoSQL Database:Document-based databases. Key-value stores. Column-oriented databases. Graph-based databases.
NoSQL Databases are mainly categorized into four types: Key-value pair, Column-oriented, Graph-based and Document-oriented.
You don't really need versioning support from the Key-Value store.
The only thing you really need from the data Store is an efficient scanning/range query feature.
This means the datastore can retrieve entries in lexicographical order.
Most KV-stores do, so this is easy.
This is how you do it:
Create versioned keys.
In case you cant hash the original name to a fixed length, prepend the length of the original key. then put in the hash of the key or the original key itself, and end with a fixed length encoded version number (so it is lexicographically ordered from high version to low by inverting the number against the max version).
Query
Do a range query from the maximum possible version up to version 0, but only retrieving exactly one key.
Done
If you dont need explicit versions, you can also use a timestamp, so you can insert without getting the last version.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With