I've been reading a bit lately on document-based databases vs. key-value stores (here's a good overview: "Difference between Document-based and Key/Value-based databases?") and I'm having trouble finding good info on the following.
If we query either of these with the key (or an additional index), there's no real difference in the mechanics - get the value. I'm not clear on how a document store is that different from a key-value store when querying non-indexed documents/fields. If I were to implement a document store on top of a key-value store, I'd do a 'table scan' (check all key/value pairs) for the appropriate value in the query - do document stores do more than this under the covers? Is it appropriate to think of document data stores in this fashion?
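To make the 'table scan' idea concrete, here is a minimal sketch of a document query layered on a plain key-value store. The store is just a Python dict, and the names `kv_store` and `find_by_field` are made up for illustration; no real database exposes exactly this.

```python
import json

# Hypothetical in-memory key-value store: each value is an opaque JSON string.
kv_store = {
    "user:1": json.dumps({"name": "Ada", "city": "London"}),
    "user:2": json.dumps({"name": "Linus", "city": "Helsinki"}),
    "user:3": json.dumps({"name": "Alan", "city": "London"}),
}


def find_by_field(store, field, value):
    """Return the keys of all documents whose `field` equals `value`.

    The store knows nothing about the structure of its values, so every
    value has to be decoded and inspected: a full scan, O(n) in the
    number of key/value pairs.
    """
    matches = []
    for key, raw in store.items():
        doc = json.loads(raw)  # decode the opaque value into a document
        if doc.get(field) == value:
            matches.append(key)
    return matches


london_keys = find_by_field(kv_store, "city", "London")
```

Every query on a non-key field pays the cost of visiting and decoding all values, which is the behaviour the question is asking about.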
This is less of a practical question (would I use Mongo over a BDB if I needed to do something useful, most likely) than one aimed at understanding the underlying technology. I'm interested in the scaling aspects of particular systems only if they are applicable to the underlying implementation.
MongoDB stores documents as BSON (a binary encoding of JSON-like documents), while CouchDB stores JSON. Because the database understands the structure of the stored value, it can build secondary indexes on fields inside the document; as far as I know, both use B-tree indexes (MongoDB certainly does). With an index on the queried field, they can locate matching documents in roughly logarithmic time instead of inspecting every value, as you would have to in a plain key-value store. Without an index, a document store still has to scan the whole collection, so your 'table scan' picture is accurate for non-indexed fields.
(Among key-value store implementations, Redis takes an interesting approach to performance: it keeps the data in memory and does very little disk I/O.)
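As a rough illustration of what an index buys you, here is a sketch of a secondary index over one document field. A sorted list searched with `bisect` stands in for the B-tree; this is an assumption for brevity and not how MongoDB or CouchDB implement their indexes. The names `documents`, `build_index` and `lookup` are hypothetical.

```python
import bisect

# Hypothetical document collection keyed by document id.
documents = {
    1: {"name": "Ada", "age": 36},
    2: {"name": "Linus", "age": 54},
    3: {"name": "Alan", "age": 41},
    4: {"name": "Grace", "age": 41},
}


def build_index(docs, field):
    """Build a sorted list of (field_value, doc_id) pairs for `field`."""
    return sorted(
        (doc[field], doc_id) for doc_id, doc in docs.items() if field in doc
    )


def lookup(index, value):
    """Return ids of documents whose indexed field equals `value`.

    Binary search finds the first candidate in O(log n); we then walk
    forward only over the entries that actually match.
    """
    # (value,) sorts before every (value, doc_id) pair, so this lands on
    # the first entry with that field value, if any exists.
    i = bisect.bisect_left(index, (value,))
    ids = []
    while i < len(index) and index[i][0] == value:
        ids.append(index[i][1])
        i += 1
    return ids


age_index = build_index(documents, "age")
ids_aged_41 = lookup(age_index, 41)
```

The index has to be kept up to date on every write, which is the usual trade-off: faster reads on that field in exchange for extra work and space on inserts and updates.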
Edit:
I came across a great video that explains the internals of MongoDB. Check it out.