I'm looking for an efficient way to store many key->value pairs on disk for persistence, preferably with some caching.
The feature I need is either to append to (concatenate onto) the value for a given key, or to model the data as key -> list of values. Both options are fine. The value part is typically a binary document.
I won't have much use for clustering, redundancy, etc. in this scenario.
Language-wise we're using Java, and we are experienced with classic databases (Oracle, MySQL and more).
I see a couple of obvious scenarios and would like advice on what is fastest in terms of stores (and retrievals) per second:
1) Store the data in classic db-tables by standard inserts.
2) Do it yourself using a file system tree to spread to many files, one or several per key.
3) Use some well-known tuple storage. Some obvious candidates are:
3a) Berkeley DB Java Edition
3b) Modern NoSQL solutions like Cassandra and similar
Personally I like Berkeley DB JE for my task.
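To make option 2 concrete, here is a rough sketch of what a do-it-yourself file tree could look like, using only the Java standard library. The class name `FileTreeStore`, the two-level directory fan-out, and the use of a SHA-1 hash of the key as the file name are all assumptions made for illustration, not a recommended design.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch: one file per key, spread over a two-level directory tree.
final class FileTreeStore {
    private final Path root;

    FileTreeStore(Path root) {
        this.root = root;
    }

    // Hash the key so files spread evenly, e.g. root/ab/cd/abcd...
    Path pathFor(String key) {
        String hex = sha1Hex(key);
        return root.resolve(hex.substring(0, 2))
                   .resolve(hex.substring(2, 4))
                   .resolve(hex);
    }

    // Append the value to the key's file, creating directories and file as needed.
    void append(String key, byte[] value) throws IOException {
        Path file = pathFor(key);
        Files.createDirectories(file.getParent());
        Files.write(file, value, StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    // Return everything stored for the key, or an empty array if the key is unknown.
    byte[] get(String key) throws IOException {
        Path file = pathFor(key);
        return Files.exists(file) ? Files.readAllBytes(file) : new byte[0];
    }

    private static String sha1Hex(String key) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-1")
                                         .digest(key.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }
}
```

Note that this plain concatenation loses the boundaries between individual values, and it has no caching or locking of its own, so it only covers the "add to the value" variant for a single writer.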
To summarize my questions:
Does Berkeley DB JE seem like a sensible choice given the above?
What kind of speed can I expect for operations like updates (insert, adding a new value for a key) and retrievals by key?
You could also try Chronicle Map or JetBrains Xodus, which are both embeddable Java key-value stores and much faster than Berkeley DB JE (if you are really looking for speed). Chronicle Map provides an easy-to-use java.util.Map interface.
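Because the store is exposed as a java.util.Map, the "append to the value for a key" operation can be written against the Map interface alone. The following is a minimal sketch under that assumption: it uses a plain `HashMap` as a stand-in and does not show how a Chronicle Map or Xodus instance is actually created or configured. The class name `AppendToValue` and the key `doc-1` are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: append-to-value written against java.util.Map only.
final class AppendToValue {

    // Return a new array holding a followed by b.
    static byte[] concat(byte[] a, byte[] b) {
        byte[] out = Arrays.copyOf(a, a.length + b.length);
        System.arraycopy(b, 0, out, a.length, b.length);
        return out;
    }

    // Store the value if the key is new, otherwise concatenate onto the existing value.
    static void append(Map<String, byte[]> store, String key, byte[] value) {
        store.merge(key, value, AppendToValue::concat);
    }

    public static void main(String[] args) {
        // HashMap is only a stand-in for a persistent Map implementation.
        Map<String, byte[]> store = new HashMap<>();
        append(store, "doc-1", new byte[]{1, 2});
        append(store, "doc-1", new byte[]{3});
        System.out.println(Arrays.toString(store.get("doc-1"))); // prints [1, 2, 3]
    }
}
```

With a persistent map each append rewrites the whole value, so this is likely to get slower as the documents for a single key grow. It would be worth measuring with realistic value sizes before committing to it.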
BerkeleyDB sounds sensible. Cassandra would also be sensible but perhaps is overkill if you don't need redundancy, clustering etc.
That said, a single Cassandra node can handle 20k writes per second (provided that you use multiple clients to exploit the high concurrency within Cassandra) on relatively modest hardware.