Given the following requirements of a persistent key/value store:
And given this usage pattern:
What is the best on-disk data structure/algorithm given the requirements?
Can a custom implementation exceed the performance of LSM-based (Log Structured Merge) implementations (i.e. leveldb, rocksdb)?
Would a high performance custom implementation for these requirements also be considerably simpler in implementation?
While it might be possible to have better performance custom implementation for your requirements, a well-configured RocksDB should be able to beat most of such custom implementations in your case. Here is what I would config RocksDB:
First, since you don't have updates and deletes, it's best to compact all data into big files in RocksDB. Because RocksDB sort these keys in a customizable order, having some big files gives you faster read performance as binary search across some big files is faster than across multiple small files. Typically, having big files hurt the performance of compaction --- the process of reorganizing big part of the data in RocksDB such that 1. multiple updates associated with a single key will be merged; 2. execute deletions to free disk space; 3. keep the data sorted. However, since you don't have updates and deletes, having big files give you fast read and write performance.
Second, specify large bits for bloom filter, this allow you to avoid most IOs when you're likely to issue some queries where keys do not exist in RocksDB.
So a read request goes like this. First, it compares the query key with the key ranges of those big files to identify which file the query key might located. Then, once the file(s) which key-range covers the query key, it will computes the bloom bits of the query key to see if the query key may exist in those files. If so, then a binary search inside each file will be triggered to identify the matched data.
As for scan operation, since RocksDB internally sort data in a user customizable order, scan can be done very efficiently using RocksDB's iterator.
More information about RocksDB basics can be found in here.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With