I was reading about skip lists and MemSQL and was wondering why skip lists are not more widely used in databases? Are there any major disadvatages to using skiplists?
Finally, you shouldn't store credit card information in your database unless you absolutely need to. This includes credit card owner names, numbers, CVV numbers, and expiration dates.
A data lake is a large storage repository that holds a huge amount of raw data in its original format until you need it.
If you had no concerns about storage costs (managed MongoDB Atlas is a bit pricey), MongoDB is a clear winner for storing end-of-day OHLC data.
A non-relational database is a database that does not use the tabular schema of rows and columns found in most traditional database systems. Instead, non-relational databases use a storage model that is optimized for the specific requirements of the type of data being stored.
Databases are typically so huge that they have to be stored in external memory, such as a giant disk drive. As a result, the bottleneck in most database applications is the number of times that we have to do a memory transfer from the disk drive into main memory.
B-trees and their variants are specifically designed to minimize the number of block reads and writes necessary to perform each of their operations. Mathematically, the number of memory transfers required for each B-tree operation is O(log n / log B), where B is the block size. Compare this to a skiplist, which requires O(log n) memory transfers on expectation. Since B is usually measured in megabytes, log B can be in the neighborhood of 15 - 25, so the B-tree can be significantly faster. Even when the database is in main memory, the effect of the memory hierarchy (L1 and L2 caches, etc.) can be so pronounced that B-tree variants are still faster in practice than many other data structures. This Google blog post gives some background on this.
Although each operation on a B-tree typically requires more CPU work than corresponding operations in other data structures, the fact that they require so few memory transfers tends to make them significantly faster in practice than other data structures. Therefore, it would not be advisable to use a skip list in a database.
There's another reason B-trees are nice: they're worst-case efficient. Although deterministic skip lists do exist, most skiplist implementations are randomized and give expected guarantees on their behavior. In a database, this might be unacceptable because many use cases on databases require worst-case efficient behavior.
Hope this helps!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With