Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why are skip lists not preferred over B+-trees for databases?

I was reading about skip lists and MemSQL and was wondering why skip lists are not more widely used in databases? Are there any major disadvatages to using skiplists?

like image 774
lucidxistence Avatar asked Feb 17 '14 12:02

lucidxistence


People also ask

What type of data you should not store in the database?

Finally, you shouldn't store credit card information in your database unless you absolutely need to. This includes credit card owner names, numbers, CVV numbers, and expiration dates.

Which storage method is most appropriate for data in its raw format?

A data lake is a large storage repository that holds a huge amount of raw data in its original format until you need it.

Which database is most suitable for representing financial details?

If you had no concerns about storage costs (managed MongoDB Atlas is a bit pricey), MongoDB is a clear winner for storing end-of-day OHLC data.

What is nonrelational database?

A non-relational database is a database that does not use the tabular schema of rows and columns found in most traditional database systems. Instead, non-relational databases use a storage model that is optimized for the specific requirements of the type of data being stored.


1 Answers

Databases are typically so huge that they have to be stored in external memory, such as a giant disk drive. As a result, the bottleneck in most database applications is the number of times that we have to do a memory transfer from the disk drive into main memory.

B-trees and their variants are specifically designed to minimize the number of block reads and writes necessary to perform each of their operations. Mathematically, the number of memory transfers required for each B-tree operation is O(log n / log B), where B is the block size. Compare this to a skiplist, which requires O(log n) memory transfers on expectation. Since B is usually measured in megabytes, log B can be in the neighborhood of 15 - 25, so the B-tree can be significantly faster. Even when the database is in main memory, the effect of the memory hierarchy (L1 and L2 caches, etc.) can be so pronounced that B-tree variants are still faster in practice than many other data structures. This Google blog post gives some background on this.

Although each operation on a B-tree typically requires more CPU work than corresponding operations in other data structures, the fact that they require so few memory transfers tends to make them significantly faster in practice than other data structures. Therefore, it would not be advisable to use a skip list in a database.

There's another reason B-trees are nice: they're worst-case efficient. Although deterministic skip lists do exist, most skiplist implementations are randomized and give expected guarantees on their behavior. In a database, this might be unacceptable because many use cases on databases require worst-case efficient behavior.

Hope this helps!

like image 102
templatetypedef Avatar answered Sep 19 '22 15:09

templatetypedef