Is an index not similar to a dictionary? If you have the key, you can immediately access it?
Apparently indexes are sometimes stored as B-Trees... why is that?
Logarithmic time complexity log(n): Represented in Big O notation as O(log n), when an algorithm has O(log n) running time, it means that as the input size grows, the number of operations grows very slowly. Example: binary search.
It depends on the constants involved. For sufficiently large inputs the logarithmic time algorithm will be faster than the linear time one, but that might not matter if your inputs tend to be small and speed really matters.
O(log N) basically means time goes up linearly while the n goes up exponentially. So if it takes 1 second to compute 10 elements, it will take 2 seconds to compute 100 elements, 3 seconds to compute 1000 elements, and so on. It is O(log n) when we do divide and conquer type of algorithms e.g binary search.
In Computer Science, it's often base 2. This is because many divide and conquer algorithms that exhibit this kind of complexity are dividing the problem in two at each step.
Dictionaries are not implicitly sorted, B-Tree
s are.
A B-Tree
index can be used for ranged access, like this:
WHERE col1 BETWEEN value1 AND value2
or ordering, like this:
ORDER BY col1
You cannot immediately access a page in a B-Tree
index: you need to traverse the child pages whose number increases logarithmically.
Some databases support dictionary-type indexes as well, namely, HASH
indexes, in which case the search time is constant. But such indexes cannot be used for ranged access or ordering.
Database Indices are usually (almost always) stored as B-Trees. And all balanced tree structures have O(log n) complexity for searching.
'Dictionary' is an 'Abstract Data Type' (ADT), ie it is a functional description that does not designate an implementation. Some dictionaries could use a Hashtable for O(1) lookup, others could use a tree and achieve O(log n).
The main reason a DB uses B-trees (over any other kind of tree) is that B-trees are self-balancing and are very 'shallow' (requiring little disk I/O)
One of the only data structures you can access immediately with a key is a vector, which for a massive amount of data, becomes inefficient when you start inserting and removing elements. It also needs contiguous memory allocation.
A hash can be efficient but needs more space and will end up with collisions.
A B tree has a good balance between search performance and space.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With