Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Balanced binary trees versus indexed skiplists

Not sure if the question should be here or on programmers (or some other SE site), but I was curious about the relevant differences between balanced binary trees and indexable skiplists. The issue came up in the context of this question. From the wikipedia:

Skip lists are a probabilistic data structure that seem likely to supplant balanced trees as the implementation method of choice for many applications. Skip list algorithms have the same asymptotic expected time bounds as balanced trees and are simpler, faster and use less space.

Don't the space requirements of a skiplist depend on the depth of the hierarchy? And aren't binary trees easier to use, at least for searching (granted, insertion and deletion in balanced BSTs can be tricky)? Are there other advantages/disadvantages to skiplists?

like image 394
angelatlarge Avatar asked Apr 01 '13 17:04

angelatlarge


People also ask

What is the difference between a tree a binary tree and a balanced binary tree?

B-Tree : B-Tree is known as a self-balancing tree as its nodes are sorted in the inorder traversal. Unlike the binary trees, in B-tree, a node can have more than two children. B-tree has a height of logM N (Where 'M' is the order of tree and N is the number of nodes).

Why is it better when a binary search tree is balanced?

Balancing the tree makes for better search times O(log(n)) as opposed to O(n). Save this answer. Show activity on this post. As we know that most of the operations on Binary Search Trees proportional to height of the Tree, So it is desirable to keep height small.

What is the difference between balanced and un balanced trees?

A balanced binary tree is one in which no leaf nodes are 'too far' from the root. For example, one definition of balanced could require that all leaf nodes have a depth that differ by at most 1. An unbalanced binary tree is one that is not balanced.

What is balanced binary tree?

A balanced binary tree, also referred to as a height-balanced binary tree, is defined as a binary tree in which the height of the left and right subtree of any node differ by not more than 1.


1 Answers

(Some parts of your question (ease of use, simplicity, etc.) are a bit subjective and I'll answer them at the end of this post.)

Let's look at space usage. First, let's suppose that you have a binary search tree with n nodes. What's the total space usage required? Well, each node stores some data plus two pointers. You might also need some amount of information to maintain balance information. This means that the total space usage is

n * (2 * sizeof(pointer) + sizeof(data) + sizeof(balance information))

So let's think about an equivalent skiplist. You are absolutely right that the real amount of memory used by a skiplist depends on the heights of the nodes, but we can talk about the expected amount of space used by a skiplist. Typically, you pick the height of a node in a skiplist by starting at 1, then repeatedly flipping a fair coin, incrementing the height as long as you flip heads and stopping as soon as you flip tails. Given this setup, what is the expected number of pointers inside a skiplist?

An interesting result from probability theory is that if you have a series of independent events with probability p, you need approximately 1 / p trials (on expectation) before that event will occur. In our coin-flipping example, we're flipping a coin until it comes up tails, and since the coin is a fair coin (comes up heads with probability 50%), the expected number of trials necessary before we flip tails is 2. Since that last flip ends the growth, the expected number of times a node grows in a skiplist is 1. Therefore, on expectation, we would expect an average node to have only two pointers in it - one initial pointer and one added pointer. This means that the expected total space usage is

n * (2 * sizeof(pointer) + sizeof(data))

Compare this to the size of a node in a balanced binary search tree. If there is a nonzero amount of space required to store balance information, the skiplist will indeed use (on expectation) less memory than the balanced BST. Note that many types of balanced BSTs (e.g. treaps) require a lot of balance information, while others (red/black trees, AVL trees) have balance information but can hide that information in the low-order bits of its pointers, while others (splay trees) don't have any balance information at all. Therefore, this isn't a guaranteed win, but in many cases it will use space.

As to your other questions about simplicity, ease, etc: that really depends. I personally find the code to look up an element in a BST far easier than the code to do lookups in a skiplist. However, the rotation logic in balanced BSTs is often substantially more complicated than the insertion/deletion logic in a skiplist; try seeing if you can rattle off all possible rotation cases in a red/black tree without consulting a reference, or see if you can remember all the zig/zag versus zag/zag cases from a splay tree. In that sense, it can be a bit easier to memorize the logic for inserting or deleting from a skiplist.

Hope this helps!

like image 137
templatetypedef Avatar answered Oct 06 '22 10:10

templatetypedef