 

Intuition behind splay trees (self-balancing trees)

I am reading the basics of splay trees. The amortized cost of an operation is O(log n) over n operations. The rough idea is that when you access a node, you splay it, i.e. you move it to the root, so the next access to it is fast; and if the node was deep, splaying also improves the balance of the tree.

I don't understand how the tree can achieve amortized O(log n) for this sample input:

Say a tree of n nodes is already built. My next n operations are n reads. I access a deep node, say at depth n. This takes O(n). True, after this access the tree will become more balanced. But say every time I access the current deepest node. No such read will ever cost less than O(log n), so how can we ever compensate for the first costly O(n) operation and bring the amortized cost of each read down to O(log n)?

Thanks.

asked May 02 '11 by xyz


1 Answer

Assuming your analysis is correct and the operations are O(log(n)) per access and O(n) the first time...

If you always access the bottommost element (using some kind of worst-case oracle), a sequence of a accesses will take O(a*log(n) + n) total. The amortized cost per operation is therefore O((a*log(n) + n)/a) = O(log(n) + n/a), which is just O(log(n)) as the number of accesses a grows large.
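To see the arithmetic play out, here is a tiny numeric sketch (an illustrative cost model I am assuming, not a real splay tree): the first access costs n and each later access costs log2(n), so the average cost per access approaches log2(n) as a grows.

```python
import math

n = 1_000_000  # assumed tree size, purely for illustration

for a in (1, 10, 1_000, 100_000, 10_000_000):
    # One O(n) access, then O(log n) for each of the remaining a-1 accesses.
    total = n + (a - 1) * math.log2(n)
    print(f"a = {a:>10,}: average cost per access = {total / a:12,.2f}")
```

At a = 1 the average is the full 1,000,000; by a = 10,000,000 it has already dropped to within a fraction of log2(n) ≈ 19.93.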

This is the definition of asymptotic average-case performance, also called "amortized" performance. You are implicitly assuming that a single O(n) step means all steps are at least O(n); in fact, that one expensive step is only a constant amount of work in the long run. The O(...) notation hides what is really going on: taking the limit of [total amount of work]/[number of queries] = [average ("amortized") work per query].

"This will never be less than O(log n)."

It has to be, in order to get O(log n) average performance. To build intuition, this page may help: http://users.informatik.uni-halle.de/~jopsi/dinf504/chap4.shtml specifically the image http://users.informatik.uni-halle.de/~jopsi/dinf504/splay_example.gif -- while performing an O(n) access, you scrunch the searched path up towards the top of the tree. You only have a finite number of such O(n) operations to perform before the entire tree is balanced.
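To make the "scrunching" concrete, here is a minimal splay sketch (my own illustrative Python, assuming the standard zig / zig-zig / zig-zag restructuring; not code from the linked page). On a degenerate chain, splaying the deepest node roughly halves the depth of the nodes along the access path.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = self.parent = None

def rotate(x):
    """Rotate x above its parent p, preserving the BST order."""
    p, g = x.parent, x.parent.parent
    if p.left is x:            # right rotation: x rises from the left
        p.left = x.right
        if p.left:
            p.left.parent = p
        x.right = p
    else:                      # left rotation: x rises from the right
        p.right = x.left
        if p.right:
            p.right.parent = p
        x.left = p
    p.parent = x
    x.parent = g
    if g:                      # reattach x under x's old grandparent
        if g.left is p:
            g.left = x
        else:
            g.right = x

def splay(x):
    """Move x to the root; double rotations scrunch the access path."""
    while x.parent:
        p, g = x.parent, x.parent.parent
        if g is None:                          # zig: parent is the root
            rotate(x)
        elif (g.left is p) == (p.left is x):   # zig-zig: same-side chain
            rotate(p)
            rotate(x)
        else:                                  # zig-zag: opposite sides
            rotate(x)
            rotate(x)
    return x

# Build a degenerate right-spine chain 1..8, then splay the deepest node:
# the O(n) access halves the chain's depth instead of leaving it as-is.
root = Node(1)
cur = root
for k in range(2, 9):
    cur.right = Node(k)
    cur.right.parent = cur
    cur = cur.right
root = splay(cur)     # node 8 is now the root
```

The zig-zig case is what distinguishes splaying from naive rotate-to-root: rotating the parent first is what folds long one-sided paths in half, so only finitely many O(n) accesses are possible before the tree is balanced.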

Here's another way to think about it:

Consider an unbalanced binary search tree. You can spend O(n) time balancing it. Assuming you don't add elements to it*, it takes O(log(n)) amortized time per query to fetch an element. The balancing setup cost is included in the amortized cost, but as the equations above demonstrate, it is effectively a constant that disappears (is dwarfed) as the number of queries grows. (*if you do add elements to it, you need a self-balancing binary search tree, one of which is a splay tree)
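A hedged sketch of that one-time rebalance (the tuple encoding and helper names are my own, purely illustrative): flatten the tree with an O(n) in-order traversal, rebuild it perfectly balanced in O(n), and every later lookup costs O(log n).

```python
def inorder(node, out):
    """Collect keys in sorted order; node is (left, key, right) or None."""
    if node:
        inorder(node[0], out)
        out.append(node[1])
        inorder(node[2], out)

def build_balanced(keys, lo=0, hi=None):
    """Rebuild a perfectly balanced tree from the sorted key list."""
    if hi is None:
        hi = len(keys)
    if lo >= hi:
        return None
    mid = (lo + hi) // 2
    return (build_balanced(keys, lo, mid), keys[mid],
            build_balanced(keys, mid + 1, hi))

# Degenerate tree: a right-leaning chain of depth 4.
chain = (None, 1, (None, 2, (None, 3, (None, 4, None))))
keys = []
inorder(chain, keys)
balanced = build_balanced(keys)  # depth drops from n to about log2(n)
```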

answered Oct 23 '22 by ninjagecko