Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Random level function in skip list

Tags:

java

I am looking at skip list implementation in Java , and I am wondering the purpose of the following method:

public static int randomLevel() {
    int lvl = (int)(Math.log(1.-Math.random())/Math.log(1.-P));
    return Math.min(lvl, MAX_LEVEL);
}

And what the difference between the above method and

Random.nextInt(6);

Can anyone explain that? Thanks.

like image 508
Foredoomed Avatar asked Jan 15 '23 18:01

Foredoomed


2 Answers

Random.nextInt should provide a random variable whose probability distribution is (approximately) a discrete uniform distribution over the interval [0, 6). You can learn more about this here. http://puu.sh/XMwn

Note that internally Random uses a linear congruential generator where m = 2^48, a = 25214903917, and c = 11.


randomLevel instead (approximately) uses a geometric distribution where p = 0.5. You can learn more about the distribution here.

http://puu.sh/XMwT

Essentially, randomLevel returns 0 with probability 0.5, 1 with 0.25, 2 with 0.125, etc. until 6 with 0.5^7 i.e. *0.0078125** -- far different than the ~0.14 from Random.nextInt.


Now the importance of this is that a skip list is an inherently probabilistic data structure. By utilizing multiple sparse levels of linked lists, they can achieve average runtime performance of O(log n) search -- similar to a balanced binary search tree, but less complex and using less space. Using a uniform distribution here would not be appropriate, seeing how to as higher levels are less densely populated in comparison to lower ones (note: below, the levels grow downward) -- which is necessary for the fast searches.

like image 121
obataku Avatar answered Jan 25 '23 04:01

obataku


Just like the link says...

"This gives us a 50% chance of the random_level() function returning 0, a 25% chance of returning 1, a 12.5% chance of returning 2 and so on..." The distribution is therefore not even. However, Random.nextInt() is. There is an equal likelihood that any number between 0 and 5 will be selected.

I haven't looked at the full implementation, but what probably happens is that randomLevel() us used to select a number, say n. Then, the element that needs to be added to the skiplist will have pointers 0, 1,...,n. You can think of each level as a separate list.

Why use a distribution like this? Well an even distribution will require too much memory for the benefit that it will have. By reducing the chance using a geometric distribution, the "sweet" spot is attained. Now the advantage of obtaining a value quickly, with a smaller memory footprint is realised.

like image 24
Jaco Van Niekerk Avatar answered Jan 25 '23 05:01

Jaco Van Niekerk