I am looking at skip list implementation in Java , and I am wondering the purpose of the following method:
public static int randomLevel() {
int lvl = (int)(Math.log(1.-Math.random())/Math.log(1.-P));
return Math.min(lvl, MAX_LEVEL);
}
And what the difference between the above method and
Random.nextInt(6);
Can anyone explain that? Thanks.
Random.nextInt
should provide a random variable whose probability distribution is (approximately) a discrete uniform distribution over the interval [0, 6).
You can learn more about this here.
http://puu.sh/XMwn
Note that internally Random
uses a linear congruential generator where m = 2^48, a = 25214903917, and c = 11.
randomLevel
instead (approximately) uses a geometric distribution where p = 0.5. You can learn more about the distribution here.
http://puu.sh/XMwT
Essentially, randomLevel
returns 0
with probability 0.5, 1
with 0.25, 2
with 0.125, etc. until 6
with 0.5^7 i.e. *0.0078125** -- far different than the ~0.14 from Random.nextInt
.
Now the importance of this is that a skip list is an inherently probabilistic data structure. By utilizing multiple sparse levels of linked lists, they can achieve average runtime performance of O(log n) search -- similar to a balanced binary search tree, but less complex and using less space. Using a uniform distribution here would not be appropriate, seeing how to as higher levels are less densely populated in comparison to lower ones (note: below, the levels grow downward) -- which is necessary for the fast searches.
Just like the link says...
"This gives us a 50% chance of the random_level() function returning 0, a 25% chance of returning 1, a 12.5% chance of returning 2 and so on..." The distribution is therefore not even. However, Random.nextInt() is. There is an equal likelihood that any number between 0 and 5 will be selected.
I haven't looked at the full implementation, but what probably happens is that randomLevel() us used to select a number, say n. Then, the element that needs to be added to the skiplist will have pointers 0, 1,...,n. You can think of each level as a separate list.
Why use a distribution like this? Well an even distribution will require too much memory for the benefit that it will have. By reducing the chance using a geometric distribution, the "sweet" spot is attained. Now the advantage of obtaining a value quickly, with a smaller memory footprint is realised.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With