I implemented in Red-Black trees in Python according to the pseudo code in Cormen's Introduction to Algorithms
.
I wanted to see in my own eyes that my insert
is really O(logn)
so I plotted the time it takes to insert n=1, 10, 20, ..., 5000
nodes into the tree.
This is the result:
the x
-axis is n
and the y
-axis is the time it took in milliseconds.
To me the graph looks more linear than logarithmic. What can explain that?
Ok, so the graph displays a measurement of the cost of inserting n
elements into your tree, where the x axis is how many elements we've inserted, and the y axis is the total time.
Let's call the function that totals the time it takes to insert n elements into the tree f(n)
.
Then we can get a rough idea of what f
might look like:
f(1) < k*log(1) for some constant k.
f(2) < k*log(1) + k*log(2) for some constant k
...
f(n) < k * [log(1) + log(2) + ... + log(n)] for some constant k.
Due to how logs work, we can collapse log(1) + ... + log(n)
:
f(n) < k * [log(1*2*3*...*n)] for some constant k
f(n) < k * log(n!) for some constant k
We can take a look at Wikipedia to see a graph of what log(n!)
looks like. Take a look at the graph in the article. Should look pretty familiar to you. :)
That is, I think you've done this by accident:
for n in (5000, 50000, 500000):
startTime = ...
## .. make a fresh tree
## insert n elements into the tree
stopTime = ...
## record the tuple (n, stopTime - startTime) for plotting
and plotted total time to construct the tree of size n, rather than the individual cost of inserting one element into a tree of size n:
for n in range(50000):
startTime = ...
## insert an element into the tree
stopTime = ...
## record the tuple (n, stopTime - startTime) for plotting
Chris Taylor notes in the comments that if you plot f(n)/n
, you'll see a log graph. That's because a fairly tight approximation to log(n!)
is n*log(n)
(see the Wikipedia page). So we can go back to our bound:
f(n) < k * log(n!) for some constant k
and get:
f(n) < k * n * log(n) for some constant k
And now it's should be easier to see that if you divide f(n)
by n
, your graph will be bounded above by the shape of a logarithm.
5000
might not be large enough to really "see" the logarithm -- try runs at 50000
and 500000
. If it takes two seconds and twenty seconds, then linear growth makes sense. If it takes less, then logarithmic makes sense. If you zoom in closely enough on most "simple" functions, the results look pretty linear.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With