Time Complexity Calculation

I'm currently working through a few exam questions and got stuck at this point. I am given that a Quicksort algorithm has a time complexity of O(n log(n)). For a particular input size, the time to sort the list is 4 minutes. The question then asks how long it will take to sort a list of twice the size on the same system.

I have already ruled out the naive answer of 8 minutes (twice the input size = twice the duration, which is very wrong reasoning).

Some working I did:

Working A

  • 4 = n log(n)
  • 4 = log(n^n)
  • 16 = n^n (taking the logs base 2, so n^n = 2^4)
  • I basically got stuck at this point.

Working B

  • X = 2n log(2n) (2n, since the input size is doubled)
  • X = 2n(1 + log(n)) (log base 2, so log(2n) = 1 + log(n))
  • X = 2n + 2n log(n) (and n log(n) was 4 minutes)
  • X = 2n + 2(4) = 2n + 8
  • I once again got stuck at this point.
asked Jan 31 '16 by Xandru Mifsud

1 Answer

I think the first thing to notice about this problem is that, given it took 4 minutes to sort the numbers, n must be quite large. For instance, I just used quicksort to sort a billion numbers on my computer, and it took just under 3 minutes. So n is probably approximately 1 billion (give or take an order of magnitude).

Given that n is huge, it is likely reasonable to approximate this runtime as c*n*lg(n) for some constant c, since the lower-order terms of the runtime expansion shouldn't be too relevant for such a large n. If we double n, we get the following multiplier of runtime compared to the original runtime:

[Runtime(2n)] / [Runtime(n)]
= [c * (2n) * lg(2n)] / [c * n * lg(n)]
= 2 * lg(2n) / lg(n)
= 2 * log_n(2n)
= 2 * (1 + log_n(2))

Here, lg() is the logarithm in an arbitrary base (the base cancels in the ratio above) and log_n() is the log base n.
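Since the constant c cancels (and so does the log base), this multiplier is easy to sanity-check numerically. Here is a minimal Python sketch that computes the ratio both directly and via the closed form above; the names are just illustrative:

    import math

    def multiplier(n):
        # Ratio Runtime(2n) / Runtime(n) under the c*n*lg(n) model.
        # The constant c cancels, and so does the log base, so
        # natural logs are fine here.
        return (2 * n * math.log(2 * n)) / (n * math.log(n))

    def multiplier_closed_form(n):
        # The closed form derived above: 2 * (1 + log_n(2)).
        # As n grows, log_n(2) -> 0, so the multiplier -> 2.
        return 2 * (1 + math.log(2) / math.log(n))

    n = 10**9
    print(multiplier(n))              # ~2.0669
    print(multiplier_closed_form(n))  # same value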

First, since we assumed n is large, one possible way to proceed would be to approximate log_n(2) as 0, so the runtime multiplier would be approximated as 2 and the total runtime would be approximated as 8 minutes.

Alternately, since we probably know n to within an order of magnitude, another possibility would be to approximate the multiplier for a likely value of n:

  • If n = 100 million, then we would approximate the multiplier as 2.075 and the total runtime as 8.30 minutes.
  • If n = 1 billion, then we would approximate the multiplier as 2.067 and the total runtime as 8.27 minutes.
  • If n = 10 billion, then we would approximate the multiplier as 2.060 and the total runtime as 8.24 minutes.

Note here that huge changes in our approximation of n yield pretty small changes in the approximation of the total runtime.
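For what it's worth, those bullet values follow directly from the formula above; a quick Python sketch to reproduce them:

    import math

    base_minutes = 4.0  # given: sorting the original list takes 4 minutes

    for n in (10**8, 10**9, 10**10):
        m = 2 * (1 + math.log(2) / math.log(n))  # 2 * (1 + log_n(2))
        print("n = %.0e: multiplier ~ %.3f, runtime ~ %.2f minutes"
              % (n, m, base_minutes * m))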

It's worth noting that, while this looks nice on paper, in practice architectural considerations can cause real-life runtimes to be much different from the ones we've calculated here. For instance, if the algorithm induces a bunch of paging after doubling the data size, the runtime could be much, much higher than the ~8 minutes we've approximated here.
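If you want to see how large those effects are on a particular machine, the simplest thing is to measure. A rough sketch, using Python's built-in sort as a stand-in for quicksort (both are O(n log n) on random data), with the size scaled down so it finishes quickly:

    import random
    import time

    def sort_seconds(n):
        # Time sorting n random floats with the built-in sort.
        data = [random.random() for _ in range(n)]
        start = time.perf_counter()
        sorted(data)
        return time.perf_counter() - start

    n = 10**7  # scaled down from ~1 billion so this runs in seconds
    ratio = sort_seconds(2 * n) / sort_seconds(n)
    print("observed ratio: %.2f" % ratio)
    # The formula predicts ~2.09 for n = 10^7; caching and paging
    # can push the observed ratio noticeably away from that.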

answered Oct 31 '22 by josliber