
Automated computation of algorithm time complexity for terminating algorithms

There are a lot of related questions here on SO, but they all ask about writing a program to compute the complexity of arbitrary algorithms (which is obviously undecidable). I am willing to make the following restrictions on the input:

  1. The algorithm terminates
  2. The algorithm is purely functional

The question is, can a program be written to compute the time complexity of such an algorithm through static analysis? If the input algorithm does not terminate, the program behaviour is undefined (it may crash, return a lie, or fail to terminate).

asked Nov 14 '12 by singpolyma

2 Answers

You can't be 100% sure you're getting a correct answer from any technique that estimates complexity from real-world running times. The exact running time can be a very complicated function, so in theory it could follow any other function until the input size exceeds some enormous threshold; the running time only needs to tend towards (some multiple of) the complexity function as the input size tends towards infinity. This assumes you want a tight bound (which exists for many, but not all, algorithms) and not just an upper or lower bound.

But you can come up with some reasonable estimate of the complexity that should generally be quite accurate.

Also note that a number of algorithms have different running times for different inputs of the same size. You could run the procedure below for a few different inputs of the same size and average the results to mitigate this; averaging also helps smooth out system conditions that affect the running time. Note, however, that you may not be able to estimate worst-case or best-case complexity unless you know which specific inputs trigger those cases, as they may be too rare to hit when passing in random data.

How to do this:

Record the time for some sufficiently large and sufficiently different-sized inputs (e.g. input sizes equal to different powers of 10, like 100, 1000 and 10000; these should be big enough for the algorithm to run for at least a few seconds so the data is less noisy). Let's use 3 input sizes. Strictly speaking you only need 2 input sizes, but 3 or more serve as additional verification. A rough timing sketch follows.
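For example, a minimal timing sketch in Python (the `algorithm` and `make_input` arguments are placeholders for whatever function you are measuring and however you build an input of a given size):

```python
import time

def measure(algorithm, make_input, sizes, repeats=3):
    """Time `algorithm` on inputs of each size, averaging a few runs."""
    results = {}
    for n in sizes:
        total = 0.0
        for _ in range(repeats):
            data = make_input(n)                  # fresh input each run
            start = time.perf_counter()
            algorithm(data)
            total += time.perf_counter() - start
        results[n] = total / repeats              # average to smooth out noise
    return results

# e.g. measure(my_algorithm,
#              lambda n: [random.random() for _ in range(n)],
#              [100, 1000, 10000])
```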

Now we can try to map these 3 results we got to one of some set of complexities like O(1), O(log(n)), O(sqrt(n)), O(n), O(n log n), O(n²), O(n³), etc.

If you're trying to match it manually, you could put the running times you got on a graph along with the graphs of each of the above function (scaled appropriately) and see which one matches best.
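For instance, a rough plotting sketch (assuming matplotlib is available and `times` is a dict of size → seconds like the one produced above; each candidate curve is scaled to pass through the largest measured point):

```python
import math
import matplotlib.pyplot as plt

def plot_candidates(times):
    """Plot measured times against scaled candidate complexity curves."""
    sizes = sorted(times)
    biggest = sizes[-1]
    candidates = {
        "O(log n)":   lambda n: math.log10(n),
        "O(n)":       lambda n: n,
        "O(n log n)": lambda n: n * math.log10(n),
        "O(n^2)":     lambda n: n ** 2,
    }
    plt.plot(sizes, [times[n] for n in sizes], "ko-", label="measured")
    for name, f in candidates.items():
        k = times[biggest] / f(biggest)   # scale to match the largest point
        plt.plot(sizes, [k * f(n) for n in sizes], label=name)
    plt.xlabel("input size")
    plt.ylabel("running time (s)")
    plt.legend()
    plt.show()
```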

If you're trying to automate it, you can fit each candidate function to the measured times and check how closely its predictions match.

There are better ways to do this, but one really simple way to do it would be as follows:

Say you have these running times:

input size   running time
100          21 seconds
1000         29 seconds
10000        40 seconds

Now you can try to match one of those (say the biggest one, which is likely to be the most accurate) to a multiple of one of the above functions.

O(n):     k x n     = k x 10000     = 40,    k = 40 / 10000     = 0.004
O(log n): k x log n = k x log 10000 = 40,    k = 40 / log 10000 = 10
O(n²):    k x n²    = k x 10000²    = 40,    k = 40 / 10000²    = 0.0000004

Now compare what the equation gives you to what the actual running time is for the other input sizes:

For n = 1000, actual running time = 29 seconds
O(n):     0.004 x 1000      = 4 seconds
O(log n): 10 x log 1000     = 30 seconds
O(n²):    0.0000004 x 1000² = 0.4 seconds

For n = 100, actual running time = 21 seconds
O(n):     0.004 x 100      = 0.4 seconds
O(log n): 10 x log 100     = 20 seconds
O(n²):    0.0000004 x 100² = 0.004 seconds

Looking at this, we can clearly see that O(log n) is the closest, with the actual and predicted running times differing by only 1 second in both cases. So that would be our guess for the complexity.
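That comparison can be automated in the same spirit. A minimal sketch (the candidate set is an assumption, and log base 10 is used so the numbers match the worked example above):

```python
import math

def guess_complexity(times):
    """Fit each candidate through the largest measurement, then score it
    by its total prediction error on the remaining measurements."""
    sizes = sorted(times)
    biggest = sizes[-1]
    candidates = {
        "O(log n)":   lambda n: math.log10(n),
        "O(n)":       lambda n: n,
        "O(n log n)": lambda n: n * math.log10(n),
        "O(n^2)":     lambda n: n ** 2,
    }
    best_name, best_error = None, float("inf")
    for name, f in candidates.items():
        k = times[biggest] / f(biggest)    # solve for k from the largest point
        error = sum(abs(k * f(n) - times[n]) for n in sizes[:-1])
        if error < best_error:
            best_name, best_error = name, error
    return best_name

print(guess_complexity({100: 21, 1000: 29, 10000: 40}))   # -> "O(log n)"
```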

answered Oct 22 '22 by Bernhard Barker


I finally asked in the right place, and got the answer. No.

https://cstheory.stackexchange.com/questions/14969/algorithmically-compute-a-reasonable-bound-on-the-runtime-of-an-algorithm#comment40524_14969

answered Oct 22 '22 by singpolyma