Determining the complexity of given code

Given a snippet of code, how do you determine its complexity in general? I find myself getting very confused with Big O questions. For example, a very simple question:

for (int i = 0; i < n; i++) {
    for (int j = 0; j < n; j++) {
        System.out.println("*");
    }
}

The TA explained this with something like combinations. Like this is n choose 2 = n(n-1)/2 = 0.5n^2 - 0.5n, then you remove the constant factor and the lower-order term so it becomes n^2. I can put in test values and try, but how does this combination thing come in?

What if there's an if statement? How is the complexity determined?

for (int i = 0; i < n; i++) {
    if (i % 2 == 0) {
        for (int j = i; j < n; j++) { ... }
    } else {
        for (int j = 0; j < i; j++) { ... }
    }
}

Then what about recursion ...

int fib(int a, int b, int n) {
    if (n == 3) {
        return a + b;
    } else {
        return fib(b, a+b, n-1);
    }
}
asked Oct 25 '11 by Jiew Meng


2 Answers

In general, there is no way to determine the complexity of a given function

Warning! Wall of text incoming!

1. There are very simple algorithms that no one knows whether they even halt or not.

There is no algorithm that can decide whether a given program halts or not, if given a certain input. Calculating the computational complexity is an even harder problem since not only do we need to prove that the algorithm halts but we need to prove how fast it does so.

//The Collatz conjecture states that the sequence generated by the following
//algorithm always reaches 1, for any initial positive integer. It has been
//an open problem for 70+ years now.
function col(n){
    if (n == 1){
        return 0;
    }else if (n % 2 == 0){ //even
        return 1 + col(n/2);
    }else{ //odd
        return 1 + col(3*n + 1);
    }
}

2. Some algorithms have weird and off-beat complexities

A general "complexity determining scheme" would easily get too complicated because of these guys.

//The Ackermann function. One of the first examples of a non-primitive-recursive algorithm.
function ack(m, n){
    if(m == 0){
        return n + 1;
    }else if( n == 0 ){
        return ack(m-1, 1);
    }else{
        return ack(m-1, ack(m, n-1));
    }
}

function f(n){ return ack(n, n); }

//f(1) = 3
//f(2) = 7
//f(3) = 61
//f(4) takes longer than your wildest dreams to terminate.

3. Some functions are very simple but will confuse many kinds of static analysis attempts

//McCarthy's 91 function. Try guessing what it does without
//running it or reading the Wikipedia page ;)
function f91(n){
    if(n > 100){
        return n - 10;
    }else{
        return f91(f91(n + 11));
    }
}

That said, we still need a way to find the complexity of stuff, right? For loops are a simple and common pattern. Take this variation of your initial example:

for(i=0; i<N; i++){
   for(j=0; j<i; j++){
       print something
   }
}

Since each print something is O(1), the time complexity of the algorithm will be determined by how many times we run that line. Well, as your TA mentioned, we do this by looking at the combinations in this case. The inner loop will run (0 + 1 + ... + (N-1)) times in total, which is N(N-1)/2.

Since we disregard constant factors and lower-order terms, we get O(N^2).
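If you want to see that count come out of the code itself, here is a small instrumented version (a Java sketch of my own, not part of the original answer) that tallies the inner-loop executions and checks them against the closed form:

// A minimal sanity check: count how many times the inner body runs
// and compare against the closed form N*(N-1)/2.
public class CountLoop {
    public static void main(String[] args) {
        for (int n : new int[]{1, 5, 10, 100}) {
            long counter = 0;
            for (int i = 0; i < n; i++) {
                for (int j = 0; j < i; j++) {
                    counter++; // stands in for "print something"
                }
            }
            System.out.println(n + ": " + counter + " == " + ((long) n * (n - 1) / 2));
        }
    }
}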

Now for the more tricky cases we can get more mathematical. Try to create a function whose value represents how long the algorithm takes to run, given the size N of the input. Often we can construct a recursive version of this function directly from the algorithm itself, so calculating the complexity becomes the problem of putting bounds on that function. We call this function a recurrence.

For example:

function fib_like(n){
    if(n <= 1){
        return 17;
    }else{
        return 42 + fib_like(n-1) + fib_like(n-2);
    }
}

it is easy to see that the running time, in terms of N, will be given by

T(N) = 1                    if (N <= 1)
T(N) = T(N-1) + T(N-2)      otherwise

Well, T(N) is just the good-old Fibonacci function. We can use induction to put some bounds on that.

For example, let's prove, by induction, that T(N) <= 2^N for all N (i.e., T(N) is O(2^N)):

  • base case (N = 0 or N = 1):

    T(0) = 1 <= 1 = 2^0
    T(1) = 1 <= 2 = 2^1

  • inductive case (N > 1):

    T(N) = T(N-1) + T(N-2)
    applying the inductive hypothesis to T(N-1) and T(N-2)...
    T(N) <= 2^(N-1) + 2^(N-2)
    so...
    T(N) <= 2^(N-1) + 2^(N-1)
         <= 2^N

(we can try doing something similar to prove the lower bound too)

In most cases, having a good guess on the final runtime of the function will allow you to easily solve recurrence problems with an induction proof. Of course, this requires you to be able to guess first - only lots of practice can help you here.
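One way to build that intuition is to check a guess numerically before trying to prove it. Here is a small sketch (in Java, my addition) that evaluates the recurrence above bottom-up and watches the ratio against the guessed 2^N bound:

// Evaluate T(N) = T(N-1) + T(N-2), T(0) = T(1) = 1, iteratively,
// and print T(N) / 2^N. The ratio stays below 1 (and in fact tends
// to 0), consistent with the O(2^N) bound proved by induction.
public class BoundCheck {
    public static void main(String[] args) {
        long prev = 1, curr = 1; // T(0) and T(1)
        for (int n = 2; n <= 40; n++) {
            long next = prev + curr;
            prev = curr;
            curr = next;
            if (n % 10 == 0) {
                System.out.printf("T(%d) / 2^%d = %.8f%n", n, n, curr / Math.pow(2, n));
            }
        }
    }
}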

And as a final note, I would like to point out the Master theorem, the only rule for more difficult recurrence problems I can think of right now that is commonly used. Use it when you have to deal with a tricky divide-and-conquer algorithm.
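For instance (a worked illustration I am adding here, not part of the original answer): the Master theorem handles recurrences of the form T(n) = a*T(n/b) + f(n), comparing f(n) against n^(log_b a). Take the classic divide-and-conquer recurrence

T(n) = 2*T(n/2) + n

Here a = 2 and b = 2, so n^(log_2 2) = n, which matches f(n) = n exactly. That is the "balanced" case of the theorem, and it yields T(n) = O(n log n), the familiar mergesort bound, and the same answer that the recurrence at the end of the next answer works out by hand.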


Also, in your "if case" example, I would solve that by cheating and splitting it into two separate loops that don't have an if inside.

for (int i = 0; i < n; i++) {
    if (i % 2 == 0) {
        for (int j = i; j < n; j++) { ... }
    } else {
        for (int j = 0; j < i; j++) { ... }
    }
}

Has the same runtime as

for (int i = 0; i < n; i += 2) {
    for (int j = i; j < n; j++) { ... }
}

for (int i = 1; i < n; i += 2) {
    for (int j = 0; j < i; j++) { ... }
}

And each of the two parts can be easily seen to be O(N^2) for a total that is also O(N^2).
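To spell that out with the same counting argument as before (my arithmetic, not in the original answer): for each even i the first loop's inner body runs (n - i) times, and for each odd i the second loop's inner body runs i times, so the totals are roughly

n + (n-2) + (n-4) + ...  ≈ n^2/4    (first loop)
1 + 3 + 5 + ...          ≈ n^2/4    (second loop)

about n^2/2 iterations combined, which is O(N^2) once the constant factor is dropped.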

Note that I used a nice trick to get rid of the "if" here. There is no general rule for doing so, as shown by the Collatz algorithm example.

answered by hugomg


In general, deciding algorithm complexity is theoretically impossible.

However, one cool and code-centric method for doing it is to actually just think in terms of programs directly. Take your example:

for (int i = 0; i < n; i++) {
    for (int j = 0; j < n; j++) {
        System.out.println("*");
    }
}

Now we want to analyze its complexity, so let's add a simple counter that counts the number of executions of the inner line:

int counter = 0;
for (int i = 0; i < n; i++) {
    for (int j = 0; j < n; j++) {
        System.out.println("*");
        counter++;
    }
}

Because the System.out.println line doesn't really matter, let's remove it:

int counter = 0;
for (int i = 0; i < n; i++) {
    for (int j = 0; j < n; j++) {
        counter++;
    }
}

Now that we have only the counter left, we can obviously simplify the inner loop out:

int counter = 0;
for (int i = 0; i < n; i++) {
    counter += n;
}

... because we know that the increment is run exactly n times. And now we see that counter is incremented by n exactly n times, so we simplify this to:

int counter = 0;
counter += n * n;

And we emerged with the (correct) O(n^2) complexity :) It's there in the code :)

Let's look how this works for a recursive Fibonacci calculator:

int fib(int n) {
  if (n < 2) return 1;
  return fib(n - 1) + fib(n - 2);
}

Change the routine so that it returns the number of iterations spent inside it instead of the actual Fibonacci numbers:

int fib_count(int n) {
  if (n < 2) return 1;
  return fib_count(n - 1) + fib_count(n - 2);
}

It's still Fibonacci! :) So we know now that the recursive Fibonacci calculator is of complexity O(F(n)) where F is the Fibonacci number itself.
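To see that claim concretely, here is a quick Java transcription (my addition) of the two routines side by side. Since they satisfy the same recurrence with the same base case, the work count is itself a Fibonacci number:

// fib computes the value; fibCount counts the units of work.
// They have identical structure, so fibCount(n) == fib(n) for every n.
public class FibCountDemo {
    static int fib(int n) {
        if (n < 2) return 1;
        return fib(n - 1) + fib(n - 2);
    }
    static int fibCount(int n) {
        if (n < 2) return 1;
        return fibCount(n - 1) + fibCount(n - 2);
    }
    public static void main(String[] args) {
        for (int n = 0; n <= 10; n++) {
            System.out.println(n + ": fib = " + fib(n) + ", work = " + fibCount(n));
        }
    }
}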

Ok, let's look at something more interesting, say, a simple (and inefficient) mergesort:

void mergesort(Array a, int from, int to) {
  if (from >= to - 1) return;
  int m = (from + to) / 2;
  /* Recursively sort halves */
  mergesort(a, from, m);
  mergesort(a, m, to);
  /* Then merge */
  Array b = new Array(to - from);
  int i = from;
  int j = m;
  int ptr = 0;
  while (i < m || j < to) {
    if (i == m || (j < to && a[j] < a[i])) {
      b[ptr] = a[j++];
    } else {
      b[ptr] = a[i++];
    }
    ptr++;
  }
  for (i = from; i < to; i++)
    a[i] = b[i - from];
}

Because we are not interested in the actual result but the complexity, we change the routine so that it actually returns the number of units of work carried out:

int mergesort(Array a, int from, int to) {
  if (from >= to - 1) return 1;
  int m = (from + to) / 2;
  /* Recursively sort halves */
  int count = 0;
  count += mergesort(a, from, m);
  count += mergesort(a, m, to);
  /* Then merge */
  Array b = new Array(to - from);
  int i = from;
  int j = m;
  int ptr = 0;
  while (i < m || j < to) {
    if (i == m || (j < to && a[j] < a[i])) {
      b[ptr] = a[j++];
    } else {
      b[ptr] = a[i++];
    }
    ptr++;
    count++;
  }
  for (i = from; i < to; i++) {
    count++;
    a[i] = b[i - from];
  }
  return count;
}

Then we remove those lines that do not actually impact the counts and simplify:

int mergesort(Array a, int from, int to) {
  if (from >= to - 1) return 1;
  int m = (from + to) / 2;
  /* Recursively sort halves */
  int count = 0;
  count += mergesort(a, from, m);
  count += mergesort(a, m, to);
  /* Then merge */
  count += to - from;
  /* Copy the array */
  count += to - from;
  return count;
}

Still simplifying a bit:

int mergesort(Array a, int from, int to) {
  if (from >= to - 1) return 1;
  int m = (from + to) / 2;
  int count = 0;
  count += mergesort(a, from, m);
  count += mergesort(a, m, to);
  count += (to - from) * 2;
  return count;
}

We can now actually dispense with the array:

int mergesort(int from, int to) {
  if (from >= to - 1) return 1;
  int m = (from + to) / 2;
  int count = 0;
  count += mergesort(from, m);
  count += mergesort(m, to);
  count += (to - from) * 2;
  return count;
}

We can now see that actually the absolute values of from and to do not matter any more, but only their distance, so we modify this to:

int mergesort(int d) {
  if (d <= 1) return 1;
  int count = 0;
  count += mergesort(d / 2);
  count += mergesort(d / 2);
  count += d * 2;
  return count;
}

And then we get to:

int mergesort(int d) {
  if (d <= 1) return 1;
  return 2 * mergesort(d / 2) + d * 2;
}

Here, obviously, d on the first call is the size of the array to be sorted, so you have the recurrence for the complexity M(x) (it is in plain sight on the second line :))

M(x) = 2(M(x/2) + x) 

and this you need to solve in order to get a closed-form solution. The easiest way is to guess the solution M(x) = x log x and verify it against the right-hand side:

2 (x/2 log(x/2) + x)
    = x log(x/2) + 2x
    = x (log x - log 2 + 2)
    = x (log x + C)        (where C = 2 - log 2 is a constant)

and verify it is asymptotically equivalent to the left side:

x log x + Cx
------------ = 1 + [Cx / (x log x)] = 1 + [C / log x] --> 1 + 0 = 1.
  x log x
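And if you would rather check that numerically than algebraically, here is a Java transcription (my addition) of the final stripped-down routine. Since M(d) = d*(1 + 2*log2 d) for powers of two, the ratio M(d) / (d log2 d) should creep toward 2:

// The final counting routine from above, transcribed to Java.
// M(d) = 2*M(d/2) + 2d with M(1) = 1; for d a power of two this
// solves to d*(1 + 2*log2(d)), so M(d) / (d log2 d) approaches 2.
public class MergesortCount {
    static long m(long d) {
        if (d <= 1) return 1;
        return 2 * m(d / 2) + d * 2;
    }
    public static void main(String[] args) {
        for (int k = 4; k <= 24; k += 4) {
            long d = 1L << k; // d = 2^k
            double ratio = m(d) / (d * (Math.log(d) / Math.log(2)));
            System.out.println("d = 2^" + k + ": M(d)/(d log2 d) = " + ratio);
        }
    }
}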
answered by Antti Huima