
What is the runtime complexity (big O) of the following pseudocode?

I recently had a very, very intense debate about the runtime complexity of a super simple algorithm with a colleague of mine. In the end we both agreed to disagree but as I've been thinking about this, it's challenged my basic understanding of computer science fundamentals and so I therefore must get additional insight on the matter.

Given the following python, what is the Big-O runtime complexity:

for c in "How are you today?":
    print(c)

Now, I immediately called out that this is simply O(n), a.k.a. linear: the running time depends on the length of the string, so the work done by this loop grows linearly as the length of the string grows.
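One quick way to make the linear growth visible (a sketch of mine, not code from the actual debate; the helper name `count_iterations` is made up) is to count how many times the loop body runs for strings of different lengths:

```python
# Sketch: the loop body runs exactly once per character, so the
# iteration count grows in direct proportion to the string length.
def count_iterations(s):
    count = 0
    for c in s:
        count += 1  # stand-in for print(c)
    return count

for length in (10, 100, 1000):
    print(length, count_iterations("x" * length))
```

Doubling the string length doubles the count, which is exactly what "linear" means.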

My colleague then said, "No, it's constant because we know that for the set of all strings we are dealing with (in our case), the max string is always 255 characters long (in our case), therefore it must be constant." He followed on by saying "because we have a max upper-bound on character length of the string this results in O(255) which reduces to O(1)."

Anyways, we went back and forth, and after 45 minutes of both of us drawing sketches we both dead-locked on the issue.

My question is in what world or what math system is the loop above a constant time loop? If we knew our upper-bound was say 1,000,000 characters and the set of all strings could be anywhere from 0 to 1,000,000 this loop will obviously exhibit linear running times depending on the size of the string.

I additionally asked him if he also thinks the following code is O(1) if the upper-bound size of n is known. Meaning we are certain this code will only ever operate on an upper bound of, say, 255 characters:

s = "How are you today?"
for c in s:
    for d in s:
        print(c + d)

He said this is also constant time... even after I explained that this is an O(n^2) algorithm and demonstrated that this code would produce a quadratic curve.
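That quadratic curve can be demonstrated with a small sketch (my own illustration, not the exact code from the argument; the helper name `count_pairs` is made up) that counts how many times the inner body executes:

```python
# Sketch: the inner body runs len(s) times for each of the len(s)
# outer iterations, i.e. len(s) ** 2 times in total.
def count_pairs(s):
    count = 0
    for c in s:
        for d in s:
            count += 1  # stand-in for print(c + d)
    return count

for length in (10, 20, 40):
    print(length, count_pairs("x" * length))
# doubling the length quadruples the count
```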

So, am I missing some theoretical concept where any of the above is true depending on how the theory goes? Just to be clear his understanding is that I am correct if n is not known. If the upper-bound of n is always known he is asserting that the two algorithms on this post are both of constant runtime complexity.

Just looking to maintain my sanity, but perhaps if I'm wrong there's certainly some additional learning I can benefit from. My good, good colleague was very convincing. Also, if anybody has additional links or material on the subject specific to this question please add to the comments.

asked Sep 26 '15 by Ralph Caraveo


3 Answers

Applying Big-O notation to a single scenario in which all the inputs are known is ludicrous. There is no Big-O for a single case.

The whole point is to get a worst-case estimate for arbitrarily large, unknown values of n. If you already know the exact answer, why on Earth would you waste time trying to estimate it?

Mathy / Computer-Sciencey Edit:

Big-O notation is defined in the limit, as n grows arbitrarily large: f(n) is O(g(n)) if there exist some constant c and some threshold nMin such that f(n) ≤ c * g(n) for all n greater than nMin. Conversely, to show that f(n) grows strictly faster than g(n), your "opponent" can set c to "eleventy-quadjillion" and it doesn't matter, because, for all points "to the right" of some nMin, the graph of "eleventy-quadjillion times g(n)" will lag below f(n)... forever.

Example: 2^n is less than or equal to n^2... for a short segment of the x-axis that includes n = 2, 3, and 4 (at n = 3, 2^n is 8, while n^2 is 9). This doesn't change the fact that their Big-O relationship is the opposite: O(2^n) is much greater than O(n^2), because Big-O says nothing about n values less than nMin. If you set nMin to 4 (thus ignoring the graph to the left of 4), you'll see that the n^2 line never exceeds the 2^n line.
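You can tabulate this yourself (a quick sketch, not part of the original answer) to see the short segment where 2^n ≤ n^2 and the permanent reversal afterwards:

```python
# Sketch: compare 2**n against n**2 for small n. The inequality
# 2**n <= n**2 holds only for n = 2, 3, 4; from n = 5 on, 2**n
# stays above n**2 forever.
for n in range(1, 11):
    print(n, 2 ** n, n ** 2, 2 ** n <= n ** 2)
```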

If your "opponent" multiplies n^2 by some larger constant c to raise "his" n^2 line above your 2^n line, you haven't lost yet... you just slide nMin to the right a bit. Big-O says that no matter how big he makes c, you can always find a point after which his equation loses and yours wins, forever.
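That "slide nMin to the right" move can be sketched in a few lines (the helper name `crossover` is mine, purely for illustration). The search starts at n = 3 because from there on the ratio 2^n / n^2 is strictly increasing, so the first win is permanent:

```python
# Sketch: for any constant c, find the point after which 2**n
# exceeds c * n**2 forever. Starting at n = 3 is safe because
# 2**n / n**2 grows monotonically from there, so the first n
# that wins keeps winning.
def crossover(c):
    n = 3
    while 2 ** n <= c * n ** 2:
        n += 1
    return n  # smallest n >= 3 with 2**n > c * n**2

for c in (1, 1000, 10 ** 12):
    print(c, crossover(c))
```

However enormous the opponent makes c, `crossover(c)` always terminates: the exponential wins eventually, every time.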

But, if you constrain n on the right, you've violated the prerequisites for any kind of Big-O analysis. In your argument with your co-worker, one of you invented an nMax, and then the other set nMin somewhere to the right of it... surprise, the results are nonsensical.

For instance, the first algorithm you showed does indeed do about n work for inputs of length n... in the general case. If I were building my own algorithm that called it n times, I would have to consider mine a quadratic O(n^2) algorithm... again, in the general case.

But if I could prove that I would never call your algorithm with an input greater than say 10 (meaning I had more information, and could thus estimate my algorithm more precisely), using Big-O to estimate your algorithm's performance would be throwing away what I'd learned about its actual behavior in the case I care about. I should instead replace your algorithm with a suitably large constant, changing my algorithm from c * n^2 to c * 10 * n... which is just cBigger * n. I could honestly claim my algorithm is linear, because in this case, your algorithm's graph will never rise above that constant value. This would change nothing about the Big-O performance of your algorithm, because Big-O is not defined for constrained cases like this.
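The c * n^2 versus c * 10 * n arithmetic can be illustrated with a toy cost model (`CAP`, `inner_cost`, and `outer_cost` are hypothetical names of mine, not real APIs):

```python
# Toy cost model: inner_cost is linear in its input, but the input
# is capped at CAP characters, so each call costs at most CAP.
CAP = 10

def inner_cost(length):
    return min(length, CAP)  # a linear routine fed a bounded input

def outer_cost(n):
    # n calls to the capped routine: total work is at most CAP * n,
    # i.e. linear in n, not quadratic.
    return sum(inner_cost(CAP) for _ in range(n))

print(outer_cost(100))   # 1000
print(outer_cost(200))   # 2000
```

The totals scale linearly with n, which is exactly the cBigger * n claim above.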

To wrap up: In general, that first algorithm you showed was linear by Big-O standards. In a constrained case, where the maximum input is known, it is a mistake to speak of it in Big-O terms at all. In a constrained case, it could legitimately be replaced by some constant value when discussing the Big-O behavior of some other algorithm, but that says absolutely nothing about the Big-O behavior of the first algorithm.

In conclusion: O(Ackermann(n)) works fine when nMax is small enough. Very, very small enough...

answered Nov 11 '22 by Kevin J. Chase


In your case...

I am tempted to say that your friend is softly wrong, and that's because of the considerably large constant of 256 hiding inside the O(1) run time. Your friend said that the execution was O(256), and because we ignore the constants in Big-O, we simply write O(256 * 1) as O(1). It is up to you to decide whether this constant is negligible for you or not.


I have two strong reasons to say that you are right:

Firstly, for various values of n, your answer of O(n) (in first code) gives a better approximation of the running-time. For example:

  1. For a string of length 4: you say the run-time is proportional to 4, while your friend says it is proportional to 1 (or 256).
  2. For a string of length 255: you say the running time is proportional to 255, while your friend again says that it is constant time.

Clearly, your answer is more accurate in every case, even though his answer is not outright wrong.

Secondly, if you go by your friend's method, then in one sense you can cheat and say that since no string can exceed your RAM + disk size, all processing is O(1). And that's when the fallacy of your friend's reasoning becomes visible. Yes, he is right that the running time (assuming a 1 TB hard disk and 8 GB of RAM) is O((1TB + 8GB) * 1) = O(1), but you simply cannot ignore the size of your constant in this scenario.


The Big-O complexity does not tell you the actual time of execution, just the rate of growth of the running time as the value of n increases.

answered Nov 11 '22 by displayName


I think you're both right.

The runtime of the first algorithm is linear in the size of its input. However, if its input is fixed, then its runtime is also fixed.

Big O is all about measuring the behavior of an algorithm as its input changes. If the input never changes, then Big O is meaningless.

Also: O(n) only means that n is an upper bound on the complexity. If you want to express a tight bound, the more precise notation is Θ(n) (Theta notation).

answered Nov 11 '22 by Daniel Pryden