Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Having a bit of trouble reasoning the formal definition of Big O

My professor recently brushed over the formal definition of Big O:

Big O definition

To be completely honest even after him explaining it to a few different students we all seem to still not understand it at its core. The problems in comprehension mostly occurred with the following examples we went through:

enter image description here

So far my reasoning is as follows:

When you multiply a function's highest term by a constant, you get a new function that eventually surpasses the initial function at a given n. He called this n a "witness" to the function O(g(n))

How is this c term created/found? He mentioned bounds a couple of times but didn't really specify what bounds signify or how to find them/use them.

I think I just need a more solid foundation of the formal definition and how these examples back up the definition.

like image 893
CaliCrunch Avatar asked Nov 06 '22 23:11

CaliCrunch


1 Answers

I think that the way this definition is typically presented in terms of c values and n0's is needlessly confusing. What f(n) being O(g(n)) really means is that when you disregard constant and lower order terms, g(n) is an asymptotic upper bound for f(n) (for a function to g to asymptotically upper bound f just means that past a certain point g is always greater than or equal to f). Put another way, f(n) grows no faster than g(n) as n goes to infinity.

Big O itself is a little confusing, because f(n) = O(g(n)) doesn't mean that g(n) grows strictly faster than f(n). It means when you disregard constant and lower order terms, g(n) grows faster than f(n), or it grows at the same rate (strictly faster would be "little o"). A simple, formal way to put this concept is to say:

enter image description here

That is, for this limit to hold true, the highest order term of f(n) can be at most a constant multiple of the highest order term of g(n). f(n) is O(g(n)) iff it grows no faster than g(n).

For example, f(n) = n is in O(g(n) = n^2), because past a certain point n^2 is always bigger than n. The limit of n^2 over n is positive, so n is in O(n^2)

As another example, f(n) = 5n^2 + 2n is in O(g(n) = n^2), because in the limit, f(n) can only be about 5 times larger than g(n). It's not infinitely bigger: they grow at the same rate. To be precise, the limit of n^2 over 5n^2 + 3n is 1/5, which is more than zero, so 5n^2 + 3n is in O(n^2). Hopefully this limit based definition provides some intuition, as it is completely equivalent mathematically to the provided definition.

enter image description here

Finding a particular constant value c and x value n0 for which the provided inequality holds true is just a particular way of showing that in the limit as n goes to infinity, g(n) grows at least as fast as f(n): that f(n) is in O(g(n)). That is, if you've found a value past which c*g(n) is always greater than f(n), you've shown that f(n) grows no more than a constant multiple (c times) faster than g(n) (if f grew faster than g by more than a constant multiple, finding such a c and n0 would be impossible).

There's no real art to finding a particular c and n0 value to demonstrate f(n) = O(g(n)). They can be literally whatever positive values you need them to be to make the inequality true. In fact, if it is true that f(n) = O(g(n)) then you can pick any value you want for c and there will be some sufficiently large n0 value that makes the inequality true, or, similarly you could pick any n0 value you want, and if you make c big enough the inequality will become true (obeying the restrictions that c and n0 are both positive). That's why I don't really like this formalization of big O: it's needlessly particular and proofs involving it are somewhat arbitrary, distracting away from the main concept which is the behavior of f and g as n goes to infinity.

So, as for how to handle this in practice, using one of the example questions: why is n^2 + 3n in O(n^2)?

Answer: because the limit as n goes to infinity of n^2 / n^2 + 3n is 1, which is greater than 0.

enter image description here

Or, if you're wanting/needing to do it the other way, pick any positive value you want for n0, and evaluate f at that value. f(1) will always be easy enough:

f(1) = 1^2 + 3*1 = 4

Then find the constant you could multiply g(1) by to get the same value as f(1) (or, if not using n0 = 1 use whatever n0 for g that you used for f).

c*g(1) = 4
c*1^2  = 4
c = 4

Then, you just combine the statements into an assertion to show that there exists a positive n0 and a constant c such that cg(n) <= f(n) for all n >= n0.

n^2 + 3n <= (4)n^2 for all n >= 1, implying n^2 + 3n is in O(n^2)

If you're using this method of proof, the above statement you use to demonstrate the inequality should ideally be immediately obvious. If it's not, maybe you want to change your n0 so that the final statement is more clearly true. I think that showing the limit of the ratio g(n)/f(n) is positive is much clearer and more direct if that route is available to you, but it is up to you.

Moving to a negative example, it's quite easy with the limit method to show that f(n) is not in O(g(n)). To do so, you just show that the limit of g(n) / f(n) = 0. Using the third example question: is nlog(n) + 2n in O(n)?

enter image description here

To demonstrate it the other way, you actually have to show that there exists no positive pair of numbers n0, c such that for all n >= n0 f(n) <= cg(n).

Unfortunately showing that f(n) = nlogn + 2n is in O(nlogn) by using c=2, n0=8 demonstrates nothing about whether f(n) is in O(n) (showing a function is in a higher complexity class implies nothing about it not being a lower complexity class).

To see why this is the case, we could also show a(n) = n is in g(n) = nlogn using those same c and n0 values (n <= 2(nlog(n) for all n >= 8, implying n is in O(nlogn))`), and yet a(n)=n clearly is in O(n). That is to say, to show f(n)=nlogn + 2n is not in O(n) with this method, you can't just show that it is in O(nlogn). You would have to show that no matter what n0 you pick, you can never find a c value large enough such that f(n) >= c(n) for all n >= n0. Showing that such a pair of numbers does not exist is not impossible, but relatively speaking it's a tricky thing to do (and would probably itself involve limit equations, or a proof by contradiction).

To sum things up, f(n) is in O(g(n)) if the limit of g(n) over f(n) is positive, which means f(n) doesn't grow any faster than g(n). Similarly, finding a constant c and x value n0 beyond which cg(n) >= f(n) shows that f(n) cannot grow asymptotically faster than g(n), implying that when discarding constants and lower order terms, g(n) is a valid upper bound for f(n).

like image 51
inordirection Avatar answered Nov 15 '22 05:11

inordirection