This questions is basically a follow-on from my answer here. I really wanted to say what the Big-O of this algorithm would be, but I wasn't sure my claim was completely sound.
So given two arrays:
B = [ "Hello World!", "Hello Stack Overflow!", "Foo Bar!", "Food is nice...", "Hej" ]
A = [ "World", "Foo" ]
what is the Big O of:
List<string> results = new List<string>();
foreach (string test in B)
{
if (A.Any(a => test.Contains(a))
results.Add(test);
}
I believe it to be somewhere between O(n)
and O(n^2)
as it depends where in the result the Any()
matches...
Every time the outer loop executes, the inner loop executes M times. As a result, the statements in the inner loop execute a total of N * M times. Thus, the complexity is O(N * M).
You can calculate big O like this: Any number of nested loops will add an additional power of 1 to n. So, if we have three nested loops, the big O would be O(n^3). For any number of loops, the big O is O(n^(number of loops)).
The big O of a loop is the number of iterations of the loop into number of statements within the loop.
A for loop is never O(1), order-one time complexity. The for-loop repeats a process n-times, and the complexity is O(n), order-n.
Let length of A
be N
and length of B
be M
. We have two extremal cases:
The worst one:
a => test.Contains(a)
returns false
on every a
, so A.Any
has to scan the entire A
and we have
O(N * M)
The best one:
a => test.Contains(a)
returns true
on the very 1st item of A
and thus A.Any
returns immediately and we have only
O(M)
Actual complexity is somewhere in between (both borders included):
[O(M)..O(M*N)]
It's close, however as others mention it's going to be O(n*m)
as the sizes of each of your collections differ (with the best case being O(n)
and worst being O(n*m)
) otherwise, if you were iterating the same size collection twice, you'd get O(n^2)
.
Taking a Look Behind the Scenes at Any()
You can take a look at the source for the Enumerable.Any()
method to dig into this a bit further. You'll see a foreach
loop to iterate through until it finds a match for the predicate which reinforces your assumption of it being O(n)
:
public static bool Any<TSource>(this IEnumerable<TSource> source, Func<TSource, bool> predicate) {
if (source == null) throw Error.ArgumentNull("source");
if (predicate == null) throw Error.ArgumentNull("predicate");
// Another loop to iterate through the collection until the predicate is matched
foreach (TSource element in source) {
if (predicate(element)) return true;
}
return false;
}
As you can see Any()
by itself is going to be O(n)
for it's length and nesting that inside of an existing loop O(m)
should give you O(n*m)
for the entire code. However, it's possible that it could be as low as O(m)
.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With