Wikipedia claims that the failure function table can be computed in O(n) time.
Let's look at its `canonical' implementation (in C++):
vector<int> prefix_function (string s) {
int n = (int) s.length();
vector<int> pi (n);
for (int i=1; i<n; ++i) {
int j = pi[i-1];
while (j > 0 && s[i] != s[j])
j = pi[j-1];
if (s[i] == s[j]) ++j;
pi[i] = j;
}
return pi;
}
Why does it work in O(n) time, even if there is an inner while-loop? I'm not really strong at the analysis of algorithms, so may somebody explain it?
f - failure function (by definition, this is the length of the longest prefix of the string which is a suffix also) Here how I built it step by step: f(a) = 0 (always = 0 for one letter) f(aa) = 1 (one letter 'a' is both a prefix and suffix) f(aab) = 0 (there is no the same suffixes and prefixes: a != b, aa != ab)
The best case for KMP is O(k) where k is the length of the search term, and happens when the string to be searched inside is of length 0.
There's already two answers here that are correct, but I often think a fully laid out proof can make things clearer. You said you wanted an answer for a 9-year-old, but I don't think it's feasible (I think it's easy to be fooled into thinking it's true without actually having any intuition for why it's true). Maybe working through this answer will help.
First off, the outer loop runs n
times clearly because i
is not modified
within the loop. The only code within the loop that could run more than once is
the block
while (j > 0 && s[i] != s[j])
{
j = pi[j-1]
}
So how many times can that run? Well notice that every time that condition is
satisfied we decrease the value of j
which, at this point, is at most
pi[i-1]
. If it hits 0 then the while
loop is done. To see why this is important,
we first prove a lemma (you're a very smart 9-year-old):
pi[i] <= i
This is done by induction. pi[0] <= 0
since it's set once in the initialization of pi
and never touched again. Then inductively we let 0 < k < n
and assume
the claim holds for 0 <= a < k
. Consider the value of p[k]
. It's set
precisely once in the line pi[i] = j
. Well how big can j
be? It's initialized
to pi[k-1] <= k-1
by induction. In the while block it then may be updated to pi[j-1] <= j-1 < pi[k-1]
. By another mini-induction you can see that j
will never increase past pi[k-1]
. Hence after the
while
loop we still have j <= k-1
. Finally it might be incremented once so we have
j <= k
and so pi[k] = j <= k
(which is what we needed to prove to finish our induction).
Now returning back to the original point, we ask "how many times can we decrease the value of
j
"? Well with our lemma we can now see that every iteration of the while
loop will
monotonically decrease the value of j
. In particular we have:
pi[j-1] <= j-1 < j
So how many times can this run? At most pi[i-1]
times. The astute reader might think
"you've proven nothing! We have pi[i-1] <= i-1
but it's inside the while loop so
it's still O(n^2)
!". The slightly more astute reader notices this extra fact:
However many times we run
j = pi[j-1]
we then decrease the value ofpi[i]
which shortens the next iteration of the loop!
For example, let's say j = pi[i-1] = 10
. But after ~6 iterations of the while
loop we have
j = 3
and let's say it gets incremented by 1 in the s[i] == s[j]
line so j = 4 = pi[i]
.
Well then at the next iteration of the outer loop we start with j = 4
... so we can only execute the while
at most 4 times.
The final piece of the puzzle is that ++j
runs at most once per loop. So it's not like we can have
something like this in our pi
vector:
0 1 2 3 4 5 1 6 1 7 1 8 1 9 1
^ ^ ^ ^ ^
Those spots might mean multiple iterations of the while loop if this
could happen
To make this actually formal you might establish the invariants described above and then use induction
to show that the total number of times that while
loop is run, summed with pi[i]
is at most i
.
From that, it follows that the total number of times the while
loop is run is O(n)
which means that the entire outer loop has complexity:
O(n) // from the rest of the outer loop excluding the while loop
+ O(n) // from the while loop
=> O(n)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With