Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Manacher's algorithm (algorithm to find longest palindrome substring in linear time)

After spending about 6-8 hours trying to digest the Manacher's algorithm, I am ready to throw in the towel. But before I do, here is one last shot in the dark: can anyone explain it? I don't care about the code. I want somebody to explain the ALGORITHM.

Here seems to be a place that others seemed to enjoy in explaining the algorithm: http://www.leetcode.com/2011/11/longest-palindromic-substring-part-ii.html

I understand why you would want to transform the string, say, 'abba' to #a#b#b#a# After than I'm lost. For example, the author of the previously mentioned website says the key part of the algorithm is:

                      if P[ i' ] ≤ R – i,                       then P[ i ] ← P[ i' ]                       else P[ i ] ≥ P[ i' ]. (Which we have to expand past                        the right edge (R) to find P[ i ]) 

This seems wrong, because he/she says at one point that P[i] equals 5 when P[i'] = 7 and P[i] is not less or equal to R - i.

If you are not familiar with the algorithm, here are some more links: http://tristan-interview.blogspot.com/2011/11/longest-palindrome-substring-manachers.html (I've tried this one, but the terminology is awful and confusing. First, some things are not defined. Also, too many variables. You need a checklist to recall what variable is referring to what.)

Another is: http://www.akalin.cx/longest-palindrome-linear-time (good luck)

The basic gist of the algorithm is to find the longest palindrome in linear time. It can be done in O(n^2) with a minimum to medium amount of effort. This algorithm is supposed to be quite "clever" to get it down to O(n).

like image 747
user678392 Avatar asked May 06 '12 04:05

user678392


People also ask

Which algorithm is used to find the longest palindromic substring?

Manacher's algorithm is used to find the longest palindromic substring in any given string.

What is the time complexity of Manacher's algorithm?

The time complexity of this algorithm is O(n).

How do you identify a palindrome algorithm?

Algorithm to check whether a number is a palindrome or notInput the number. Find the reverse of the number. If the reverse of the number is equal to the number, then return true. Else, return false.


2 Answers

I agree that the logic isn't quite right in the explanation of the link. I give some details below.

Manacher's algorithm fills in a table P[i] which contains how far the palindrome centered at i extends. If P[5]=3, then three characters on either side of position five are part of the palindrome. The algorithm takes advantage of the fact that if you've found a long palindrome, you can fill in values of P on the right side of the palindrome quickly by looking at the values of P on the left side, since they should mostly be the same.

I'll start by explaining the case you were talking about, and then I'll expand this answer as needed.

R indicates the index of the right side of the palindrome centered at C. Here is the state at the place you indicated:

C=11 R=20 i=15 i'=7 P[i']=7 R-i=5 

and the logic is like this:

if P[i']<=R-i:  // not true else: // P[i] is at least 5, but may be greater 

The pseudo-code in the link indicates that P[i] should be greater than or equal to P[i'] if the test fails, but I believe it should be greater than or equal to R-i, and the explanation backs that up.

Since P[i'] is greater than R-i, the palindrome centered at i' extends past the palindrome centered at C. We know the palindrome centered at i will be at least R-i characters wide, because we still have symmetry up to that point, but we have to search explicitly beyond that.

If P[i'] had been no greater than R-i, then the largest palindrome centered at i' is within the largest palindrome centered at C, so we would have known that P[i] couldn't be any larger than P[i']. If it was, we would have a contradiction. It would mean that we would be able to extend the palindrome centered at i beyond P[i'], but if we could, then we would also be able to extend the palindrome centered at i' due to the symmetry, but it was already supposed to be as large as possible.

This case is illustrated previously:

C=11 R=20 i=13 i'=9 P[i']=1 R-i=7 

In this case, P[i']<=R-i. Since we are still 7 characters away from the edge of the palindrome centered at C, we know that at least 7 characters around i are the same as the 7 characters around i'. Since there was only a one character palindrome around i', there is a one character palindrome around i as well.

j_random_hacker noticed that the logic should be more like this:

if P[i']<R-i then   P[i]=P[i'] else if P[i']>R-i then   P[i]=R-i else P[i]=R-i + expansion 

If P[i'] < R-i, then we know that P[i]==P[i'], since we're still inside the palindrome centered at C.

If P[i'] > R-i, then we know that P[i]==R-i, because otherwise the palindrome centered at C would have extended past R.

So the expansion is really only necessary in the special case where P[i']==R-i, so we don't know if the palindrome at P[i] may be longer.

This is handled in the actual code by setting P[i]=min(P[i'],R-i) and then always expanding. This way of doing it doesn't increase the time complexity, because if no expansion is necessary, the time taken to do the expansion is constant.

like image 126
Vaughn Cato Avatar answered Oct 04 '22 08:10

Vaughn Cato


I have found one of the best explanation so far at the following link:

http://tarokuriyama.com/projects/palindrome2.php

It also has a visualization for the same string example (babcbabcbaccba) used at the first link mentioned in the question.

Apart from this link, i also found the code at

http://algs4.cs.princeton.edu/53substring/Manacher.java.html

I hope it will be helpful to others trying hard to understand the crux of this algorithm.

like image 30
scv Avatar answered Oct 04 '22 08:10

scv