I was trying to find the longest palindrome in a string. The brute force solution takes O(n^3) time. I read that there is a linear time algorithm for it using suffix trees. I am familiar with suffix trees and am comfortable building them. How do you use the built suffix tree to find the longest palindrome.
Manacher's algorithm is used to find the longest palindromic substring in any given string.
Explanation: Palindrome is a string that is same when reading forward as well as backward. The time complexity for finding the longest palindromic substring in a string by using generalized suffix tree is linear time.
We build a suffix tree by following each suffix and creating an edge for each character, starting with a top node. If the new suffix to be put in the tree begins with a set of characters that are already in the tree, we follow those characters until we have a different one, creating a new branch.
The Linear solution can be found in this Way ::
Prequisities:
(1).You must know how to construct the suffix array in O(N) or O(NlogN) time.
(2).You must know how to find the standard LCP Array ie. LCP between adjacent Suffixes i and i-1
ie . LCP [i]=LCP(suffix i in sorted array, suffix i-1 in sorted array) for (i>0).
Let S be the Original String and S' be the reverse of Original String. Lets take S="banana" as an example. Then its Reverse string S'=ananab.
Step 1: Concatenate S + # + S' to get String Str ,where # is an alphabet not present in original String.
Concatenated String Str=S+#+S' Str="banana#ananab"
Step 2: Now construct the Suffix Array of the string Str.
In this example ,the suffix array is:
Suffix Number Index Sorted Suffix 0 6 #ananab 1 5 a#ananab 2 11 ab 3 3 ana#ananab 4 9 anab 5 1 anana#ananab 6 7 ananab 7 12 b 8 0 banana#ananab 9 4 na#ananab 10 10 nab 11 2 nana#ananab 12 8 nanab
Please Note that a suffix array is an array of integers giving the starting positions of suffixes of a string in lexicographical order.So the Array that holds Index of starting position is a suffix Array.
That is SuffixArray[]={6,5,11,3,9,1,7,12,0,4,10,2,8};
Step 3: As you had managed to construct the Suffix Array ,Now find the Longest Common Prefixes Between the adjacent suffixes.
LCP between #ananab a#ananab is :=0 LCP between a#ananab ab is :=1 LCP between ab ana#ananab is :=1 LCP between ana#ananab anab is :=3 LCP between anab anana#ananab is :=3 LCP between anana#ananab ananab is :=5 LCP between ananab b is :=0 LCP between b banana#ananab is :=1 LCP between banana#ananab na#ananab is :=0 LCP between na#ananab nab is :=2 LCP between nab nana#ananab is :=2 LCP between nana#ananab nanab is :=4
Thus LCP array LCP={0,0,1,1,3,3,5,0,1,0,2,2,4}.
Where LCP[i]=Length of Longest Common Prefix between Suffix i and suffix (i-1). (for i>0)
Step 4:
Now you have constructed a LCP array ,Use the following Logic.
Let the length of the Longest Palindrome ,longestlength:=0 (Initially) Let Position:=0. for(int i=1;i<Len;++i) { //Note that Len=Length of Original String +"#"+ Reverse String if((LCP[i]>longestlength)) { //Note Actual Len=Length of original Input string . if((suffixArray[i-1]<actuallen && suffixArray[i]>actuallen)||(suffixArray[i]<actuallen && suffixArray[i-1]>actuallen)) { //print :Calculating Longest Prefixes b/w suffixArray[i-1] AND suffixArray[i] longestlength=LCP[i]; //print The Longest Prefix b/w them is .. //print The Length is :longestlength:=LCP[i]; Position=suffixArray[i]; } } } So the length of Longest Palindrome :=longestlength; and the longest palindrome is:=Str[position,position+longestlength-1];
Execution Example ::
actuallen=Length of banana:=6 Len=Length of "banana#ananab" :=13. Calculating Longest Prefixes b/w a#ananab AND ab The Longest Prefix b/w them is :a The Length is :longestlength:= 1 Position:= 11 Calculating Longest Prefixes b/w ana#ananab AND anab The Longest Prefix b/w them is :ana The Length is :longestlength:= 3 Position:=9 Calculating Longest Prefixes b/w anana#ananab AND ananab The Longest Prefix b/w them is :anana The Length is :longestlength:= 5 Position:= 7 So Answer =5. And the Longest Palindrome is :=Str[7,7+5-1]=anana
Just Make a Note ::
The if condition in Step 4 basically refers that ,in each iteration(i) ,if I take the suffixes s1(i) and s2(i-1) then ,"s1 must contains # and s2 must not contain # " OR "s2 must contains # and s1 must not contains # ".
|(1:BANANA#ANANAB)|leaf tree:| | | | |(7:#ANANAB)|leaf | | |(5:NA)| | | | |(13:B)|leaf | |(3:NA)| | | |(7:#ANANAB)|leaf | | | | | |(13:B)|leaf |(2:A)| | |(7:#ANANAB)|leaf | | | |(13:B)|leaf | | | |(7:#ANANAB)|leaf | |(5:NA)| | | |(13:B)|leaf |(3:NA)| | |(7:#ANANAB)|leaf | | | |(13:B)|leaf | |(7:#ANANAB)|leaf
I believe you need to proceed this way:
Let y1y2 ... yn be your string (where yi are letters).
Create the generalized suffix tree of Sf = y1y2 ... yn$ and Sr = ynyn - 1 ... y1# (reverse the letters and choose different ending characters for Sf ($) and Sr (#))... where Sf stands for "String, Forward" and Sr stands for "String, Reverse".
For every suffix i in Sf, find the lowest common ancestor with the suffix n - i + 1 in Sr.
What runs from the root till this lowest common ancestor is a palindrome, because now the lowest common ancestor represents the longest common prefix of these two suffixes. Recall that:
(1) A prefix of a suffix is a substring.
(2) A palindrome is a string identical to its reverse.
(3) So the longest contained palindrome within a string is exactly the longest common substring of this string and its reverse.
(4) Thus, the longest contained palindrome within a string is exactly the longest common prefix of all pairs of suffixes between a string and its reverse. This is what we're doing here.
EXAMPLE
Let's take the word banana.
Sf = banana$
Sr = ananab#
Below is the generalised suffix tree of Sf and Sr, where the number at the end of each path is the index of the corresponding suffix. There's a small mistake, the a common to all the 3 branches of Blue_4's parent should be on its entering edge, beside n:
The lowest interior node in the tree is the longest common substring of this string and its reverse. Looking at all the interior nodes in the tree you will therefore find the longest palindrome.
The longest palindrome is found between between Green_0 and Blue_1 (i.e., banana and anana) and is anana
EDIT
I've just found this paper that answers this question.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With