
Total number of palindromic subsequences in a string

The problem is as follows:

For every string given as input, you need to report the number of its subsequences that are palindromes (they need not be distinct). Note that the empty string is not a palindrome. For example, the palindromic subsequences of "aab" are "a", "a", "b" and "aa", so the method returns 4.

I had the dynamic programming solution for finding the Longest Palindromic Subsequence in mind and tried to borrow ideas from it, but couldn't work out a solution. Maybe dynamic programming is not even required. Suggestions, please.

There is one more catch: if the condition "need not necessarily be distinct" is removed, so that only distinct palindromic subsequences are counted, can we still count them without actually generating all the palindromic subsequences?

discoverAnkit asked Mar 04 '15 12:03





1 Answer

[EDIT 19/10/2015: An anonymous reviewer pointed out a problem with the formula, which prompted me to notice another, even bigger mistake... Now fixed.]

I now see how to drop the solution time down to O(n^2). I'll leave my other answer up in case it's interesting as a stepping-stone to this one. Note: This is (also) only a solution to the first part of the problem; I see no way to efficiently count only distinct palindromic subsequences (PS).

Instead of counting the number of PS that begin and end at exactly the positions i and j, let's count how many begin at or after i and end at or before j. Call this g(i, j).

We can try to write g(i, j) = g(i, j-1) + g(i+1, j) + (x[i] == x[j])*g(i+1, j-1) for the case when j > i. But this doesn't quite work, because the first two terms will double-count any PS that begin after i and end before j.
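To see the double-counting concretely, here is a minimal Python sketch of that naive recurrence. The function name `naive_count`, the 0-based indexing, and the base cases (`g(i, i) = 1`, and `g(i, j) = 0` when `j < i`) are assumptions made for illustration; they are not spelled out above for this first attempt.

```python
from functools import lru_cache


def naive_count(x: str) -> int:
    """Evaluate the naive (incorrect) recurrence on the whole string x.

    Assumed base cases: g(i, i) = 1 and g(i, j) = 0 when j < i.
    Indices are 0-based here.
    """
    if not x:
        return 0

    @lru_cache(maxsize=None)
    def g(i: int, j: int) -> int:
        if j < i:
            return 0
        if i == j:
            return 1
        # The first two terms both include every PS lying strictly inside (i, j).
        return g(i, j - 1) + g(i + 1, j) + (x[i] == x[j]) * g(i + 1, j - 1)

    return g(0, len(x) - 1)


print(naive_count("abc"))
```

For "abc" the only palindromic subsequences are "a", "b" and "c", so the correct count is 3, but the sketch returns 4: the middle "b" is counted once by g(i, j-1) and once more by g(i+1, j).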

The key insight is to notice that we can easily calculate the number of PS that begin or end at some exact position by subtracting off other values of g(), and perhaps adding yet more values of g() back on to compensate for double-counting. For example, the number of PS that begin at exactly i and end at exactly j is g(i, j) - g(i+1, j) - g(i, j-1) + g(i+1, j-1): the last term corrects for the fact that both the second and third terms count all g(i+1, j-1) PS that begin after i and end before j.
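The inclusion-exclusion identity for "begins at exactly i and ends at exactly j" can be sanity-checked by brute force on small strings. The sketch below is only a check, not part of the O(n^2) method: the helper names `g_brute`, `exact_brute` and `identity_holds` are hypothetical, indices are 0-based, and everything is computed by enumerating index subsets, which is exponential in the string length.

```python
from itertools import combinations


def is_palindrome(chars):
    return chars == chars[::-1]


def g_brute(x, i, j):
    """Count PS that use only positions i..j (inclusive, 0-based)."""
    if j < i:
        return 0
    total = 0
    for size in range(1, j - i + 2):
        for picked in combinations(range(i, j + 1), size):
            if is_palindrome([x[p] for p in picked]):
                total += 1
    return total


def exact_brute(x, i, j):
    """Count PS whose first position is exactly i and last is exactly j."""
    if j < i:
        return 0
    if i == j:
        return 1
    total = 0
    # Choose any subset (possibly empty) of the positions strictly between i and j.
    for size in range(0, j - i):
        for middle in combinations(range(i + 1, j), size):
            picked = (i, *middle, j)
            if is_palindrome([x[p] for p in picked]):
                total += 1
    return total


def identity_holds(x):
    """Check exact(i, j) == g(i,j) - g(i+1,j) - g(i,j-1) + g(i+1,j-1) for all i <= j."""
    n = len(x)
    return all(
        exact_brute(x, i, j)
        == g_brute(x, i, j) - g_brute(x, i + 1, j)
        - g_brute(x, i, j - 1) + g_brute(x, i + 1, j - 1)
        for i in range(n)
        for j in range(i, n)
    )


print(identity_holds("abacab"))
```

Enumerating subsets keeps the check independent of any recurrence, so it does not assume the result it is testing.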

Every PS that begins at or after i and ends at or before j is in exactly 1 of 4 categories:

  1. It begins after i, and ends before j.
  2. It begins at i, and ends before j.
  3. It begins after i, and ends at j.
  4. It begins at i, and ends at j.

g(i+1, j) counts all PS in category 1 or 3, and g(i, j-1) counts all PS in category 1 or 2, so their sum g(i+1, j) + g(i, j-1) counts all PS in category 2 or 3 once each, and all PS in category 1 twice. Since g(i+1, j-1) counts exactly the PS in category 1, subtracting it off to get g(i+1, j) + g(i, j-1) - g(i+1, j-1) gives the total number of PS in categories 1, 2 and 3. The remaining PS are those in category 4. If x[i] != x[j] then there are no PS in this category; otherwise, every PS that begins at or after i+1 and ends at or before j-1 can be wrapped in x[i] and x[j] to form one, so there are g(i+1, j-1) of those, plus one extra for the 2-character sequence x[i]x[j]. [EDIT: Thanks to commenter Tuxdude for 2 fixes here!]

With this in hand, we can express g() so that the case that took quadratic time to evaluate for f() (in my other answer) now takes constant time:

g(i, i) = 1 (i.e. when j = i)
g(i, i+1) = 2 + (x[i] == x[i+1]) (i.e. 3 iff adjacent chars are identical, otherwise 2)
g(i, j) = 0 when j < i (this new boundary case is needed)
g(i, j) = g(i+1, j) + g(i, j-1) - g(i+1, j-1) + (x[i] == x[j])*(g(i+1, j-1)+1) when j >= i+2

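The final answer is now simply g(1, n), using 1-based positions, where n is the length of x. There are O(n^2) pairs (i, j) and each value is computed in constant time from previously computed ones, which gives the O(n^2) total.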
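A minimal Python sketch of the recurrence above, filled in bottom-up. It uses 0-based indices (so the result is g(0, n-1) rather than g(1, n)), and the function name `count_palindromic_subsequences` is just a label chosen here. Table entries with j < i are never written and stay 0, which covers the j < i boundary case and lets the general formula also handle j = i+1.

```python
def count_palindromic_subsequences(x: str) -> int:
    """Count non-empty palindromic subsequences of x (not necessarily distinct)."""
    n = len(x)
    if n == 0:
        return 0

    # g[i][j] = number of PS that begin at or after i and end at or before j.
    # Entries with j < i are left at 0.
    g = [[0] * n for _ in range(n)]

    # Fill rows from the bottom up so g[i+1][...] is ready when row i needs it.
    for i in range(n - 1, -1, -1):
        g[i][i] = 1
        for j in range(i + 1, n):
            inner = g[i + 1][j - 1]  # 0 when j == i + 1
            g[i][j] = g[i + 1][j] + g[i][j - 1] - inner
            if x[i] == x[j]:
                # Wrap every inner PS in x[i]..x[j], plus the pair x[i]x[j] itself.
                g[i][j] += inner + 1

    return g[0][n - 1]


print(count_palindromic_subsequences("aab"))  # 4, matching the example in the question
```

The counts can grow exponentially with n (a string of n identical characters has 2^n - 1 palindromic subsequences); Python integers handle this, but in a fixed-width language the arithmetic would typically be done modulo some number.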

j_random_hacker answered Nov 15 '22 13:11
