Efficient string truncation algorithm, sequentially removing equal prefixes and suffixes

Tags: c++, performance, string, algorithm, optimization

Time limit per test: 5 seconds
Memory limit per test: 512 megabytes

You are given a string s of length n (n ≤ 5000). You can select any proper prefix of this string that is also its suffix and remove either selected prefix or corresponding suffix. Then you can apply an analogous operation to a resulting string and so on. What is the minimum length of the final string, that can be achieved after applying the optimal sequence of such operations?

Input
The first line of each test contains a string s that consists of small English letters.

Output
Output a single integer — the minimum length of the final string, that can be achieved after applying the optimal sequence of such operations.

Examples

+-------+--------+----------------------------------+
| Input | Output | Explanation                      |
+-------+--------+----------------------------------+
| caaca | 2      | caaca → ca|aca → aca → ac|a → ac |
+-------+--------+----------------------------------+
| aabaa | 2      | aaba|a → a|aba → ab|a → ab       |
+-------+--------+----------------------------------+
| abc   | 3      | No operations are possible       |
+-------+--------+----------------------------------+
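For very short strings, the examples can be checked against a brute-force search that tries every allowed removal. This is only a reference sketch for validating faster solutions; the function name `minFinalLengthBruteForce` is hypothetical, and the search is exponential in the worst case, so it is not meant for n close to 5000.

```cpp
#include <algorithm>
#include <string>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical reference solver: explores every string reachable by
// removing a border (a proper prefix that is also a suffix) from either end.
int minFinalLengthBruteForce(const std::string& s) {
    std::unordered_set<std::string> seen{s};
    std::vector<std::string> stack{s};
    std::size_t best = s.size();

    while (!stack.empty()) {
        std::string cur = std::move(stack.back());
        stack.pop_back();
        best = std::min(best, cur.size());

        for (std::size_t len = 1; len < cur.size(); ++len) {
            // Skip lengths where the prefix and suffix of length len differ.
            if (cur.compare(0, len, cur, cur.size() - len, len) != 0) continue;

            std::string withoutPrefix = cur.substr(len);
            std::string withoutSuffix = cur.substr(0, cur.size() - len);
            if (seen.insert(withoutPrefix).second) stack.push_back(std::move(withoutPrefix));
            if (seen.insert(withoutSuffix).second) stack.push_back(std::move(withoutSuffix));
        }
    }
    return static_cast<int>(best);
}
```

Calling it on the three sample inputs should reproduce the outputs in the table.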

Here is what I've managed to do so far:

1. Calculate the prefix function for all substrings of a given string in O(n^2)
2. Check the result of performing all the possible combinations of operations in O(n^3)

My solution passes all the tests for n ≤ 2000 but exceeds the time limit when 2000 < n ≤ 5000. Here is the code:

#include <cstdlib> // exit
#include <iostream>
#include <string>

using namespace std;

const int MAX_N = 5000;

int result; // 1 less than actual

// [x][y] corresponds to substring that starts at position `x` and ends at position `x + y` =>
// => corresponding substring length is `y + 1`
int lps[MAX_N][MAX_N]; // prefix function for the substring s[x..x+y]
bool checked[MAX_N][MAX_N]; // whether substring s[x..x+y] is processed by check function

// length is 1 less than actual
void check(int start, int length) {
    checked[start][length] = true;
    if (length < result) {
        if (length == 0) {
            cout << 1; // actual length = length + 1 = 0 + 1 = 1
            exit(0); // 1 is the minimum possible result
        }
        result = length;
    }
    // iteration over all proper prefixes that are also suffixes
    // i - current prefix length
    for (int i = lps[start][length]; i != 0; i = lps[start][i - 1]) {
        int newLength = length - i;
        int newStart = start + i;
        if (!checked[start][newLength])
            check(start, newLength);
        if (!checked[newStart][newLength])
            check(newStart, newLength);
    }
}

int main()
{
    string str;
    cin >> str;
    int n = str.length();
    // lps calculation runs in O(n^2)
    for (int l = 0; l < n; l++) {
        int subLength = n - l;
        lps[l][0] = 0;
        checked[l][0] = false;
        for (int i = 1; i < subLength; ++i) {
            int j = lps[l][i - 1];
            while (j > 0 && str[i + l] != str[j + l])
                j = lps[l][j - 1];
            if (str[i + l] == str[j + l])  j++;
            lps[l][i] = j;
            checked[l][i] = false;
        }
    }
    result = n - 1;
    // checking all possible operations combinations in O(n^3)
    check(0, n - 1);
    cout << result + 1;
}
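The loop `for (i = lps[start][length]; i != 0; i = lps[start][i - 1])` in `check` walks the chain of borders from longest to shortest. A minimal standalone sketch of that idea for a single string follows; the helper names `prefixFunction` and `borderLengths` are made up for illustration and are not part of the code above.

```cpp
#include <string>
#include <vector>

// Standard prefix function: pi[i] is the length of the longest proper
// prefix of s[0..i] that is also a suffix of s[0..i].
std::vector<int> prefixFunction(const std::string& s) {
    std::vector<int> pi(s.size(), 0);
    for (std::size_t i = 1; i < s.size(); ++i) {
        int j = pi[i - 1];
        while (j > 0 && s[i] != s[j]) j = pi[j - 1];
        if (s[i] == s[j]) ++j;
        pi[i] = j;
    }
    return pi;
}

// Lengths of all proper prefixes of s that are also suffixes, longest first.
// Each shorter border is the longest border of the previous one.
std::vector<int> borderLengths(const std::string& s) {
    std::vector<int> out;
    if (s.empty()) return out;
    const std::vector<int> pi = prefixFunction(s);
    for (int len = pi.back(); len != 0; len = pi[len - 1]) out.push_back(len);
    return out;
}
```

For "aabaa" this yields the lengths 2 and 1, which correspond to the borders "aa" and "a" that can be removed in the first step of the second example.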

Q: Is there a more efficient solution?

908

asked Jan 10 '20 18:01

Bananon

1 Answer

Here's one way to get the log factor. Let dp[i][j] be true if we can reach the substring s[i..j]. Then:

dp[0][length(s)-1] ->
  true

dp[0][j] ->
  if s[0] != s[j+1]:
    false
  else:
    true if any dp[0][k]
      for j < k ≤ (j + longestMatchRight[0][j+1])

  (The longest match we can use is
   also bound by the current range.)

(Initialise left side similarly.)

Now iterate from the outside in:

for i = 1 to length(s)-2:
  for j = length(s)-2 to i:
    dp[i][j] ->
      // We removed on the right
      if s[i] == s[j+1]:
        true if any dp[i][k]
          for j < k ≤ (j + longestMatchRight[i][j+1])

      // We removed on the left
      if s[i-1] == s[j]:
        true if any dp[k][j]
          for (i - longestMatchLeft[i-1][j]) ≤ k < i

      // Either case is enough; if neither applies, dp[i][j] is false

We can precompute the longest match for each starting pair (i, j) in O(n^2) with the recurrence,

longest(i, j) -> 
  if s[i] == s[j]:
    return 1 + longest(i + 1, j + 1)
  else:
    return 0

This would allow us to check for a substring match that starts at indexes i and j in O(1). (We need both right and left directions.)
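As an illustration, the right-direction table can be filled bottom-up instead of recursively. This is a sketch under my own assumptions: the name `longestMatchRight` mirrors the pseudocode, and the table is padded with one extra row and column of zeros so that `i + 1` and `j + 1` never go out of bounds. With `std::vector<std::vector<int>>` and n = 5000 this takes roughly 100 MB, so a real submission would probably want a more compact layout.

```cpp
#include <string>
#include <vector>

// right[i][j] = length of the longest common prefix of s[i..] and s[j..].
std::vector<std::vector<int>> longestMatchRight(const std::string& s) {
    const int n = static_cast<int>(s.size());
    // Row n and column n stay zero and act as the base case of the recurrence.
    std::vector<std::vector<int>> right(n + 1, std::vector<int>(n + 1, 0));
    for (int i = n - 1; i >= 0; --i)
        for (int j = n - 1; j >= 0; --j)
            if (s[i] == s[j]) right[i][j] = 1 + right[i + 1][j + 1];
    return right;
}
```

For "aabaa", `right[0][3]` is 2 because "aabaa" and "aa" share the prefix "aa". The left-direction table is the mirror image, filled from small indexes upward with `i - 1` and `j - 1`.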

How to get the log factor

What we need is a data structure that supports point updates and range queries, so that we can determine whether

any dp[i][k]
  for j < k ≤ (j + longestMatchRight[i][j+1])

(and similarly for the left side) holds in O(log n), given that those values have already been computed.
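One structure that fits is a bottom-up segment tree storing a logical OR. The sketch below is a simplified variant written for illustration, not the code used in this answer (which stores sums in global arrays); the struct name `AnyTree` and its methods are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Iterative segment tree over n boolean positions, all initially false.
// set(p) marks position p as true; any(l, r) reports whether some
// position in the half-open interval [l, r) is true. Both are O(log n).
struct AnyTree {
    int n;
    std::vector<char> t; // t[n + p] is leaf p; t[p] covers t[2p] and t[2p + 1]

    explicit AnyTree(int size) : n(size), t(2 * size, 0) {}

    void set(int p) {
        assert(0 <= p && p < n);
        // Marking every ancestor equals recomputing the OR of its children.
        for (p += n; p >= 1; p >>= 1) t[p] = 1;
    }

    bool any(int l, int r) const {
        assert(0 <= l && l <= r && r <= n);
        for (l += n, r += n; l < r; l >>= 1, r >>= 1) {
            if ((l & 1) && t[l++]) return true;
            if ((r & 1) && t[--r]) return true;
        }
        return false;
    }
};
```

With such a tree for row i, the right-side check becomes `tree.any(j + 1, j + longest + 1)`, followed by `tree.set(j)` whenever dp[i][j] turns out to be true.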

Here's C++ code with segment trees (for right and left queries, so O(n^2 * log n)) that includes Bananon's test generator. For a string of 5000 "a" characters, it ran in 3.54 s using 420 MB (https://ideone.com/EIrhnR). To reduce memory, one of the segment trees is implemented on a single row. (I still need to investigate doing the same with the left-side queries to reduce memory even further.)

#include <iostream>
#include <string>
#include <cstdlib>      // rand, exit
#include <algorithm>    // std::min

using namespace std;

const int MAX_N = 5000;

int seg[2 * MAX_N];
int segsL[MAX_N][2 * MAX_N];
int m[MAX_N][MAX_N][2];
int dp[MAX_N][MAX_N];
int best;

// Adapted from https://codeforces.com/blog/entry/18051
void update(int n, int p, int value) { // set value at position p
  for (seg[p += n] = value; p > 1; p >>= 1)
    seg[p >> 1] = seg[p] + seg[p ^ 1];
}
// Adapted from https://codeforces.com/blog/entry/18051
int query(int n, int l, int r) { // sum on interval [l, r)
  int res = 0;
  for (l += n, r += n; l < r; l >>= 1, r >>= 1) {
    if (l & 1) res += seg[l++];
    if (r & 1) res += seg[--r];
  }
  return res;
}
// Adapted from https://codeforces.com/blog/entry/18051
void updateL(int n, int i, int p, int value) { // set value at position p
  for (segsL[i][p += n] = value; p > 1; p >>= 1)
    segsL[i][p >> 1] = segsL[i][p] + segsL[i][p ^ 1];
}
// Adapted from https://codeforces.com/blog/entry/18051
int queryL(int n, int i, int l, int r) { // sum on interval [l, r)
  int res = 0;
  for (l += n, r += n; l < r; l >>= 1, r >>= 1) {
    if (l & 1) res += segsL[i][l++];
    if (r & 1) res += segsL[i][--r];
  }
  return res;
}

// Code by גלעד ברקן
void precalc(int n, string & s) {
  int i, j;
  for (i = 0; i < n; i++) {
    for (j = 0; j < n; j++) {
      // [longest match left, longest match right]
      m[i][j][0] = (s[i] == s[j]) & 1;
      m[i][j][1] = (s[i] == s[j]) & 1;
    }
  }

  for (i = n - 2; i >= 0; i--)
    for (j = n - 2; j >= 0; j--)
      m[i][j][1] = s[i] == s[j] ? 1 + m[i + 1][j + 1][1] : 0;

  for (i = 1; i < n; i++)
    for (j = 1; j < n; j++)
      m[i][j][0] = s[i] == s[j] ? 1 + m[i - 1][j - 1][0] : 0;
}

// Code by גלעד ברקן
void f(int n, string & s) {
  int i, j, k, longest;

  dp[0][n - 1] = 1;
  update(n, n - 1, 1);
  updateL(n, n - 1, 0, 1);

  // Right side initialisation
  for (j = n - 2; j >= 0; j--) {
    if (s[0] == s[j + 1]) {
      longest = std::min(j + 1, m[0][j + 1][1]);
      for (k = j + 1; k <= j + longest; k++)
        dp[0][j] |= dp[0][k];
      if (dp[0][j]) {
        update(n, j, 1);
        updateL(n, j, 0, 1);
        best = std::min(best, j + 1);
      }
    }
  }

  // Left side initialisation
  for (i = 1; i < n; i++) {
    if (s[i - 1] == s[n - 1]) {
      // We are bound by the current range
      longest = std::min(n - i, m[i - 1][n - 1][0]);
      for (k = i - 1; k >= i - longest; k--)
        dp[i][n - 1] |= dp[k][n - 1];
      if (dp[i][n - 1]) {
        updateL(n, n - 1, i, 1);
        best = std::min(best, n - i);
      }
    }
  }

  for (i = 1; i <= n - 2; i++) {
    for (int ii = 0; ii < MAX_N; ii++) {
      seg[ii * 2] = 0;
      seg[ii * 2 + 1] = 0;
    }
    update(n, n - 1, dp[i][n - 1]);
    for (j = n - 2; j >= i; j--) {
      // We removed on the right
      if (s[i] == s[j + 1]) {
        // We are bound by half the current range
        longest = std::min(j - i + 1, m[i][j + 1][1]);
        //for (k=j+1; k<=j+longest; k++)
        //dp[i][j] |= dp[i][k];
        if (query(n, j + 1, j + longest + 1)) {
          dp[i][j] = 1;
          update(n, j, 1);
          updateL(n, j, i, 1);
        }
      }
      // We removed on the left
      if (s[i - 1] == s[j]) {
        // We are bound by half the current range
        longest = std::min(j - i + 1, m[i - 1][j][0]);
        //for (k=i-1; k>=i-longest; k--)
        //dp[i][j] |= dp[k][j];
        if (queryL(n, j, i - longest, i)) {
          dp[i][j] = 1;
          updateL(n, j, i, 1);
          update(n, j, 1);
        }
      }
      if (dp[i][j])
        best = std::min(best, j - i + 1);
    }
  }
}

int so(string s) {
  for (int i = 0; i < MAX_N; i++) {
    seg[i * 2] = 0;
    seg[i * 2 + 1] = 0;
    for (int j = 0; j < MAX_N; j++) {
      segsL[i][j * 2] = 0;
      segsL[i][j * 2 + 1] = 0;
      m[i][j][0] = 0;
      m[i][j][1] = 0;
      dp[i][j] = 0;
    }
  }
  int n = s.length();
  best = n;
  precalc(n, s);
  f(n, s);
  return best;
}
// End code by גלעד ברקן

// Code by Bananon  =======================================================================

int result;

int lps[MAX_N][MAX_N];
bool checked[MAX_N][MAX_N];

void check(int start, int length) {
  checked[start][length] = true;
  if (length < result) {
    result = length;
  }
  for (int i = lps[start][length]; i != 0; i = lps[start][i - 1]) {
    int newLength = length - i;
    if (!checked[start][newLength])
      check(start, newLength);
    int newStart = start + i;
    if (!checked[newStart][newLength])
      check(newStart, newLength);
  }
}

int my(string str) {
  int n = str.length();
  for (int l = 0; l < n; l++) {
    int subLength = n - l;
    lps[l][0] = 0;
    checked[l][0] = false;
    for (int i = 1; i < subLength; ++i) {
      int j = lps[l][i - 1];
      while (j > 0 && str[i + l] != str[j + l])
        j = lps[l][j - 1];
      if (str[i + l] == str[j + l]) j++;
      lps[l][i] = j;
      checked[l][i] = false;
    }
  }
  result = n - 1;
  check(0, n - 1);
  return result + 1;
}

// generate =================================================================

bool rndBool() {
  return rand() % 2 == 0;
}

int rnd(int bound) {
  return rand() % bound;
}

void untrim(string & str) {
  int length = rnd(str.length());
  int prefixLength = rnd(str.length()) + 1;
  if (rndBool())
    str.append(str.substr(0, prefixLength));
  else {
    string newStr = str.substr(str.length() - prefixLength, prefixLength);
    newStr.append(str);
    str = newStr;
  }
}

void rndTest(int minTestLength, string s) {
  while (s.length() < minTestLength)
    untrim(s);
  int myAns = my(s);
  int soAns = so(s);
  cout << myAns << " " << soAns << '\n';
  if (soAns != myAns) {
    cout << s;
    exit(0);
  }
}

int main() {
  int minTestLength;
  cin >> minTestLength;
  string seed;
  cin >> seed;
  while (true)
    rndTest(minTestLength, seed);
}

And here's JavaScript code (without the log factor improvement) to show that the recurrence works. (To get the log factor, we replace the inner k loops with a single range query.)

const debug = 1

function precalc(s){
  let m = new Array(s.length)
  for (let i=0; i<s.length; i++){
    m[i] = new Array(s.length)
    for (let j=0; j<s.length; j++){
      // [longest match left, longest match right]
      m[i][j] = [(s[i] == s[j]) & 1, (s[i] == s[j]) & 1]
    }
  }
  
  for (let i=s.length-2; i>=0; i--)
    for (let j=s.length-2; j>=0; j--)
      m[i][j][1] = s[i] == s[j] ? 1 + m[i+1][j+1][1] : 0

  for (let i=1; i<s.length; i++)
    for (let j=1; j<s.length; j++)
      m[i][j][0] = s[i] == s[j] ? 1 + m[i-1][j-1][0] : 0
  
  return m
}

function f(s){
  const m = precalc(s)
  let n = s.length
  let min = s.length
  let dp = new Array(s.length)

  for (let i=0; i<s.length; i++)
    dp[i] = new Array(s.length).fill(0)

  dp[0][s.length-1] = 1
      
  // Right side initialisation
  for (let j=s.length-2; j>=0; j--){
    if (s[0] == s[j+1]){
      let longest = Math.min(j + 1, m[0][j+1][1])
      for (let k=j+1; k<=j+longest; k++)
        dp[0][j] |= dp[0][k]
      if (dp[0][j])
        min = Math.min(min, j + 1)
    }
  }

  // Left side initialisation
  for (let i=1; i<s.length; i++){
    if (s[i-1] == s[s.length-1]){
      let longest = Math.min(s.length - i, m[i-1][s.length-1][0])
      for (let k=i-1; k>=i-longest; k--)
        dp[i][s.length-1] |= dp[k][s.length-1]
      if (dp[i][s.length-1])
        min = Math.min(min, s.length - i)
    }
  }

  for (let i=1; i<=s.length-2; i++){
    for (let j=s.length-2; j>=i; j--){
      // We removed on the right
      if (s[i] == s[j+1]){
        // We are bound by half the current range
        let longest = Math.min(j - i + 1, m[i][j+1][1])
        for (let k=j+1; k<=j+longest; k++)
          dp[i][j] |= dp[i][k]
      }
      // We removed on the left
      if (s[i-1] == s[j]){
        // We are bound by half the current range
        let longest = Math.min(j - i + 1, m[i-1][j][0])
        for (let k=i-1; k>=i-longest; k--)
          dp[i][j] |= dp[k][j]
      }
      if (dp[i][j])
        min = Math.min(min, j - i + 1)
    }
  }

  if (debug){
    let str = ""
    for (let row of dp)
      str += row + "\n"
    console.log(str)
  }

  return min
}

function main(){
  var strs = [
    "caaca",
    "bbabbbba",
    "baabbabaa",
    "bbabbba",
    "bbbabbbbba",
    "abbabaabbab",
    "abbabaabbaba",
    "aabaabaaabaab",
    "bbabbabbb"
  ]

  for (let s of strs){
    let t = new Date
    console.log(s)
    console.log(f(s))
    //console.log((new Date - t)/1000)
    console.log("")
  }
}

main()

117

answered Oct 05 '22 22:10

גלעד ברקן
