Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Given string s, find the shortest string t, such that, t^m=s

Given string s, find the shortest string t, such that, t^m=s.

Examples:

s="aabbb" => t="aabbb"
s="abab"  => t = "ab"

How fast can it be done?

Of course naively, for every m divides |s|, I can try if substring(s,0,|s|/m)^m = s.

One can figure out the solution in O(d(|s|)n) time, where d(x) is the number of divisors of s. Can it be done more efficiently?

like image 367
Chao Xu Avatar asked Dec 01 '11 20:12

Chao Xu


2 Answers

This is the problem of computing the period of a string. Knuth, Morris and Pratt's sequential string matching algorithm is a good place to get started. This is in a paper entitled "Fast Pattern Matching in Strings" from 1977.

If you want to get fancy with it, then check out the paper "Finding All Periods and Initial Palindromes of a String in Parallel" by Breslauer and Galil in 1991. From their abstract:

An optimal O(log log n) time CRCW-PRAM algorithm for computing all periods of a string is presented. Previous parallel algorithms compute the period only if it is shorter than half of the length of the string. This algorithm can be used to find all initial palindromes of a string in the same time and processor bounds. Both algorithms are the fastest possible over a general alphabet. We derive a lower bound for finding palindromes by a modification of a previously known lower bound for finding the period of a string [3]. When p processors are available the bounds become \Theta(d n p e + log log d1+p=ne 2p).

like image 136
PengOne Avatar answered Nov 08 '22 00:11

PengOne


I really like this thing called the z-algorithm: http://www.utdallas.edu/~besp/demo/John2010/z-algorithm.htm

For every position it calculates the longest substring starting from there, that is also a prefix of the whole string. (in linear time of course).

a   a   b   c   a   a   b   x   a   a   a   z
    1   0   0   3   1   0   0   2   2   1   0

Given this "z-table" it is easy to find all strings that can be exponentiated to the whole thing. Just check for all positions if pos+z[pos] = n.

In our case:

a   b   a   b
    0   2   0

Here pos = 2 gives you 2+z[2] = 4 = n hence the shortest string you can use is the one of length 2.

This will even allow you to find cases where only a prefix of the exponentiated string matches, like:

a   b   c   a
    0   0   1

Here (abc)^2 can be cut down to your original string. But of course, if you don't want matches like this, just go over the divisors only.

like image 43
Thomas Ahle Avatar answered Nov 08 '22 02:11

Thomas Ahle