Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do I check if a string is entirely made of the same substring?

People also ask

How do you check if a string contains a specific substring?

You can use contains(), indexOf() and lastIndexOf() method to check if one String contains another String in Java or not. If a String contains another String then it's known as a substring. The indexOf() method accepts a String and returns the starting position of the string if it exists, otherwise, it will return -1.

How do you check if a string is all the same character?

To find whether a string has all the same characters. Traverse the whole string from index 1 and check whether that character matches the first character of the string or not. If yes, then match until string size. If no, then break the loop.

How do I find a repeated pattern in a string?

If you use regex, you only need one line: String repeated = str. replaceAll("(.


There’s a nifty little theorem about strings like these.

A string consists of the same pattern repeated multiple times if and only if the string is a nontrivial rotation of itself.

Here, a rotation means deleting some number of characters from the front of the string and moving them to the back. For example, the string hello could be rotated to form any of these strings:

hello (the trivial rotation)
elloh 
llohe 
lohel 
ohell 

To see why this works, first, assume that a string consists of k repeated copies of a string w. Then deleting the first copy of the repeated pattern (w) from the front of the string and tacking it onto the back will give back the same string. The reverse direction is a bit trickier to prove, but the idea is that if you rotate a string and get back what you started with, you can apply that rotation repeatedly to tile the string with multiple copies of the same pattern (that pattern being the string you needed to move to the end to do the rotation).

Now the question is how to check whether this is the case. For that, there’s another beautiful theorem we can use:

If x and y are strings of the same length, then x is a rotation of y if and only if x is a substring of yy.

As an example, we can see that lohel is a rotation of hello as follows:

hellohello
   ^^^^^

In our case, we know that every string x will always be a substring of xx (it’ll appear twice, once at each copy of x). So basically we just need to check if our string x is a substring of xx without allowing it to match at the first or halfway character. Here’s a one-liner for that:

function check(str) {
    return (str + str).indexOf(str, 1) !== str.length;
}

Assuming indexOf is implemented using a fast string matching algorithm, this will run in time O(n), where n is the length of the input string.

Hope this helps!


You can do it by a capturing group and backreference. Just check it's the repetition of the first captured value.

function check(str) {
  return /^(.+)\1+$/.test(str)
}

console.log(check('aa')) //true
console.log(check('aaa')) //true
console.log(check('abcabcabc')) //true
console.log(check('aba')) //false
console.log(check('ababa')) //false

In the above RegExp:

  1. ^ and $ stands for start and end anchors to predict the position.
  2. (.+) captures any pattern and captures the value(except \n).
  3. \1 is backreference of first captured value and \1+ would check for repetition of captured value.

Regex explanation here

For RegExp debugging use: https://regex101.com/r/pqlAuP/1/debugger

Performance : https://jsperf.com/reegx-and-loop/13


Perhaps the fastest algorithmic approach is building a Z-function in linear time:

The Z-function for this string is an array of length n where the i-th element is equal to the greatest number of characters starting from the position i that coincide with the first characters of s.

In other words, z[i] is the length of the longest common prefix between s and the suffix of s starting at i.

C++ implementation for reference:

vector<int> z_function(string s) {
    int n = (int) s.length();
    vector<int> z(n);
    for (int i = 1, l = 0, r = 0; i < n; ++i) {
        if (i <= r)
            z[i] = min (r - i + 1, z[i - l]);
        while (i + z[i] < n && s[z[i]] == s[i + z[i]])
            ++z[i];
        if (i + z[i] - 1 > r)
            l = i, r = i + z[i] - 1;
    }
    return z;
}

JavaScript implementation
Added optimizations - building a half of z-array and early exit

function z_function(s) {
  var n = s.length;
  var z = Array(n).fill(0);
  var i, l, r;
  //for our task we need only a half of z-array
  for (i = 1, l = 0, r = 0; i <= n/2; ++i) {
    if (i <= r)
      z[i] = Math.min(r - i + 1, z[i - l]);
    while (i + z[i] < n && s[z[i]] == s[i + z[i]])
      ++z[i];

      //we can check condition and return here
     if (z[i] + i === n && n % i === 0) return true;
    
    if (i + z[i] - 1 > r)
      l = i, r = i + z[i] - 1;
  }
  return false; 
  //return z.some((zi, i) => (i + zi) === n && n % i === 0);
}
console.log(z_function("abacabacabac"));
console.log(z_function("abcab"));

Then you need to check indexes i that divide n. If you find such i that i+z[i]=n then the string s can be compressed to the length i and you can return true.

For example, for

string s= 'abacabacabac'  with length n=12`

z-array is

(0, 0, 1, 0, 8, 0, 1, 0, 4, 0, 1, 0)

and we can find that for

i=4
i+z[i] = 4 + 8 = 12 = n
and
n % i = 12 % 4 = 0`

so s might be represented as substring of length 4 repeated three times.


I read the answer of gnasher729 and implemented it. The idea is that if there are any repetitions, then there must be (also) a prime number of repetitions.

function* primeFactors (n) {
    for (var k = 2; k*k <= n; k++) {
        if (n % k == 0) {
            yield k
            do {n /= k} while (n % k == 0)
        }
    }
    if (n > 1) yield n
}

function check (str) {
    var n = str.length
    primeloop:
    for (var p of primeFactors(n)) {
        var l = n/p
        var s = str.substring(0, l)
        for (var j=1; j<p; j++) {
            if (s != str.substring(l*j, l*(j+1))) continue primeloop
        }
        return true
    }
    return false
}

A slightly different algorithm is this:

function check (str) {
    var n = str.length
    for (var p of primeFactors(n)) {
        var l = n/p
        if (str.substring(0, n-l) == str.substring(l)) return true
    }
    return false
}

I've updated the jsPerf page that contains the algorithms used on this page.


Assume the string S has length N and is made of duplicates of the substring s, then the length of s divides N. For example, if S has length 15, then the substring has length 1, 3, or 5.

Let S be made of (p*q) copies of s. Then S is also made of p copies of (s, repeated q times). We have therefore two cases: If N is prime or 1, then S can only be made of copies of the substring of length 1. If N is composite, then we only need to check substrings s of length N / p for primes p dividing the length of S.

So determine N = the length of S, then find all its prime factors in time O (sqrt (N)). If there is only one factor N, check if S is the same string repeated N times, otherwise for each prime factor p, check if S consists of p repeations of the first N / p characters.


I think a recursive function might be very fast as well. The first observation is that the maximum repeated pattern length is half as long as the total string. And we could just test all possible repeated pattern lengths: 1, 2, 3, ..., str.length/2

The recursive function isRepeating(p,str) tests if this pattern is repeated in str.

If str is longer than the pattern, the recursion requires the first part (same length as p) to be a repetition as well as the remainder of str. So str is effectively broken up into pieces of length p.length.

If the tested pattern and str are of equal size, recursion ends here, successfully.

If the length is different (happens for "aba" and pattern "ab") or if the pieces are different, then false is returned, propagating up the recursion.

function check(str)
{
  if( str.length==1 ) return true; // trivial case
  for( var i=1;i<=str.length/2;i++ ) { // biggest possible repeated pattern has length/2 characters

    if( str.length%i!=0 ) continue; // pattern of size i doesn't fit
    
    var p = str.substring(0, i);
    if( isRepeating(p,str) ) return true;
  }
  return false;
}


function isRepeating(p, str)
{
  if( str.length>p.length ) { // maybe more than 2 occurences

    var left = str.substring(0,p.length);
    var right = str.substring(p.length, str.length);
    return left===p && isRepeating(p,right);
  }
  return str===p; 
}

console.log(check('aa')) //true
console.log(check('aaa')) //true 
console.log(check('abcabcabc')) //true
console.log(check('aba')) //false
console.log(check('ababa')) //false

Performance: https://jsperf.com/reegx-and-loop/13