
Finding periodic strings using string functions

I'm looking for a way to check if a string is periodic or not using JavaScript.

A sample string that should match is 11223331122333, whereas 10101 should not match.

Coming from Python, I used the regex

/^(.+?)\1+$/

But it is quite slow. Are there any string methods that can do the trick?

Bhargav Rao asked Dec 31 '15 13:12





1 Answer

The idea of the code below is to consider substrings of every length that divides the length of the original string evenly, and to check whether they repeat across the whole string. A simple method is to test every candidate divisor n from 1 to the square root of the length: n is a divisor if length / n is an integer, and that quotient is then the complementary divisor. E.g., for a string of length 100 the divisors are 1, 2, 4, 5, 10, and the complementary divisors are 100 (not useful as a substring length because the substring would appear only once), 50, 25, 20 (and 10, which we already found).

    function substr_repeats(str, sublen, subcount) {
       for (var c = 0; c < sublen; c++) {
          var chr = str.charAt(c);
          for (var s = 1; s < subcount; s++) {
             if (chr != str.charAt(sublen * s + c)) {
                return false;
             }
          }
       }
       return true;
    }

    function is_periodic(str) {
       var len = str.length;
       if (len < 2) {
          return false;
       }
       if (substr_repeats(str, 1, len)) {
          return true;
       }
       var sqrt_len = Math.sqrt(len);
       for (var n = 2; n <= sqrt_len; n++) { // n: candidate divisor
          var m = len / n; // m: candidate complementary divisor
          if (Math.floor(m) == m) {
             if (substr_repeats(str, m, n) || n != m && substr_repeats(str, n, m)) {
                return true;
             }
          }
       }
       return false;
    }
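The divisor enumeration on its own can be sketched as follows. This is only an illustration of the pairing of divisors and complementary divisors described above; divisor_pairs() is a hypothetical helper name and is not part of the answer's code.

```javascript
// Hypothetical helper (an assumption for illustration): lists each divisor n of
// len with 1 <= n <= sqrt(len), together with its complementary divisor len / n.
function divisor_pairs(len) {
   var pairs = [];
   for (var n = 1; n * n <= len; n++) {
      if (len % n == 0) {
         pairs.push([n, len / n]);
      }
   }
   return pairs;
}

console.log(JSON.stringify(divisor_pairs(100)));
// [[1,100],[2,50],[4,25],[5,20],[10,10]]
```

Each pair yields up to two candidate substring lengths, which is why the loop in is_periodic() only needs to run up to the square root of the length.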

Unfortunately, JavaScript has no String method that compares against a substring of another string in place, which is why substr_repeats() compares character by character (in C, the comparison would be strncmp(str1, str2 + offset, length)).
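If allocating temporary strings is acceptable, a possible alternative is to rebuild the string from its first block with String.prototype.slice and String.prototype.repeat (ES2015) and compare the result. This is a swapped-in technique, not the answer's method; repeats_by_rebuild() is a hypothetical name, and whether it is faster than the character loop has not been measured here.

```javascript
// Hypothetical alternative to substr_repeats() (an assumption, not the answer's
// code): rebuilds the string from its first sublen characters and compares.
// It allocates a copy of the whole string on every call.
function repeats_by_rebuild(str, sublen, subcount) {
   return str.slice(0, sublen).repeat(subcount) === str;
}

console.log(repeats_by_rebuild("11223331122333", 7, 2)); // true
console.log(repeats_by_rebuild("10101", 1, 5));          // false
```

Unlike substr_repeats(), this compares against the whole string, so the two agree only when sublen * subcount equals str.length, which is how is_periodic() calls it.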


Say your string has a length of 120 and consists of a substring of length 6 repeated 20 times. You can also look at it as consisting of a sublength (length of substring) of 12 repeated 10 times, a sublength of 24 repeated 5 times, a sublength of 30 repeated 4 times, or a sublength of 60 repeated 2 times (the sublengths are given by the prime factors of 20 (2*2*5) applied in different combinations to 6). Now, if you check whether your string consists of a sublength of 60 repeated 2 times, and the check fails, you can also be sure that it won't consist of any repeated sublength that is a divisor (i.e., a combination of prime factors) of 60, including 6.

In other words, many checks made by the above code are redundant. E.g., in the case of length 120, the above code checks (luckily failing quickly most of the time) the following sublengths: 1, 2, 3, 4, 5, 6, 8, 10, 12, 15, 20, 24, 30, 40, 60 (in this order: 1, 60, 2, 40, 3, 30, 4, 24, 5, 20, 6, 15, 8, 12, 10). Of these, only the following are necessary: 24, 40, 60. These are 2*2*2*3, 2*2*2*5, 2*2*3*5, i.e., the combinations of primes of 120 (2*2*2*3*5) with one of each (nonrepeating) prime taken out, or, if you prefer, 120/5, 120/3, 120/2.

So, forgetting for a moment that efficient prime factorization is not a simple task, we can restrict our checks of repeating substrings to p substrings of sublength length/p, for each prime factor p of length. The following is the simplest nontrivial implementation:

    // substr_repeats(str, sublen, subcount): same as in the first version above

    // Returns the distinct prime factors of n in ascending order.
    function distinct_primes(n) {
       var primes = n % 2 ? [] : [2];
       while (n % 2 == 0) {
          n /= 2;
       }
       for (var p = 3; p * p <= n; p += 2) {
          if (n % p == 0) {
             primes.push(p);
             n /= p;
             while (n % p == 0) {
                n /= p;
             }
          }
       }
       if (n > 1) {
          primes.push(n);
       }
       return primes;
    }

    function is_periodic(str) {
       var len = str.length;
       if (len < 2) {
          return false; // also keeps distinct_primes(0) from looping forever
       }
       var primes = distinct_primes(len);
       // Largest prime first, i.e., shortest candidate sublength first.
       for (var i = primes.length - 1; i >= 0; i--) {
          var sublen = len / primes[i];
          if (substr_repeats(str, sublen, len / sublen)) {
             return true;
          }
       }
       return false;
    }
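To make the reduction concrete, the sketch below lists only the sublengths that need checking for a given length, i.e., length/p for each distinct prime factor p. candidate_sublengths() is a hypothetical helper introduced for illustration; it uses plain trial division and is not part of the answer's code.

```javascript
// Hypothetical helper (an assumption for illustration): returns len / p for
// each distinct prime factor p of len, found by simple trial division.
function candidate_sublengths(len) {
   var result = [];
   var n = len;
   for (var p = 2; p * p <= n; p++) {
      if (n % p == 0) {
         result.push(len / p);
         while (n % p == 0) {
            n /= p; // strip every copy of p so only distinct primes are reported
         }
      }
   }
   if (n > 1) {
      result.push(len / n); // whatever remains is itself a prime factor
   }
   return result;
}

console.log(candidate_sublengths(120).join(", ")); // 60, 40, 24
```

For length 120 this leaves three checks instead of the fifteen made by the first version.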

Trying out the prime-factor version on my Linux PC, I had a surprise: on Firefox it was much faster than the first version, but on Chromium it was slower, becoming faster only for strings with lengths in the thousands. Eventually I found out that the problem was related to the array that distinct_primes() creates and passes to is_periodic(). The solution was to get rid of the array by merging these two functions. The code is below and the test results are on http://jsperf.com/periodic-strings-1/5

    // substr_repeats(str, sublen, subcount): same as in the first version at the top

    function is_periodic(str) {
       var len = str.length;
       if (len < 2) {
          return false; // without this guard the empty string would be reported as periodic
       }
       var n = len; // n: part of len that is still unfactored
       if (n % 2 == 0) {
          n /= 2;
          if (substr_repeats(str, n, 2)) {
             return true;
          }
          while (n % 2 == 0) {
             n /= 2;
          }
       }
       for (var p = 3; p * p <= n; p += 2) {
          if (n % p == 0) {
             if (substr_repeats(str, len / p, p)) {
                return true;
             }
             n /= p;
             while (n % p == 0) {
                n /= p;
             }
          }
       }
       if (n > 1) { // the remaining n is the largest prime factor of len
          if (substr_repeats(str, len / n, n)) {
             return true;
          }
       }
       return false;
    }

Please remember that the timings collected by jsperf.com are absolute, and that different experimenters with different machines will contribute to different combinations of channels. To compare two JavaScript engines reliably, you need to edit a new private version of the experiment.
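For a rough local comparison on a single machine, a minimal timing loop along these lines may be enough. This is only a sketch: time_it() and all_same() are hypothetical names, all_same() is a placeholder for whichever is_periodic() version is being measured, and Date.now() has only millisecond resolution, so the iteration count has to be large.

```javascript
// Hypothetical harness (an assumption, not part of the answer): calls fn(input)
// `iterations` times and reports the elapsed wall-clock time in milliseconds.
function time_it(fn, input, iterations) {
   var true_count = 0; // use the results so the calls are harder to optimize away
   var start = Date.now();
   for (var i = 0; i < iterations; i++) {
      if (fn(input)) {
         true_count++;
      }
   }
   return { elapsed_ms: Date.now() - start, true_count: true_count };
}

// Placeholder function under test; substitute one of the is_periodic() versions.
function all_same(str) {
   for (var i = 1; i < str.length; i++) {
      if (str.charAt(i) != str.charAt(0)) {
         return false;
      }
   }
   return true;
}

var timing = time_it(all_same, "aaaaaaaaaaaa", 100000);
console.log(timing.elapsed_ms + " ms for " + timing.true_count + " matching calls");
```

Running the same script in each engine on the same machine gives numbers that can be compared with each other, though not with anyone else's.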

Walter Tross answered Sep 21 '22 16:09