Is there any efficient algorithms to check whether a binary string is periodic or not?
Let S be a binary string and H be the set of sub strings of S. Then S is said to be periodic if it can be obtained by concatenating one or more times, at least one h in H, also h != S .
A binary string is a sequence of bytes. Unlike a character string which usually contains text data, a binary string is used to hold non-traditional data such as pictures. The length of a binary string is the number of bytes in the sequence. A binary string has a CCSID of 65535.
Here are some examples: The empty string ε is a binary string. ε is in both Σ* and Σ**. 0 and 1 are finite binary strings, and are in both Σ* and Σ**, as are all finite binary strings.
Definition of a binary period : The period of this string is the smallest positive integer P such that: P ≤ Q / 2 and S[K] = S[K+P] for 0 ≤ K < Q − P. For example, 7 is the period of “abracadabracadabra”.
A binary string is a sequence of octets (or bytes). Binary strings are distinguished from characters strings by two characteristics: First, binary strings specifically allow storing octets of zero value and other "non-printable" octets (defined as octets outside the range 32 to 126).
Initial string S with length Len. Double the string (really we need S + half of S). Search for occurrence of initial string S in doubled string SS, beginning from 2nd position, ending at Len/2 + 1. If such occurence exists with position P, then S is periodic with period P - 1.
S = abbaabbaabba Len = 12 SS = abbaabbaabbaabbaabbaabba
Searching from 2nd to 7th position, S found at P=5, period = 4
S = abaabaabaabb SS = abaabaabaabbabaabaabaabb
S doesn't occur in SS (except for 1 and L+1), no period exists
P.S. Note due to useful Venkatesh comment: We need to add minimal possible periodic unit, it's length is L/2 for even-size strings, and maximum divisor of L for odd-size strings (if length is prime number, string cannot be periodic). Simple factorization methods have O(N^1/2) complexity, more complex - O(N^1/4), so sometimes it is worth to factorize length to avoid unnecessary comparison of long strings.
I'm pretty sure it is possible to improve on this, but I would have started by breaking the length of S (I'll call that L) to prime factors, and checking for a period of length of S/f for each prime factor f (len(h) must divide len(S) and I'm since not looking for the shortest possible h, prime L/len(h) is enough).
As to improvements, random check order would help in some circumstances (to prevent constructing input for worse case scenarios, etc.).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With