Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to generalize my algorithm to detect if one string is a rotation of another

So I've been going through various problems to review for upcoming interviews and one I encountered is determining whether two strings are rotations of each other. Obviously, I'm hardly the first person to solve this problem. In fact, I did discover that my idea for solving this seems similar to the approach taken in this question.

Full disclosure: I do have a related question on Math SE that's focused on the properties from a more mathematical perspective (although it's worth noting that the way that I tried to formulate the ideas behind this there end up being incorrect for reasons that are explained there).

Here's the idea (and this is similar to the approach taken in the linked question): suppose you have a string abcd and the rotation cdab. Clearly, both cd and ab are substrings of cdab, but if you concatenate them together you get abcd.

So basically, a rotation simply entails moving a substring from the end of the string to the beginning (e.g. we constructed cdab from abcd by moving cd from the end of the string to the beginning of the string).

I came up with an approach that works in a very restricted case (if both of the substrings consist of consecutive letters, like they do in the example there), but it fails otherwise (and I give an example of passing and failing cases and inputs/outputs below the code). I'm trying to figure out if it's possible (or even worthwhile) to try to fix it to work in the general case.

    public bool AreRotations(string a, string b)
    {
        if (a == null)
            throw new ArgumentNullException("a");
        else if (b == null)
            throw new ArgumentNullException("b");
        else if (a.Trim().Length == 0)
            throw new ArgumentException("a is empty or consists only of whitespace");
        else if (b.Trim().Length == 0)
            throw new ArgumentException("b is empty or consists only of whitespace");

        // Obviously, if the strings are of different lengths, they can't possibly be rotations of each other
        if (a.Length != b.Length)
            return false;

        int[] rotationLengths = new int[a.Length];

        /* For rotations of length -2, -2, -2, 2, 2, 2, the distinct rotation lengths are -2, 2
         * 
         * In the example I give below of a non-working input, this contains -16, -23, 16, 23
         * 
         * On the face of it, that would seem like a useful pattern, but it seems like this
         * could quickly get out of hand as I discover more edge cases
         */
        List<int> distinctRotationLengths = new List<int>();
        for (int i = 0; i < a.Length; i++)
        {
            rotationLengths[i] = a[i] - b[i];

            if (i == 0)
                distinctRotationLengths.Add(rotationLengths[0]);
            else if (rotationLengths[i] != rotationLengths[i - 1])
            {
                distinctRotationLengths.Add(rotationLengths[i]);
            }
        }

        return distinctRotationLengths.Count == 2;
    }

And now for the sample inputs/outputs:

        StringIsRotation rot = new StringIsRotation();

        // This is the case that doesn't work right - it gives "false" instead of "true"
        bool success = rot.AreRotations("acqz", "qzac");

        // True
        success = rot.AreRotations("abcdef", "cdefab");

        // True
        success = rot.AreRotations("ablm", "lmab");

        // False, but should be true - this is another illustration of the bug
        success = rot.AreRotations("baby", "byba");

        // True
        success = rot.AreRotations("abcdef", "defabc");

        //True
        success = rot.AreRotations("abcd", "cdab");

        // True
        success = rot.AreRotations("abc", "cab");

        // False
        success = rot.AreRotations("abcd", "acbd");

        // This is an odd situation - right now it returns "false" but you could
        // argue about whether that's correct
        success = rot.AreRotations("abcd", "abcd");

Is it possible/worthwhile to salvage this approach and have it still be O(n), or should I just go with one of the approaches described in the post I linked to? (Note that this isn't actually production code or homework, it's purely for my own learning).

Edit: For further clarification based on the comments, there are actually two questions here - first, is this algorithm fixable? Secondly, is it even worth fixing it (or should I just try another approach like one described in the answers or the other question I linked to)? I thought of a few potential fixes but they all involved either inelegant special-case reasoning or making this algorithm O(n^2), both of which would kill the point of the algorithm in the first place.

like image 849
EJoshuaS - Stand with Ukraine Avatar asked Dec 28 '16 22:12

EJoshuaS - Stand with Ukraine


People also ask

How do you check if two strings are mutually rotating?

Suppose two strings are S1 = 'HELLO', and S2 = 'LOHEL' So they are rotation of each other. By rotating HELLO three position to the left it will be LOHEL. To solve this problem, we will concatenate the first string with itself, then check whether the second one is present in the concatenated string or not.

How can you check if one String is a rotation of another String in Java?

temp = str1. str1 2. If str2 is a substring of temp then str1 and str2 are rotations of each other. Example: str1 = "ABACD" str2 = "CDABA" temp = str1.

Are strings rotating?

A String is said to be a rotation of another String, if it has the same length, contains the same characters, and they were rotated around one of the characters. For example, String"bcda" is a rotation of "abcd" but "bdca" is not a rotation of String "abcd".


1 Answers

Let suppose the first string is S and the second is S', clearly if they have different length then we output they are not a rotation of each other. Create a string S''=SS. In fact concatenation of S to itself. Then if S,S' are rotation of each other we find a substring S' in S'' by KMP Algorithm in O(n), otherwise we output they are not a rotation of each other. BTW if you are looking for a fast practical algorithm then instead of KMP use Boyer Moore algorithm.

To address the question more explicit, I'd say that I don't expect an easy algorithm for this special case of string matching problem. So having this background in mind, I don't think an easy modification on your algorithm can work. In fact the field of string matching algorithms is very well developed. If there is a somewhat simpler algorithm than sth like KMP or suffix tree based algorithms, for this special case, then still I think studying those general algorithms can help.

like image 155
Saeed Amiri Avatar answered Oct 05 '22 22:10

Saeed Amiri