How to check whether two lists are circularly identical in Python

First off, this can be done in O(n) in terms of the length of the list. Notice that if you concatenate a list with itself ([1, 2, 3] becomes [1, 2, 3, 1, 2, 3]), the new list contains every possible cyclic rotation of the original as a contiguous sublist.

So all you need to check is whether the list you are searching for appears inside the doubled starting list. In Python you can achieve this in the following way (assuming that the lengths are the same).

list1 = [1, 1, 1, 0, 0]
list2 = [1, 1, 0, 0, 1]
print(' '.join(map(str, list2)) in ' '.join(map(str, list1 * 2)))

Some explanation of my one-liner: list * 2 concatenates a list with itself, map(str, [1, 2]) converts all numbers to strings, and ' '.join() turns a list such as ['1', '2', '111'] into the string '1 2 111'.

As some people pointed out in the comments, the one-liner can give false positives, so to cover all the possible edge cases:

def isCircular(arr1, arr2):
    if len(arr1) != len(arr2):
        return False

    str1 = ' '.join(map(str, arr1))
    str2 = ' '.join(map(str, arr2))
    if len(str1) != len(str2):
        return False

    return str1 in str2 + ' ' + str2
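
The false positive is easy to reproduce. The sketch below uses sample lists of my own choosing (they are not taken from the comments): the one-liner reports a match because the substring starts in the middle of an element, while the length checks in the function reject it. The function names here are illustrative only.

```python
def one_liner(list1, list2):
    # The original one-liner, wrapped in a function so it can be called
    return ' '.join(map(str, list2)) in ' '.join(map(str, list1 * 2))

def is_circular(arr1, arr2):
    # Same logic as isCircular: compare list lengths, then string lengths
    if len(arr1) != len(arr2):
        return False
    str1 = ' '.join(map(str, arr1))
    str2 = ' '.join(map(str, arr2))
    if len(str1) != len(str2):
        return False
    return str1 in str2 + ' ' + str2

# "2 3" occurs inside "12 3 12 3", but [2, 3] is not a rotation of [12, 3]
print(one_liner([12, 3], [2, 3]))     # True (false positive)
print(is_circular([12, 3], [2, 3]))   # False
print(is_circular([12, 3], [3, 12]))  # True (genuine rotation)
```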

P.S.1 When speaking about time complexity, it is worth noting that O(n) is achieved only if the substring search runs in O(n) time. This is not always the case and depends on the implementation in your language (although it can be done in linear time, with KMP for example).
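
If you do not want to depend on how your language implements substring search, the KMP idea can be applied to the lists directly. The following is a minimal sketch, not part of the original answer; `prefix_function` and `is_rotation_kmp` are hypothetical names. It scans the first list twice through modular indexing instead of building a doubled copy.

```python
def prefix_function(pattern):
    # fail[i] = length of the longest proper prefix of pattern[:i + 1]
    # that is also a suffix of it
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail

def is_rotation_kmp(arr1, arr2):
    n = len(arr1)
    if n != len(arr2):
        return False
    if n == 0:
        return True
    fail = prefix_function(arr2)
    k = 0
    # Walk over arr1 + arr1 without copying it; 2n - 1 elements are enough
    for i in range(2 * n - 1):
        x = arr1[i % n]
        while k and x != arr2[k]:
            k = fail[k - 1]
        if x == arr2[k]:
            k += 1
        if k == n:  # all of arr2 matched
            return True
    return False

print(is_rotation_kmp([1, 1, 1, 0, 0], [1, 1, 0, 0, 1]))  # True
```

Because it compares elements rather than characters, this variant has no false positives from multi-digit numbers.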

P.S.2 For people who are wary of string operations and therefore think this answer is not good: what matters is complexity and speed. This algorithm potentially runs in O(n) time and O(n) space, which makes it much better than anything in the O(n^2) domain. To see this for yourself, you can run a small benchmark (it creates a random list, then pops the first element and appends it to the end, thus creating a cyclic list; you are free to do your own manipulations).

from random import random
bigList = [int(1000 * random()) for i in range(10**6)]
bigList2 = bigList[:]
bigList2.append(bigList2.pop(0))

# then test how much time it takes to come up with an answer
from datetime import datetime
startTime = datetime.now()
print(isCircular(bigList, bigList2))
print(datetime.now() - startTime)    # please feel free to use timeit, but it will give similar results

0.3 seconds on my machine. Not really long. Now try to compare this with O(n^2) solutions. While it is running, you could travel from the US to Australia (most probably by cruise ship).


Not knowledgeable enough in Python to answer this in your requested language, but in C/C++, given the parameters of your question, I'd convert the zeros and ones to bits and push them onto the least significant bits of a uint64_t. This will allow you to compare all 55 bits in one fell swoop, in 1 clock.

Wickedly fast, and the whole thing will fit in on-chip caches (209,880 bytes). Hardware support for shifting all 55 list members right simultaneously is available only in a CPU's registers. The same goes for comparing all 55 members simultaneously. This allows for a 1-for-1 mapping of the problem to a software solution. (and using the SIMD/SSE 256 bit registers, up to 256 members if needed) As a result the code is immediately obvious to the reader.

You might be able to implement this in Python, I just don't know it well enough to know if that's possible or what the performance might be.

After sleeping on it a few things became obvious, and all for the better.

1.) It's so easy to spin the circularly linked list using bits that Dali's very clever trick isn't necessary. Inside a 64-bit register, standard bit shifting accomplishes the rotation very simply. In an attempt to make this more Python friendly, the steps below use arithmetic instead of bit ops.

2.) Bit shifting can be accomplished easily using divide by 2.

3.) Checking the end of the list for 0 or 1 can be easily done by modulo 2.

4.) "Moving" a 0 to the head of the list from the tail can be done by dividing by 2. This works because if the zero were actually moved it would make the 55th bit false, which it already is by doing absolutely nothing.

5.) "Moving" a 1 to the head of the list from the tail can be done by dividing by 2 and adding 18,014,398,509,481,984 - which is the value created by marking the 55th bit true and all the rest false.

6.) If a comparison of the anchor and composed uint64_t is TRUE after any given rotation, break and return TRUE.

I would convert the entire array of lists into an array of uint64_ts right up front to avoid having to do the conversion repeatedly.
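
A rough Python rendering of the six steps above, using plain integer arithmetic. This is only a sketch under my own assumptions: `to_int` is a hypothetical helper that packs a list of 0/1 values with the first element as the head bit, and the `length` parameter is added so the function can be tried on short lists.

```python
LIST_LENGTH = 55

def to_int(bits):
    # Pack a list of 0/1 values into one integer, first element = head bit
    value = 0
    for b in bits:
        value = value * 2 + b
    return value

def is_circular_identical(anchor_ring, test_ring, length=LIST_LENGTH):
    # Value with only the head (length-th) bit set; 2**54 when length is 55
    head_bit = 1 << (length - 1)
    for tries in range(1, length + 1):
        if anchor_ring == test_ring:
            return tries            # non-zero means "match found"
        tail = test_ring % 2        # step 3: inspect the tail
        test_ring //= 2             # step 2: shift right by one position
        if tail:
            test_ring += head_bit   # step 5: wrap a 1 around to the head
    return 0                        # no rotation matched

anchor = to_int([1, 1, 1] + [0] * 52)
test_ring = to_int([0] * 10 + [1, 1, 1] + [0] * 42)
print(bool(is_circular_identical(anchor, test_ring)))  # True
```

Python integers are arbitrary precision, so nothing here is limited to 64 bits, but the single-register speed argument does not carry over either.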

After spending a few hours trying to optimize the code and studying the assembly language, I was able to shave 20% off the runtime. I should add that the O/S and MSVC compiler got updated mid-day yesterday as well. For whatever reason(s), the quality of the code the C compiler produced improved dramatically after the update (11/15/2014). Run-time is now ~70 clocks (17 nanoseconds) to compose and compare an anchor ring with all 55 turns of a test ring, and NxN of all rings against all others is done in 12.5 seconds.

This code is so tight all but 4 registers are sitting around doing nothing 99% of the time. The assembly language matches the C code almost line for line. Very easy to read and understand. A great assembly project if someone were teaching themselves that.

Hardware is Haswell i7, MSVC 64-bit, full optimizations.

#include "stdafx.h"
#include <stdint.h>   // uint8_t, uint64_t
#include <string>
#include <memory>
#include <stdio.h>
#include <time.h>

const uint8_t  LIST_LENGTH = 55;    // uint8_t supports full width of SIMD and AVX2
// max left shifts is 32, so must use right shifts to create head_bit
const uint64_t head_bit = (0x8000000000000000 >> (64 - LIST_LENGTH)); 
const uint64_t CPU_FREQ = 3840000000;   // turbo-mode clock freq of my i7 chip

const uint64_t LOOP_KNT = 688275225; // 26235^2 // 1000000000;

// ----------------------------------------------------------------------------
__inline uint8_t is_circular_identical(const uint64_t anchor_ring, uint64_t test_ring)
{
    // By trial and error, try to synch 2 circular lists by holding one constant
    //   and turning the other 0 to LIST_LENGTH positions. Return compare count.

    // Return the number of tries which aligned the circularly identical rings, 
    //  where any non-zero value is treated as a bool TRUE. Return a zero/FALSE,
    //  if all tries failed to find a sequence match. 
    // If anchor_ring and test_ring are equal to start with, return one.

    for (uint8_t i = LIST_LENGTH; i;  i--)
    {
        // This function could be made bool, returning TRUE or FALSE, but
        // as a debugging tool, knowing the try_knt that got a match is nice.
        if (anchor_ring == test_ring) {  // test all 55 list members simultaneously
            return (LIST_LENGTH +1) - i;
        }

        if (test_ring % 2) {    //  ring's tail is 1 ?
            test_ring /= 2;     //  right-shift 1 bit
            // if the ring tail was 1, set head to 1 to simulate wrapping
            test_ring += head_bit;      
        }   else    {           // ring's tail must be 0
            test_ring /= 2;     // right-shift 1 bit
            // if the ring tail was 0, doing nothing leaves head a 0
        }
    }
    // if we got here, they can't be circularly identical
    return 0;
}
// ----------------------------------------------------------------------------
    int main(void)  {
        clock_t start = clock();
        uint64_t anchor, test_ring, i,  milliseconds;
        uint8_t try_knt;

        anchor = 31525197391593472; // bits 55,54,53 set true, all others false
        // Anchor right-shifted LIST_LENGTH/2 represents the average search turns
        test_ring = anchor >> (1 + (LIST_LENGTH / 2)); //  117440512; 

        printf("\n\nRunning benchmarks for %llu loops.", LOOP_KNT);
        start = clock();
        for (i = LOOP_KNT; i; i--)  {
            try_knt = is_circular_identical(anchor, test_ring);
            // The shifting of test_ring below is a test fixture to prevent the 
            //  optimizer from optimizing the loop away and returning instantly
            if (i % 2) {
                test_ring /= 2;
            }   else  {
                test_ring *= 2;
            }
        }
        milliseconds = (uint64_t)(clock() - start);
        printf("\nET for is_circular_identical was %f milliseconds."
                "\n\tLast try_knt was %u for test_ring list %llu", 
                        (double)milliseconds, try_knt, test_ring);

        printf("\nConsuming %7.1f clocks per list.\n",
                (double)((milliseconds * (CPU_FREQ / 1000)) / (uint64_t)LOOP_KNT));

        getchar();
        return 0;
}



Reading between the lines, it sounds as though you're trying to enumerate one representative of each circular equivalence class of strings with 3 ones and 52 zeros. Let's switch from a dense representation to a sparse one (set of three numbers in range(55)). In this representation, the circular shift of s by k is given by the comprehension set((i + k) % 55 for i in s). The lexicographic minimum representative in a class always contains the position 0. Given a set of the form {0, i, j} with 0 < i < j, the other candidates for minimum in the class are {0, j - i, 55 - i} and {0, 55 - j, 55 + i - j}. Hence, we need (i, j) <= min((j - i, 55 - i), (55 - j, 55 + i - j)) for the original to be minimum. Here's some enumeration code.

def makereps():
    reps = []
    for i in range(1, 55 - 1):
        for j in range(i + 1, 55):
            if (i, j) <= min((j - i, 55 - i), (55 - j, 55 + i - j)):
                reps.append('1' + '0' * (i - 1) + '1' + '0' * (j - i - 1) + '1' + '0' * (55 - j - 1))
    return reps
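
As a sanity check on the sparse representation, the classes can also be counted by brute force. This sketch is my own addition; `shift` and `canonical` are hypothetical helpers. It takes the lexicographically smallest sorted tuple over all 55 shifts as the representative of each class.

```python
from itertools import combinations

N = 55

def shift(s, k, n=N):
    # Circular shift of a sparse set of positions by k
    return frozenset((i + k) % n for i in s)

def canonical(s, n=N):
    # Lexicographic minimum over all shifts, as a sorted tuple
    return min(tuple(sorted(shift(s, k, n))) for k in range(n))

classes = {canonical(c) for c in combinations(range(N), 3)}
print(len(classes))  # 477, i.e. C(55, 3) / 55 = 26235 / 55
```

Every representative found this way starts with position 0, which agrees with the observation that the minimum in a class always contains 0.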

Repeat the first array, then use the Z algorithm (O(n) time) to find the second array inside the first.

(Note: you don't have to physically copy the first array. You can just wrap around during matching.)

The nice thing about the Z algorithm is that it's very simple compared to KMP, BM, etc.
However, if you're feeling ambitious, you could do string matching in linear time and constant space -- strstr, for example, does this. Implementing it would be more painful, though.
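
A minimal sketch of the Z algorithm approach, written by me as an illustration (`z_array` and `is_rotation_z` are hypothetical names). For simplicity it does physically build pattern + separator + text + text rather than wrapping around, and it uses a fresh `object()` as a separator that cannot equal any list element.

```python
def z_array(seq):
    # z[i] = length of the longest common prefix of seq and seq[i:]
    n = len(seq)
    z = [0] * n
    if n:
        z[0] = n
    left = right = 0  # current rightmost match window [left, right)
    for i in range(1, n):
        if i < right:
            z[i] = min(right - i, z[i - left])
        while i + z[i] < n and seq[z[i]] == seq[i + z[i]]:
            z[i] += 1
        if i + z[i] > right:
            left, right = i, i + z[i]
    return z

def is_rotation_z(first, second):
    n = len(first)
    if n != len(second):
        return False
    if n == 0:
        return True
    separator = object()  # never equal to an element of either list
    combined = list(second) + [separator] + list(first) + list(first)
    z = z_array(combined)
    # A z-value of n after the separator means second occurs in first + first
    return any(v >= n for v in z[n + 1:])

print(is_rotation_z([1, 1, 1, 0, 0], [1, 1, 0, 0, 1]))  # True
```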


Following up on Salvador Dali's very smart solution, the best way to handle it is to make sure all elements are padded to the same length, and that both lists are of the same length.

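def is_circular_equal(lst1, lst2):
    if len(lst1) != len(lst2):
        return False
    # Build real lists: in Python 3, map() returns a one-shot iterator
    strs1, strs2 = [str(el) for el in lst1], [str(el) for el in lst2]
    # Width of the longest element in either list
    width = max(map(len, strs1 + strs2), default=0)
    # Pad to width + 1 so every element ends with at least one space, and
    # double the list (not the joined string) so the two copies stay separated
    circ_lst = "".join(el.ljust(width + 1) for el in strs1 * 2)
    return "".join(el.ljust(width + 1) for el in strs2) in circ_lst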

No clue if this is faster or slower than AshwiniChaudhary's recommended regex solution in Salvador Dali's answer, which reads:

import re

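def is_circular_equal(lst1, lst2):
    if len(lst1) != len(lst2):
        return False
    # Escape the pattern so characters such as '.' are matched literally
    pattern = r"\b{}\b".format(re.escape(' '.join(map(str, lst2))))
    # Double the list (not the joined string) so a space separates the copies
    return bool(re.search(pattern, ' '.join(map(str, lst1 * 2))))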

Given that you need to do so many comparisons might it be worth your while taking an initial pass through your lists to convert them into some sort of canonical form that can be easily compared?

Are you trying to get a set of circularly-unique lists? If so you can throw them into a set after converting to tuples.

def normalise(lst):
    # Pick the 'maximum' out of all cyclic options
    return max([lst[i:]+lst[:i] for i in range(len(lst))])

a_normalised = map(normalise,a)
a_tuples = map(tuple,a_normalised)
a_unique = set(a_tuples)
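
A small worked run of the same idea, with sample data of my own (the original does not show what `a` contains):

```python
def normalise(lst):
    # Pick the 'maximum' out of all cyclic options
    return max(lst[i:] + lst[:i] for i in range(len(lst)))

# Hypothetical input: four lists forming two circular equivalence classes
a = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 1, 1]]

a_unique = set(tuple(normalise(lst)) for lst in a)
print(sorted(a_unique))  # [(1, 0, 0), (1, 1, 0)]
```

Note that this normalisation builds every rotation of each list, so it costs O(n^2) per list; it pays off when each list is then compared against many others.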

Apologies to David Eisenstat for not spotting his very similar answer.


You can roll one list like this:

list1, list2 = [0,1,1,1,0,0,1,0], [1,0,0,1,0,0,1,1]

str_list1="".join(map(str,list1))
str_list2="".join(map(str,list2))

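def rotate(string_to_rotate):
    # Start a fresh list on every call; a mutable default argument
    # (result=[]) would keep growing across calls
    result = [string_to_rotate]
    for _ in range(1, len(string_to_rotate)):
        # Move the first character of the previous rotation to the end
        result.append(result[-1][1:] + result[-1][0])
    return result

for x in rotate(str_list1):
    if x == str_list2:
        print("lists are rotationally identical")
        break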
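
The same rolling can be done on the lists themselves with `collections.deque`, which avoids the string conversion. This is an alternative sketch of mine, not the method above; `is_rotation_deque` is a hypothetical name. Like the string version, it is O(n^2) in the worst case.

```python
from collections import deque

def is_rotation_deque(list1, list2):
    if len(list1) != len(list2):
        return False
    ring = deque(list1)
    target = deque(list2)
    # Try every rotation; run once even for empty lists
    for _ in range(max(len(list1), 1)):
        if ring == target:
            return True
        ring.rotate(-1)  # move the head element to the tail
    return False

list1, list2 = [0, 1, 1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 0, 0, 1, 1]
print(is_rotation_deque(list1, list2))  # True
```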