Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

When strings are equivalent up to rotation

I have a large number of strings. For my purposes, two strings are equivalent if one is a rotation of the other (e.g. '1234' is equivalent to '3412').

What is an efficient way to process every string exactly once (up to rotation) in Python?

A naive implementation of what I want might look like:

class DuplicateException(Exception): pass
seen = set()
for s in my_strings:
  try:
    s2 = s+s
    for t in seen:

      # Slick method I picked up here in SO
      # for checking whether one string is
      # a rotation of another
      if len(s) == len(t) and t in s2:
        raise DuplicateException()

    seen.add(s)
    process(s)
  except DuplicateException: pass
like image 463
math4tots Avatar asked Mar 03 '13 03:03

math4tots


2 Answers

Pick a canonical way to represent a class of rotated strings (e.g. the lexicographically least rotation among all possible rotations of a string), and work only with canonical representations (canonicalization).

For example:

def canonicalize(s):
    return min(s[i:]+s[:i] for i in xrange(len(s)))

canonical_strings = {canonicalize(s) for s in my_strings}
for cs in canonical_strings:
    process(cs)
like image 67
nneonneo Avatar answered Nov 14 '22 23:11

nneonneo


Maybe it makes sense to rotate your string to a specific value, for example smallest possible rotation, than those smallest rotations are unique, and could be easily put in a set.

Here is a an example implementation, and "rotate_to_smallest" can probably be improved.

my_strings = ['1234', '123', '2341', '4312', '312', '56', '65', '1236']

def rotate_to_smallest(x):
    smallest = x
    for i in xrange(1, len(x)):
        rotation = x[i :] + x[: i]
        if rotation < smallest:
            smallest = rotation
    return smallest

def unique_rotations(my_strings):
    uniques = set(())
    for s in my_strings:
        smallest_rotation = rotate_to_smallest(s)
        if smallest_rotation not in uniques:
            uniques.add(smallest_rotation)
    return uniques

Result:

>>> unique_rotations(my_strings)
set(['1234', '56', '1243', '123', '1236'])
like image 37
Akavall Avatar answered Nov 14 '22 23:11

Akavall