Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to make a random but partial shuffle in Python?

Instead of a complete shuffle, I am looking for a partial shuffle function in python.

Example : "string" must give rise to "stnrig", but not "nrsgit"

It would be better if I can define a specific "percentage" of characters that have to be rearranged.

Purpose is to test string comparison algorithms. I want to determine the "percentage of shuffle" beyond which an(my) algorithm will mark two (shuffled) strings as completely different.

Update :

Here is my code. Improvements are welcome !

import random

percent_to_shuffle = int(raw_input("Give the percent value to shuffle : "))
to_shuffle = list(raw_input("Give the string to be shuffled : "))

num_of_chars_to_shuffle = int((len(to_shuffle)*percent_to_shuffle)/100)

for i in range(0,num_of_chars_to_shuffle):
    x=random.randint(0,(len(to_shuffle)-1))
    y=random.randint(0,(len(to_shuffle)-1))
    z=to_shuffle[x]
    to_shuffle[x]=to_shuffle[y]
    to_shuffle[y]=z

print ''.join(to_shuffle)
like image 819
384X21 Avatar asked Dec 01 '11 11:12

384X21


2 Answers

This is a problem simpler than it looks. And the language has the right tools not to stay between you and the idea,as usual:

import random

def pashuffle(string, perc=10):
    data = list(string)
    for index, letter in enumerate(data):
        if random.randrange(0, 100) < perc/2:
            new_index = random.randrange(0, len(data))
            data[index], data[new_index] = data[new_index], data[index]
    return "".join(data)
like image 158
jsbueno Avatar answered Oct 15 '22 11:10

jsbueno


Your problem is tricky, because there are some edge cases to think about:

  • Strings with repeated characters (i.e. how would you shuffle "aaaab"?)
  • How do you measure chained character swaps or re arranging blocks?

In any case, the metric defined to shuffle strings up to a certain percentage is likely to be the same you are using in your algorithm to see how close they are.

My code to shuffle n characters:

import random
def shuffle_n(s, n):
  idx = range(len(s))
  random.shuffle(idx)
  idx = idx[:n]
  mapping = dict((idx[i], idx[i-1]) for i in range(n))
  return ''.join(s[mapping.get(x,x)] for x in range(len(s)))

Basically chooses n positions to swap at random, and then exchanges each of them with the next in the list... This way it ensures that no inverse swaps are generated and exactly n characters are swapped (if there are characters repeated, bad luck).

Explained run with 'string', 3 as input:

idx is [0, 1, 2, 3, 4, 5]
we shuffle it, now it is [5, 3, 1, 4, 0, 2]
we take just the first 3 elements, now it is [5, 3, 1]
those are the characters that we are going to swap
s t r i n g
  ^   ^   ^
t (1) will be i (3)
i (3) will be g (5)
g (5) will be t (1)
the rest will remain unchanged
so we get 'sirgnt'

The bad thing about this method is that it does not generate all the possible variations, for example, it could not make 'gnrits' from 'string'. This could be fixed by making partitions of the indices to be shuffled, like this:

import random

def randparts(l):
    n = len(l)
    s = random.randint(0, n-1) + 1
    if s >= 2 and n - s >= 2: # the split makes two valid parts
        yield l[:s]
        for p in randparts(l[s:]):
            yield p
    else: # the split would make a single cycle
        yield l

def shuffle_n(s, n):
    idx = range(len(s))
    random.shuffle(idx)
    mapping = dict((x[i], x[i-1])
        for i in range(len(x))
        for x in randparts(idx[:n]))
    return ''.join(s[mapping.get(x,x)] for x in range(len(s)))
like image 28
fortran Avatar answered Oct 15 '22 11:10

fortran