Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Iterate over all lists with one duplicate

Tags:

python

Consider all lists of length n where the values are integers from the range 0 to n-1. I would like to iterate as quickly as possible over every possible such list that has exactly one duplicate. If n = 2 then the possible lists are:

00
11

If n = 3 they are:

001
002
110
112
220
221
100
200
011
211
022
122
010
020
101
121
202
212

The total number of such lists is n! * (n choose 2) so I would like to avoid storing them all in memory if at all possible.

For each list I want to call a function that calculates a function of the list.
What's a good way to do this?

Benchmarks On my computer I get the following benchmark results

timeit all(one_duplicate(6))
100 loops, best of 3: 6.87 ms per loops
timeit all(one_duplicate(7))
10 loops, best of 3: 59 ms per loop
timeit all(one_duplicate(8))
1 loops, best of 3: 690 ms per loop
timeit all(one_duplicate(9))
1 loops, best of 3: 7.58 s per loop

and

timeit all(duplicates(range(6)))
10 loops, best of 3: 22.8 ms per loop
timeit all(duplicates(range(7)))
1 loops, best of 3: 416 ms per loop    
timeit all(duplicates(range(8)))
1 loops, best of 3: 9.78 s per loop

and

timeit all(all_doublet_tuples(6))
100 loops, best of 3: 6.36 ms per loop
timeit all(all_doublet_tuples(7))
10 loops, best of 3: 60 ms per loop
timeit all(all_doublet_tuples(8))
1 loops, best of 3: 542 ms per loop
timeit all(all_doublet_tuples(9))
1 loops, best of 3: 7.27 s per loop

I declare all_doublet_tuples and one_duplicate to be equal first at the moment (allowing for noise in the tests).

like image 230
Majid Avatar asked Dec 15 '22 14:12

Majid


2 Answers

Think of any of the desired output tuples in this way: It was derived by duplicating the first (lets say from the left) doublet elem and inserting it at the position of the second doublet.

Therefore my solution is basically this two-step process:

  1. generate all n-1 permutations of range(n)
  2. for each tuple from 1.:
    take each single element and insert it at each position ahead (to avoid duplicates) including at the very end

import itertools as it

# add_doublet accomplishes step 2
def add_doublet(t):
    for i in range(len(t)):
        for j in range(i+1, len(t)+1):
            yield t[:j] + (t[i],) + t[j:]


def all_doublet_tuples(n):
    for unique_tuple in it.permutations(range(n), n-1):
        for doublet_tuple in add_doublet(unique_tuple):
            yield doublet_tuple



from pprint import pprint

n = 3
pprint(list(all_doublet_tuples(n)))

I do not print the output here cause its kind of lenghty ... and the case for n=3 is boring.

like image 70
tzelleke Avatar answered Dec 21 '22 09:12

tzelleke


import itertools

def one_duplicate(n):
    nrange = list(range(n))
    for x in nrange:
        without_x = nrange[:x] + nrange[x+1:]
        for i, j in itertools.combinations(nrange, 2):
            for others in map(list, itertools.permutations(without_x, n-2)):
                others.insert(i, x)
                others.insert(j, x)
                yield others

>>> list(one_duplicate(2))
[[0, 0], [1, 1]]
>>> list(one_duplicate(3))
[[0, 0, 1], [0, 0, 2], [0, 1, 0], [0, 2, 0], [1, 0, 0], [2, 0, 0], [1, 1, 0], [1, 1, 2], [1, 0, 1], [1, 2, 1], [0, 1, 1], [2, 1, 1], [2, 2, 0], [2, 2, 1], [2, 0, 2], [2, 1, 2], [0, 2, 2], [1, 2, 2]]
>>> list(one_duplicate(4))
[[0, 0, 1, 2], [0, 0, 1, 3], [0, 0, 2, 1], [0, 0, 2, 3], [0, 0, 3, 1], [0, 0, 3, 2], [0, 1, 0, 2], [0, 1, 0, 3], [0, 2, 0, 1], [0, 2, 0, 3], [0, 3, 0, 1], [0, 3, 0, 2], [0, 1, 2, 0], [0, 1, 3, 0], [0, 2, 1, 0], [0, 2, 3, 0], [0, 3, 1, 0], [0, 3, 2, 0], [1, 0, 0, 2], [1, 0, 0, 3], [2, 0, 0, 1], [2, 0, 0, 3], [3, 0, 0, 1], [3, 0, 0, 2], [1, 0, 2, 0], [1, 0, 3, 0], [2, 0, 1, 0], [2, 0, 3, 0], [3, 0, 1, 0], [3, 0, 2, 0], [1, 2, 0, 0], [1, 3, 0, 0], [2, 1, 0, 0], [2, 3, 0, 0], [3, 1, 0, 0], [3, 2, 0, 0], [1, 1, 0, 2], [1, 1, 0, 3], [1, 1, 2, 0], [1, 1, 2, 3], [1, 1, 3, 0], [1, 1, 3, 2], [1, 0, 1, 2], [1, 0, 1, 3], [1, 2, 1, 0], [1, 2, 1, 3], [1, 3, 1, 0], [1, 3, 1, 2], [1, 0, 2, 1], [1, 0, 3, 1], [1, 2, 0, 1], [1, 2, 3, 1], [1, 3, 0, 1], [1, 3, 2, 1], [0, 1, 1, 2], [0, 1, 1, 3], [2, 1, 1, 0], [2, 1, 1, 3], [3, 1, 1, 0], [3, 1, 1, 2], [0, 1, 2, 1], [0, 1, 3, 1], [2, 1, 0, 1], [2, 1, 3, 1], [3, 1, 0, 1], [3, 1, 2, 1], [0, 2, 1, 1], [0, 3, 1, 1], [2, 0, 1, 1], [2, 3, 1, 1], [3, 0, 1, 1], [3, 2, 1, 1], [2, 2, 0, 1], [2, 2, 0, 3], [2, 2, 1, 0], [2, 2, 1, 3], [2, 2, 3, 0], [2, 2, 3, 1], [2, 0, 2, 1], [2, 0, 2, 3], [2, 1, 2, 0], [2, 1, 2, 3], [2, 3, 2, 0], [2, 3, 2, 1], [2, 0, 1, 2], [2, 0, 3, 2], [2, 1, 0, 2], [2, 1, 3, 2], [2, 3, 0, 2], [2, 3, 1, 2], [0, 2, 2, 1], [0, 2, 2, 3], [1, 2, 2, 0], [1, 2, 2, 3], [3, 2, 2, 0], [3, 2, 2, 1], [0, 2, 1, 2], [0, 2, 3, 2], [1, 2, 0, 2], [1, 2, 3, 2], [3, 2, 0, 2], [3, 2, 1, 2], [0, 1, 2, 2], [0, 3, 2, 2], [1, 0, 2, 2], [1, 3, 2, 2], [3, 0, 2, 2], [3, 1, 2, 2], [3, 3, 0, 1], [3, 3, 0, 2], [3, 3, 1, 0], [3, 3, 1, 2], [3, 3, 2, 0], [3, 3, 2, 1], [3, 0, 3, 1], [3, 0, 3, 2], [3, 1, 3, 0], [3, 1, 3, 2], [3, 2, 3, 0], [3, 2, 3, 1], [3, 0, 1, 3], [3, 0, 2, 3], [3, 1, 0, 3], [3, 1, 2, 3], [3, 2, 0, 3], [3, 2, 1, 3], [0, 3, 3, 1], [0, 3, 3, 2], [1, 3, 3, 0], [1, 3, 3, 2], [2, 3, 3, 0], [2, 3, 3, 1], [0, 3, 1, 3], [0, 3, 2, 3], [1, 3, 0, 3], [1, 3, 2, 3], [2, 3, 0, 3], [2, 3, 1, 3], [0, 1, 3, 3], [0, 2, 3, 3], [1, 0, 3, 3], [1, 2, 3, 3], [2, 0, 3, 3], [2, 1, 3, 3]]

Here is a description of the algorithm:

  • Find all the index pairs using itertools.combinations(nrange, 2)
  • For each number x in the range, get all permutations of length n-2 for the entire range except for x
  • In each of these permutations, insert x at each of the indices found in the first step and yield the result.

Timing comparisons between my answer and Peter Stahl's:

# Just showing they both get the same result
In [23]: set(peter(range(6))) == set(tuple(x) for x in fj(6))
Out[23]: True

In [24]: %timeit list(peter(range(6)))
10 loops, best of 3: 27.3 ms per loop

In [25]: %timeit list(fj(6))
100 loops, best of 3: 8.32 ms per loop

In [26]: %timeit list(peter(range(7)))
1 loops, best of 3: 506 ms per loop

In [27]: %timeit list(fj(7))
10 loops, best of 3: 81 ms per loop
like image 27
Andrew Clark Avatar answered Dec 21 '22 11:12

Andrew Clark