Generating a list of repetitions regardless of the order

Question

I want to generate combinations that associate indices in a list with "slots". For instance,(0, 0, 1) means that 0 and 1 belong to the same slot while 2 belongs to an other. (0, 1, 1, 1) means that 1, 2, 3 belong to the same slot while 0 is by itself. In this example, 0 and 1 are just ways of identifying these slots but do not carry information for my usage.

Consequently, (0, 0, 0) is absolutely identical to (1, 1, 1) for my purposes, and (0, 0, 1) is equivalent to (1, 1, 0).

The classical cartesian product generates a lot of these repetitions I'd like to get rid of.

This is what I obtain with itertools.product :

>>> LEN, SIZE = (3,1)
>>> list(itertools.product(range(SIZE+1), repeat=LEN))
>>>
[(0, 0, 0),
(0, 0, 1),
(0, 1, 0),
(0, 1, 1),
(1, 0, 0),
(1, 0, 1),
(1, 1, 0),
(1, 1, 1)]

And this is what I'd like to get:

>>> [(0, 0, 0),
(0, 0, 1),
(0, 1, 0),
(0, 1, 1)]

It is easy with small lists but I don't quite see how to do this with bigger sets. Do you have a suggestion?

If it's unclear, please tell me so that I can clarify my question. Thank you!

Edit: based on Sneftel's answer, this function seems to work, but I don't know if it actually yields all the results:

def test():
    for p in product(range(2), repeat=3):
        j=-1
        good = True
        for k in p:
            if k> j and (k-j) > 1:
                good = False
            elif k >j:
                j = k
        if good:
            yield p

Gareth Rees · Accepted Answer

I would start by making the following observations:

The first element of each combination must be 0.
The second element must be 0 or 1.
The third element must be 0, 1 or 2, but it can only be 2 if the second element was 1.

These observations suggest the following algorithm:

def assignments(n, m, used=0):
    """Generate assignments of `n` items to `m` indistinguishable
    buckets, where `used` buckets have been used so far.

        >>> list(assignments(3, 1))
        [(0, 0, 0)]
        >>> list(assignments(3, 2))
        [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1)]
        >>> list(assignments(3, 3))
        [(0, 0, 0), (0, 0, 1), (0, 1, 0), (0, 1, 1), (0, 1, 2)]

    """
    if n == 0:
        yield ()
        return
    aa = list(assignments(n - 1, m, used))
    for first in range(used):
        for a in aa:
            yield (first,) + a
    if used < m:
        for a in assignments(n - 1, m, used + 1):
            yield (used,) + a

This handles your use case (12 items, 5 buckets) in a few seconds:

>>> from timeit import timeit
>>> timeit(lambda:list(assignments(12, 5)), number=1)
4.513746023178101
>>> sum(1 for _ in assignments(12, 5))
2079475

This is substantially faster than the function you give at the end of your answer (the one that calls product and then drops the invalid assignments) would be if it were modified to handle the (12, 5) use case:

>>> timeit(lambda:list(test(12, 5)), number=1)
540.693009853363

fuesika · Answer

Before checking for duplicates, you should harmonize the notation (assuming you don't want to set up some fancy AI): iterate through the lists and assign set-affiliation numbers for differing elements starting at 0, counting upwards. That is, you create a temporary dictionary per line that you are processing.

An exemplary output would be

(0,0,0) -> (0,0,0)
(0,1,0) -> (0,1,0)

but

(1,0,1) -> (0,1,0)

Removing the duplicates can then easily be performed as the problem is reduced to the problem of the solved question at Python : How to remove duplicate lists in a list of list?

Generating a list of repetitions regardless of the order

Tags:

python

algorithm

permutation

combinatorics

user3057865

2 Answers

Gareth Rees

fuesika

Recent Activity

Donate For Us

Generating a list of repetitions regardless of the order

Tags:

python

algorithm

permutation

combinatorics

user3057865

2 Answers

Gareth Rees

fuesika

Related questions

Recent Activity

Donate For Us