Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Efficient combinations with replacement for multiple iterables, or order-independent product

I'm trying to find a performant solution in Python that works like so:

>>> func([1,2,3], [1,2])
[(1,1), (1,2), (1,3), (2,2), (2,3)]

This is similar to itertools.combinations_with_replacement, except that it can take multiple iterables. It's also similar to itertools.product, except that it omits order-independent duplicate results.

All of the inputs will be prefixes of the same series (i.e. they all start with the same element and follow the same pattern, but might have different lengths).

The function must be able to take any number of iterables as input.

Given a set of lists A, B, C, ..., here is a sketch of an algorithm that generates those results.

assert len(A) <= len(B) <= len(C) <= ...
for i in 0..len(A)
  for j in i..len(B)
    for k in j..len(C)
        .
        .
        .
      yield A[i], B[j], C[k], ...

Things I can't do

  • Use itertools.product and filter the results. This has to be performant.
  • Use recursion. The function overhead would make it slower than using itertools.product and filtering for a reasonable number of iterables.

I suspect there's a way to do this with itertools, but I have no idea what it is.

EDIT: I'm looking for the solution that takes the least time.

EDIT 2: There seems to be some confusion about what I'm trying to optimize. I'll illustrate with an example.

>>> len(list(itertools.product( *[range(8)] * 5 )))
32768
>>> len(list(itertools.combinations_with_replacement(range(8), 5)))
792

The first line gives the number of order-dependent possibilities for rolling 5 8-sided dice. The second gives the number of order-independent possibilities. Regardless of how performant itertools.product is, it'll take 2 orders of magnitude more iterations to get a result than itertools.combinations_with_replacement. I'm trying to find a way to do something similar to itertools.combinations_with_replacement, but with multiple iterables that minimizes the number of iterations, or time performance. (product runs in math whereas combinations_with_replacement runs in math, where M is the number of sides on the die and N is the number of dice)

like image 658
whereswalden Avatar asked Nov 29 '25 16:11

whereswalden


1 Answers

This solution hasn't recursion or filtering. It's trying to produce only ascending sequences of indices so it's usable only for prefixes of same collection. Also it's uses only indices for element identification so it's not enforces elements of series to be comparable or even hashable.

def prefixCombinations(coll,prefixes):
    "produces combinations of elements of the same collection prefixes"
    prefixes = sorted(prefixes) # does not impact result through it's unordered combinations
    n = len(prefixes)
    indices = [0]*n
    while True:            
        yield tuple(coll[indices[i]] for i in range(n))
        #searching backwards for non-maximum index
        for i in range(n-1,-1,-1):
            if indices[i] < prefixes[i] - 1 : break
        # if all indices hits maximum - leave
        else: break
        level = indices[i] + 1
        for i in range(i,n): indices[i] = level

examples are

>>> list(prefixCombinations([1,2,3,4,5], (3,2)))
[[1, 1], [1, 2], [1, 3], [2, 2], [2, 3]]

>>> list(prefixCombinations([1,2,3,4,5], (3,2,5)))
[[1, 1, 1], [1, 1, 2], [1, 1, 3], [1, 1, 4], [1, 1, 5], [1, 2, 2], [1, 2, 3], [1, 2, 4], [1, 2, 5], [1, 3, 3], [1, 3, 4], [1, 3, 5], [2, 2, 2], [2, 2, 3], [2, 2, 4], [2, 2, 5], [2, 3, 3], [2, 3, 4], [2, 3, 5]]

>>> from itertools import combinations_with_replacement
>>> tuple(prefixCombinations(range(10),[10]*4)) == tuple(combinations_with_replacement(range(10),4))
True
like image 193
Odomontois Avatar answered Dec 01 '25 06:12

Odomontois



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!