Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Finding all subsets of a multiset

Tags:

algorithm

Suppose I have a bag which contains 6 balls (3 white and 3 black). I want to find all possible subsets of a given length, disregarding the order. In the case above, there are only 4 combinations of 3 balls I can draw from the bag:

  • 2 white and 1 black
  • 2 black and 1 white
  • 3 white
  • 3 black

I already found a library in my language of choice that does exactly this, but I find it slow for greater numbers. For example, with a bag containing 15 white, 1 black, 1 blue, 1 red, 1 yellow and 1 green, there are only 32 combinations of 10 balls, but it takes 30 seconds to yield the result.

Is there an efficient algorithm which can find all those combinations that I could implement myself? Maybe this problem is not as trivial as I first thought...

Note: I'm not even sure of the right technic words to express this, so feel free to correct the title of my post.

like image 777
Stamm Avatar asked Oct 03 '13 17:10

Stamm


1 Answers

You can do significantly better than a general choose algorithm. The key insight is to treat each color of balls at the same time, rather than each of those balls one by one.

I created an un-optimized implementation of this algorithm in python that correctly finds the 32 result of your test case in milliseconds:

def multiset_choose(items_multiset, choose_items):
    if choose_items == 0:
        return 1 # always one way to choose zero items
    elif choose_items < 0:
        return 0 # always no ways to choose less than zero items
    elif not items_multiset:
        return 0 # always no ways to choose some items from a set of no items
    elif choose_items > sum(item[1] for item in items_multiset):
        return 0 # always no ways to choose more items than are in the multiset

    current_item_name, current_item_number = items_multiset[0]
    max_current_items = min([choose_items, current_item_number])

    return sum(
        multiset_choose(items_multiset[1:], choose_items - c)
        for c in range(0, max_current_items + 1)
    )

And the tests:

print multiset_choose([("white", 3), ("black", 3)], 3)
# output: 4

print multiset_choose([("white", 15), ("black", 1), ("blue", 1), ("red", 1), ("yellow", 1), ("green", 1)], 10)
# output: 32
like image 154
recursive Avatar answered Nov 10 '22 12:11

recursive