Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Shuffling combinations without converting iterable (itertools.combinations) to list

The following simple code gives me the possible combinations of length 3 of 200 elements.

from itertools import combinations
comb = combinations( range(200), 3 )

I want to get the combinations in a random order in order to pick the first N combinations. However, if I convert comb to a list and shuffle it as following, I may get a memory error because the list might contain too many elements:

comb = list(comb) # This might be huge and give a memory error 
random.shuffle(comb)
N = 10
comb = comb[:10] # get only the first N random combinations

Is there any other way to get N random combinations ? (i.e., not in the order generated by itertools.combinations).

like image 466
eLearner Avatar asked Apr 21 '16 20:04

eLearner


1 Answers

There are C(200, 3) = 1313400 possible combinations. As you also mentioned, this number can easily get out of hand due to the combinatorial explosion. For example, if you choose 4 instead of 3 elements, the number of combinations will be approximately 50 times larger (64684950). Instead of randomly selecting from these combinations, you can randomly build possible combinations.

To build those combinations, you can use random.sample from the random library. random.sample(range(200), 3) will randomly generate one of those 1313400 combinations. If you call it again, it will generate another combination.

There are two issues:

  1. Order is important in random.sample ([1, 2, 3] is different than [1, 3, 2]). In combinations, it is not. To solve that, you can use sorted().
  2. random.sample will independently generate the next 3 numbers. Therefore, the combinations generated in different iterations could be the same. Although it is very unlikely for this example (≈0.0000343), you can use a set to store the combinations so that only unique combinations will be stored.

The following will generate 10 different combinations:

import random
combs = set()
N = 10
while len(combs) < N:
    combs.add(tuple(sorted(random.sample(range(200), 3))))
like image 137
ayhan Avatar answered Nov 03 '22 22:11

ayhan