Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Randomly select subset of all combination in Python

I can construct a list of all combinations of n-length binary values itertools.list(product([0, 1], repeat=n)) when n is small.

1000
0100
0110
1001
 .
 .
 .

How can I randomly select a subset of the list above without first constructing a massive combination list when n is big?

Let say I want to randomly pick 1 million combinations without replacement when n = 30 (2^30 combinations in total)

I looked at an extended function from itertools http://docs.python.org/2/library/itertools.html#recipes

def random_product(*args, **kwds):
    "Random selection from itertools.product(*args, **kwds)"
    pools = map(tuple, args) * kwds.get('repeat', 1)
    return tuple(random.choice(pool) for pool in pools)

but it only returns once at a time. Should I do a loop of this function until I get 1 million of unique combinations? or there is a better way. Thanks!

like image 687
bliu Avatar asked Mar 07 '23 04:03

bliu


1 Answers

You could think about the problem in another way. Essentially you just want 1 million random values between 0 and 2^30.

import random

num_selections = 1000000
range = 2 ** 30

def make_set(n, max):
  result = set()
  while(len(result) < n):
    rand = bin(random.randrange(max)) # converting to binary
    result.add(rand)
  return result

s = make_set(num_selections, range)

This runs in about 2 seconds on my machine. This method would not be efficient if n is roughly equal to max. But 1000000 / (2^30) ~= 0.000931, so it works fine.

Edit:

@user2285236's solution is more concise:

import random
random_group = random.sample(range(2**30), 10**6)
random_group = [bin(x) for x in random_group] # convert all to binary
like image 50
Aaron Krajeski Avatar answered Mar 16 '23 21:03

Aaron Krajeski