How to generate all set combinations in a random order

First off, I'm not even sure I'm using the right terminology, as I haven't found anything similar (especially since I don't know which keywords to search for).

The problem: There is a population of people, and I want to assign them into groups. I have a set of rules to give each assignation a score. I want to find the best one (or at least a very good one).

For example, with a population of four {A,B,C,D} and assigning to two groups of two, the possible assignations are:

{A,B},{C,D}

{A,C},{B,D}

{A,D},{B,C}

And for example, {B,A},{C,D} and {C,D},{A,B} are both the same as the first one (I don't care about the order inside the groups and the order of the groups themselves).

The number of people, the number of groups, and how many people fit in each group are all inputs.

My idea was to list every possible assignation, calculate its score, and keep track of the best one; that is, to brute-force it. As the population can be big, I was thinking of going through them in a random order and returning the best one found when time runs out (probably when the user gets bored or decides the result is good enough). The population can vary from very small (the four listed) to really big (maybe 200+), so just trying random ones without caring about repeats breaks down for the small cases, where brute force is possible (plus I wouldn't know when to stop if I used plain random permutations).

The population is big enough that listing all the assignations to be able to shuffle them doesn't fit into memory. So I need either a method to find all the possible assignations in a random order, or a method to, given an index, generate the corresponding assignation, and use an index array and shuffle that (the second would be better because I can then easily distribute the tasks into multiple servers).


asked Aug 30 '16 23:08

Daniferrito

3 Answers

A simple recursive algorithm for generating these pairings is to pair the first element with each of the remaining elements, and for each of those couplings, recursively generate all the pairings of the remaining elements. For groups, generate all the groups made up of the first element and all the combinations of the remaining elements, then recurse for the remainders.
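As a rough illustration of the recursion described above, here is a minimal Python sketch. The function name `groupings` is made up for this example, and it assumes the number of elements is a multiple of the group size (otherwise it simply yields nothing).

```python
from itertools import combinations

def groupings(elements, group_size):
    """Yield every split of `elements` into unordered groups of `group_size`.

    Hypothetical helper for illustration; assumes len(elements) is a
    multiple of group_size.
    """
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    # The first element always anchors the next group, so no grouping
    # is ever produced twice.
    for partners in combinations(range(len(rest)), group_size - 1):
        group = (first,) + tuple(rest[i] for i in partners)
        chosen = set(partners)
        remaining = [x for i, x in enumerate(rest) if i not in chosen]
        # Recurse on whatever is left after removing this group.
        for tail in groupings(remaining, group_size):
            yield [group] + tail

for g in groupings(list("ABCD"), 2):
    print(g)
```

Because it is a generator, only one grouping is held in memory at a time, but the order is fixed rather than random.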

You can compute how many possible sets of groups there are like this:

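public static int numGroupingCombinations(int n, int groupSize)
{
    if(n % groupSize != 0)
        return 0;   // n must be a multiple of groupSize

    // Note: int overflows quickly as n grows; use long or
    // java.math.BigInteger for anything beyond small populations.
    int count = 1;
    while(n > groupSize)
    {
        // the first remaining element picks groupSize - 1 partners
        count *= nCr(n - 1, groupSize - 1);
        n -= groupSize;
    }
    return count;
}

// n choose r. After k iterations ret equals C(n, k), so
// ret * (n-k) is always divisible by (k+1) and the division is exact.
public static int nCr(int n, int r)
{
    int ret = 1;
    for (int k = 0; k < r; k++) {
        ret = ret * (n-k) / (k+1);
    }
    return ret;
}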

Quoting the question: "So I need either a method to find all the possible assignations in a random order, or a method to, given an index, generate the corresponding assignation, and use an index array and shuffle that (the second would be better because I can then easily distribute the tasks into multiple servers)."

To generate a grouping from an index, choose the combination of items to group with the first element by taking the index modulo the number of possible combinations, and generate the combination from the result using this algorithm (implemented as getKthCombination below). Then divide the index by that same number and recursively generate the rest of the set.

public static void generateGrouping(String[] elements, int groupSize, int start, int index)
{
    if(elements.length % groupSize != 0)
        return;

    int remainingSize = elements.length - start;
    if(remainingSize == 0)
    {
        // output the elements:
        for(int i = 0; i < elements.length; i += groupSize)
        {
            System.out.print("[");
            for(int j = 0; j < groupSize; j++)
                System.out.print(((j==0)?"":",")+elements[i+j]);
            System.out.print("]");
        }
        System.out.println("");
        return; 
    }

    int combinations = nCr(remainingSize - 1, groupSize - 1);

    // decide which combination of remaining elements to pair the first element with:
    int[] combination = getKthCombination(remainingSize - 1, groupSize - 1, index % combinations);

    // swap elements into place
    for(int i = 0; i < groupSize - 1; i++)
    {
        String temp = elements[start + 1 + i];
        elements[start + 1 + i] = elements[start + 1 + combination[i]];
        elements[start + 1 + combination[i]] = temp;
    }

    generateGrouping(elements, groupSize, start + groupSize, index / combinations);

    // swap them back:
    for(int i = groupSize - 2; i >= 0; i--)
    {
        String temp = elements[start + 1 + i];
        elements[start + 1 + i] = elements[start + 1 + combination[i]];
        elements[start + 1 + combination[i]] = temp;
    }
}

public static void getKthCombination(int n, int r, int k, int[] c, int start, int offset)
{
    if(r == 0)
        return;
    if(r == n)
    {
        for(int i = 0; i < r; i++)
            c[start + i] = i + offset;
        return;
    }
    int count = nCr(n - 1, r - 1);
    if(k < count)
    {
        c[start] = offset;
        getKthCombination(n-1, r-1, k, c, start + 1, offset + 1);
        return;
    }
    getKthCombination(n-1, r, k-count, c, start, offset + 1);
}

public static int[] getKthCombination(int n, int r, int k)
{
    int[] c = new int[r];
    getKthCombination(n, r, k, c, 0, 0);

    return c;
}


The start parameter is just how far along the list you are, so pass zero when calling the function at the top level. The function could easily be rewritten to be iterative. You could also pass an array of indices instead of an array of objects that you want to group, if swapping the objects is a large overhead.
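For comparison, here is a hedged, iterative Python sketch of the same index-to-grouping idea. It is not a port of the Java above: it removes chosen elements instead of swapping them, so a given index may map to a different grouping. The names `kth_combination`, `num_groupings` and `grouping_from_index` are chosen for this example. Python integers are arbitrary precision, so the index does not overflow for large populations.

```python
from math import comb

def kth_combination(n, r, k):
    """Return the k-th (0-based, lexicographic) r-subset of range(n)."""
    result, offset = [], 0
    while r > 0:
        count = comb(n - 1, r - 1)  # subsets that contain `offset`
        if k < count:
            result.append(offset)
            r -= 1
        else:
            k -= count
        n -= 1
        offset += 1
    return result

def num_groupings(n, group_size):
    """Number of distinct groupings of n elements (n a multiple of group_size)."""
    total = 1
    while n > 0:
        total *= comb(n - 1, group_size - 1)
        n -= group_size
    return total

def grouping_from_index(elements, group_size, index):
    """Map an integer in range(num_groupings(...)) to one grouping."""
    remaining = list(elements)
    if len(remaining) % group_size != 0:
        raise ValueError("population size must be a multiple of group_size")
    groups = []
    while remaining:
        first = remaining.pop(0)
        count = comb(len(remaining), group_size - 1)
        # Low "digit" selects this group's partners; the rest is passed on.
        index, k = divmod(index, count)
        picks = kth_combination(len(remaining), group_size - 1, k)
        groups.append((first,) + tuple(remaining[i] for i in picks))
        for i in reversed(picks):
            del remaining[i]
    return groups

for i in range(num_groupings(4, 2)):
    print(i, grouping_from_index("ABCD", 2, i))
```

Since each index is decoded independently, ranges of indices could be handed to different workers.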


answered Nov 15 '22 08:11

samgak

What you call "assignations" are partitions with a fixed number of equally sized parts. Well, mostly. You didn't specify what should happen if (# of groups) * (size of each group) is less than or greater than your population size.

Generating every possible partition in a non-specific order is not too difficult, but it is only good for small populations or for filtering and finding any partition that matches some independent criteria. If you need to optimize or minimize something, you'll end up looking at the whole set of partitions, which may not be feasible.

Based on the description of your actual problem, you want to read up on local search and optimization algorithms, of which simulated annealing is one technique.
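To make the local-search suggestion a little more concrete, here is a minimal simulated-annealing sketch. Everything in it is an assumption made for illustration: the function name `anneal`, the swap-two-people move, the geometric cooling schedule, and the toy `same_parity_score` function standing in for your real scoring rules. Higher scores are treated as better.

```python
import math
import random

def anneal(population, group_size, score, steps=10_000,
           t_start=1.0, t_end=0.01, seed=None):
    """Search for a high-scoring grouping by randomly swapping two people."""
    rng = random.Random(seed)
    order = list(population)
    rng.shuffle(order)

    def groups_of(seq):
        # Consecutive slices of the ordering form the groups.
        return [tuple(seq[i:i + group_size])
                for i in range(0, len(seq), group_size)]

    current = score(groups_of(order))
    best, best_order = current, order[:]
    for step in range(steps):
        # Temperature decays geometrically from t_start to t_end.
        t = t_start * (t_end / t_start) ** (step / steps)
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]
        candidate = score(groups_of(order))
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature drops.
        if candidate >= current or rng.random() < math.exp((candidate - current) / t):
            current = candidate
            if current > best:
                best, best_order = current, order[:]
        else:
            order[i], order[j] = order[j], order[i]  # undo the swap
    return groups_of(best_order), best

def same_parity_score(groups):
    """Toy score (hypothetical): count groups whose members share a parity."""
    return sum(1 for g in groups if len({x % 2 for x in g}) == 1)

groups, best = anneal(range(8), 2, same_parity_score, steps=2000, seed=1)
print(best, groups)
```

Unlike exhaustive enumeration, this gives no guarantee of finding the optimum, but it can be stopped at any time and still return the best grouping seen so far.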

With all that said, here is a simple recursive Python function that generates fixed-length partitions with equal-sized parts in no particular order. It is a specialization of my answer to a similar partition problem, and that answer is itself a specialization of this answer. It should be fairly straightforward to translate into JavaScript (with ES6 generators).

def special_partitions(population, num_groups, group_size):
    """Yields all partitions with a fixed number of equally sized parts.

    Each yielded partition is a list of length `num_groups`,
    and each part a tuple of length `group_size`.
    """
    assert len(population) == num_groups * group_size
    groups = []  # a list of lists, currently empty 

    def assign(i):
        if i >= len(population): 
            yield list(map(tuple, groups))
        else:
            # try to assign to an existing group, if possible
            for group in groups:
                if len(group) < group_size:
                    group.append(population[i])
                    yield from assign(i + 1)
                    group.pop()

            # assign to an entirely new group, if possible
            if len(groups) < num_groups:
                groups.append([population[i]])
                yield from assign(i + 1)
                groups.pop()

    yield from assign(0)

for partition in special_partitions('ABCD', 2, 2):
    print(partition)

print()

for partition in special_partitions('ABCDEF', 2, 3):
    print(partition)

When executed, this prints:

[('A', 'B'), ('C', 'D')]
[('A', 'C'), ('B', 'D')]
[('A', 'D'), ('B', 'C')]

[('A', 'B', 'C'), ('D', 'E', 'F')]
[('A', 'B', 'D'), ('C', 'E', 'F')]
[('A', 'B', 'E'), ('C', 'D', 'F')]
[('A', 'B', 'F'), ('C', 'D', 'E')]
[('A', 'C', 'D'), ('B', 'E', 'F')]
[('A', 'C', 'E'), ('B', 'D', 'F')]
[('A', 'C', 'F'), ('B', 'D', 'E')]
[('A', 'D', 'E'), ('B', 'C', 'F')]
[('A', 'D', 'F'), ('B', 'C', 'E')]
[('A', 'E', 'F'), ('B', 'C', 'D')]

answered Nov 15 '22 09:11

lazy dog

Let's say we have a total of N elements that we want to organize in G groups of E (with G*E = N). Neither the order of the groups nor the order of the elements within groups matters. The end goal is to produce every solution in a random order, knowing that we cannot store every solution at once.

First, let's think about how to produce one solution. Since order doesn't matter, we can normalize any solution by sorting the elements within groups as well as the groups themselves, by their first element.

For instance, if we consider the population {A, B, C, D}, with N = 4, G = 2, E = 2, then the solution {B,D}, {C,A} can be normalized as {A,C}, {B,D}. The elements are sorted within each group (A before C), and the groups are sorted (A before B).
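The normalization step can be sketched in a couple of lines of Python. The function name `normalize` is just for this example; it assumes the elements are comparable and distinct.

```python
def normalize(groups):
    """Sort elements within each group, then sort the groups.

    Because every element appears in exactly one group, sorting the
    groups as tuples is the same as sorting them by first element.
    """
    return sorted(tuple(sorted(group)) for group in groups)

print(normalize([("B", "D"), ("C", "A")]))  # [('A', 'C'), ('B', 'D')]
```

Two solutions describe the same grouping exactly when their normalized forms are equal.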

When the solutions are normalized, the first element of the first group is always the first element of the population. The second element is one of the N-1 remaining, the third element is one of the N-2 remaining, and so on, except these elements must remain sorted. So there are (N-1)!/((N-E)!*(E-1)!) possibilities for the first group.

Similarly, the first element of each subsequent group is fixed: it is the first of the remaining elements once the previous groups have been created. Thus, the number of possibilities for the (n+1)th group (n from 0 to G-1) is (N-nE-1)!/((N-(n+1)E)!*(E-1)!) = ((G-n)E-1)!/(((G-n-1)E)!*(E-1)!).

This gives us one possible way of indexing a solution. The index is not a single integer, but rather an array of G integers. The integer at position n (still from 0 to G-1) is in the range 1 to (N-nE-1)!/((N-nE-E)!*(E-1)!) and represents group n (the "(n+1)th group") of the solution. Such an array is easy to produce randomly and to check for duplicates.
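Here is a small sketch of that indexing scheme, with two assumptions of my own: the integers are 0-based rather than 1-based, and duplicates are detected with a plain `set` of already-seen index tuples. The names `radices` and `random_indices` are made up for this example.

```python
import random
from math import comb

def radices(n, group_size):
    """Number of choices for each group, first group first."""
    return [comb(n - g * group_size - 1, group_size - 1)
            for g in range(n // group_size)]

def random_indices(n, group_size, seed=None):
    """Yield distinct index arrays (as tuples) in random order.

    Stops once every possible index has been produced, which only
    happens in practice for small populations.
    """
    rng = random.Random(seed)
    sizes = radices(n, group_size)
    total = 1
    for s in sizes:
        total *= s
    seen = set()
    while len(seen) < total:
        index = tuple(rng.randrange(s) for s in sizes)
        if index not in seen:  # skip duplicates
            seen.add(index)
            yield index

print(radices(6, 2))
print(list(random_indices(4, 2, seed=0)))
```

The `seen` set grows with every index produced, so for large populations this relies on stopping long before the space is exhausted. For small populations the rejection loop slows down near the end, where shuffling a full list of indices would be cheaper.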

The last thing we need is a way to produce a group from its corresponding integer; call that integer k. For group n, we need to choose E-1 elements from the N-nE-1 remaining. At this point, you can imagine listing every combination and choosing the k-th one. Of course, this can be done without generating every combination: see this question.
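Both approaches can be sketched side by side: the naive one that lists combinations until it reaches the wanted position, and a direct one that skips whole blocks of combinations by counting them. The names `kth_by_listing` and `unrank` are hypothetical, positions are 0-based, and lexicographic order is an assumption chosen to match `itertools.combinations`.

```python
from itertools import combinations, islice
from math import comb

def kth_by_listing(n, r, k):
    """Naive version: walk through the combinations until position k."""
    return next(islice(combinations(range(n), r), k, None))

def unrank(n, r, k):
    """Direct version: k-th r-subset of range(n) without enumeration."""
    subset, candidate = [], 0
    while r:
        # Number of subsets that include `candidate` given earlier choices.
        with_candidate = comb(n - candidate - 1, r - 1)
        if k < with_candidate:
            subset.append(candidate)
            r -= 1
        else:
            k -= with_candidate  # skip all subsets containing `candidate`
        candidate += 1
    return tuple(subset)

print(kth_by_listing(6, 3, 7), unrank(6, 3, 7))
```

The direct version does a number of steps proportional to n rather than to the number of combinations, which is what makes it usable for populations in the hundreds.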

Out of curiosity, the total number of solutions is (GE)!/(G!*(E!)^G).
In your example, it is (2*2)!/(2!*(2!)^2) = 3.
For N = 200 and E = 2, there are 6.7e186 solutions.
For N = 200 and E = 5, there are 6.6e243 solutions (the maximum I found for 200 elements).
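These figures can be reproduced with exact integer arithmetic. The sketch below uses a made-up function name, `total_solutions`, and simply evaluates the formula above.

```python
from math import factorial

def total_solutions(groups, size):
    """(G*E)! / (G! * (E!)^G), computed exactly with Python integers."""
    return factorial(groups * size) // (factorial(groups) * factorial(size) ** groups)

print(total_solutions(2, 2))             # 3
print(f"{total_solutions(100, 2):.1e}")  # 6.7e+186
print(f"{total_solutions(40, 5):.1e}")   # 6.6e+243

# Group sizes that divide 200, to compare the counts side by side.
for size in (1, 2, 4, 5, 8, 10, 20, 25, 40, 50, 100, 200):
    print(size, f"{total_solutions(200 // size, size):.1e}")
```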

Additionally, for N = 200 and E > 13, the number of possibilities for the first group is greater than 2^64 (so it cannot be stored in a 64-bit integer), which is problematic for representing an index. But as long as you don't need groups with more than 13 elements, you can use arrays of 64-bit integers as indices.
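The 64-bit threshold is easy to check directly. In this sketch, `first_group_choices` and `fits_in_64_bits` are names invented for the example, and "fits" is taken to mean strictly less than 2^64.

```python
from math import comb

def first_group_choices(n, group_size):
    """Number of possibilities for the first group: C(n-1, group_size-1)."""
    return comb(n - 1, group_size - 1)

def fits_in_64_bits(n, group_size):
    """True if the first-group count fits in an unsigned 64-bit integer."""
    return first_group_choices(n, group_size) < 2 ** 64

for size in range(12, 16):
    print(size, first_group_choices(200, size), fits_in_64_bits(200, size))
```

In a language with arbitrary-precision integers, such as Python, the limit does not apply, at the cost of slower arithmetic.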

answered Nov 15 '22 08:11

Nelfeal

Tags: algorithm, random, set, combinatorics, combinations