Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Algorithm to return all combinations of k elements from n

People also ask

How do you generate all possible combinations of one list?

Add a Custom Column to and name it List1. Enter the formula =List1. Expand out the new List1 column and then Close & Load the query to a table. The table will have all the combinations of items from both lists and we saved on making a custom column in List1 and avoided using a merge query altogether!


Art of Computer Programming Volume 4: Fascicle 3 has a ton of these that might fit your particular situation better than how I describe.

Gray Codes

An issue that you will come across is of course memory and pretty quickly, you'll have problems by 20 elements in your set -- 20C3 = 1140. And if you want to iterate over the set it's best to use a modified gray code algorithm so you aren't holding all of them in memory. These generate the next combination from the previous and avoid repetitions. There are many of these for different uses. Do we want to maximize the differences between successive combinations? minimize? et cetera.

Some of the original papers describing gray codes:

  1. Some Hamilton Paths and a Minimal Change Algorithm
  2. Adjacent Interchange Combination Generation Algorithm

Here are some other papers covering the topic:

  1. An Efficient Implementation of the Eades, Hickey, Read Adjacent Interchange Combination Generation Algorithm (PDF, with code in Pascal)
  2. Combination Generators
  3. Survey of Combinatorial Gray Codes (PostScript)
  4. An Algorithm for Gray Codes

Chase's Twiddle (algorithm)

Phillip J Chase, `Algorithm 382: Combinations of M out of N Objects' (1970)

The algorithm in C...

Index of Combinations in Lexicographical Order (Buckles Algorithm 515)

You can also reference a combination by its index (in lexicographical order). Realizing that the index should be some amount of change from right to left based on the index we can construct something that should recover a combination.

So, we have a set {1,2,3,4,5,6}... and we want three elements. Let's say {1,2,3} we can say that the difference between the elements is one and in order and minimal. {1,2,4} has one change and is lexicographically number 2. So the number of 'changes' in the last place accounts for one change in the lexicographical ordering. The second place, with one change {1,3,4} has one change but accounts for more change since it's in the second place (proportional to the number of elements in the original set).

The method I've described is a deconstruction, as it seems, from set to the index, we need to do the reverse – which is much trickier. This is how Buckles solves the problem. I wrote some C to compute them, with minor changes – I used the index of the sets rather than a number range to represent the set, so we are always working from 0...n. Note:

  1. Since combinations are unordered, {1,3,2} = {1,2,3} --we order them to be lexicographical.
  2. This method has an implicit 0 to start the set for the first difference.

Index of Combinations in Lexicographical Order (McCaffrey)

There is another way:, its concept is easier to grasp and program but it's without the optimizations of Buckles. Fortunately, it also does not produce duplicate combinations:

The set x_k...x_1 in N that maximizes i = C(x_1,k) + C(x_2,k-1) + ... + C(x_k,1), where C(n,r) = {n choose r}.

For an example: 27 = C(6,4) + C(5,3) + C(2,2) + C(1,1). So, the 27th lexicographical combination of four things is: {1,2,5,6}, those are the indexes of whatever set you want to look at. Example below (OCaml), requires choose function, left to reader:

(* this will find the [x] combination of a [set] list when taking [k] elements *)
let combination_maccaffery set k x =
    (* maximize function -- maximize a that is aCb              *)
    (* return largest c where c < i and choose(c,i) <= z        *)
    let rec maximize a b x =
        if (choose a b ) <= x then a else maximize (a-1) b x
    in
    let rec iterate n x i = match i with
        | 0 -> []
        | i ->
            let max = maximize n i x in
            max :: iterate n (x - (choose max i)) (i-1)
    in
    if x < 0 then failwith "errors" else
    let idxs =  iterate (List.length set) x k in
    List.map (List.nth set) (List.sort (-) idxs)

A small and simple combinations iterator

The following two algorithms are provided for didactic purposes. They implement an iterator and (a more general) folder overall combinations. They are as fast as possible, having the complexity O(nCk). The memory consumption is bound by k.

We will start with the iterator, which will call a user provided function for each combination

let iter_combs n k f =
  let rec iter v s j =
    if j = k then f v
    else for i = s to n - 1 do iter (i::v) (i+1) (j+1) done in
  iter [] 0 0

A more general version will call the user provided function along with the state variable, starting from the initial state. Since we need to pass the state between different states we won't use the for-loop, but instead, use recursion,

let fold_combs n k f x =
  let rec loop i s c x =
    if i < n then
      loop (i+1) s c @@
      let c = i::c and s = s + 1 and i = i + 1 in
      if s < k then loop i s c x else f c x
    else x in
  loop 0 0 [] x

In C#:

public static IEnumerable<IEnumerable<T>> Combinations<T>(this IEnumerable<T> elements, int k)
{
  return k == 0 ? new[] { new T[0] } :
    elements.SelectMany((e, i) =>
      elements.Skip(i + 1).Combinations(k - 1).Select(c => (new[] {e}).Concat(c)));
}

Usage:

var result = Combinations(new[] { 1, 2, 3, 4, 5 }, 3);

Result:

123
124
125
134
135
145
234
235
245
345

Short java solution:

import java.util.Arrays;

public class Combination {
    public static void main(String[] args){
        String[] arr = {"A","B","C","D","E","F"};
        combinations2(arr, 3, 0, new String[3]);
    }

    static void combinations2(String[] arr, int len, int startPosition, String[] result){
        if (len == 0){
            System.out.println(Arrays.toString(result));
            return;
        }       
        for (int i = startPosition; i <= arr.length-len; i++){
            result[result.length - len] = arr[i];
            combinations2(arr, len-1, i+1, result);
        }
    }       
}

Result will be

[A, B, C]
[A, B, D]
[A, B, E]
[A, B, F]
[A, C, D]
[A, C, E]
[A, C, F]
[A, D, E]
[A, D, F]
[A, E, F]
[B, C, D]
[B, C, E]
[B, C, F]
[B, D, E]
[B, D, F]
[B, E, F]
[C, D, E]
[C, D, F]
[C, E, F]
[D, E, F]

May I present my recursive Python solution to this problem?

def choose_iter(elements, length):
    for i in xrange(len(elements)):
        if length == 1:
            yield (elements[i],)
        else:
            for next in choose_iter(elements[i+1:len(elements)], length-1):
                yield (elements[i],) + next
def choose(l, k):
    return list(choose_iter(l, k))

Example usage:

>>> len(list(choose_iter("abcdefgh",3)))
56

I like it for its simplicity.


Lets say your array of letters looks like this: "ABCDEFGH". You have three indices (i, j, k) indicating which letters you are going to use for the current word, You start with:

A B C D E F G H
^ ^ ^
i j k

First you vary k, so the next step looks like that:

A B C D E F G H
^ ^   ^
i j   k

If you reached the end you go on and vary j and then k again.

A B C D E F G H
^   ^ ^
i   j k

A B C D E F G H
^   ^   ^
i   j   k

Once you j reached G you start also to vary i.

A B C D E F G H
  ^ ^ ^
  i j k

A B C D E F G H
  ^ ^   ^
  i j   k
...

Written in code this look something like that

void print_combinations(const char *string)
{
    int i, j, k;
    int len = strlen(string);

    for (i = 0; i < len - 2; i++)
    {
        for (j = i + 1; j < len - 1; j++)
        {
            for (k = j + 1; k < len; k++)
                printf("%c%c%c\n", string[i], string[j], string[k]);
        }
    }
}

The following recursive algorithm picks all of the k-element combinations from an ordered set:

  • choose the first element i of your combination
  • combine i with each of the combinations of k-1 elements chosen recursively from the set of elements larger than i.

Iterate the above for each i in the set.

It is essential that you pick the rest of the elements as larger than i, to avoid repetition. This way [3,5] will be picked only once, as [3] combined with [5], instead of twice (the condition eliminates [5] + [3]). Without this condition you get variations instead of combinations.


Short example in Python:

def comb(sofar, rest, n):
    if n == 0:
        print sofar
    else:
        for i in range(len(rest)):
            comb(sofar + rest[i], rest[i+1:], n-1)

>>> comb("", "abcde", 3)
abc
abd
abe
acd
ace
ade
bcd
bce
bde
cde

For explanation, the recursive method is described with the following example:

Example: A B C D E
All combinations of 3 would be:

  • A with all combinations of 2 from the rest (B C D E)
  • B with all combinations of 2 from the rest (C D E)
  • C with all combinations of 2 from the rest (D E)