
How to understand the following C# LINQ code that implements the algorithm to return all combinations of k elements from n

Can anyone elaborate on the details of this code, or even give a non-LINQ version of this algorithm?

public static IEnumerable<IEnumerable<T>> Combinations<T>
    (this IEnumerable<T> elements, int k)
{
   return k == 0 ? new[] { new T[0] }
                 : elements.SelectMany(
                       (e, i) =>
                         elements
                         .Skip(i + 1)
                         .Combinations(k - 1)
                         .Select(c => (new[] {e}).Concat(c)));
}
xuehui asked May 08 '15 09:05



1 Answer

The best way to understand this code is to read Eric Lippert's excellent series of posts:

  • Producing combinations, part one
  • Producing combinations, part two
  • Producing combinations, part three
  • Producing combinations, part four
  • Producing combinations, part five

Basically, if we have an IEnumerable of 5 items and want all combinations of size 3, we need to produce something like this:

{
                      // 50, 60, 70, 80, 90
    {50, 60, 70},     // T   T   T   F   F
    {50, 60, 80},     // T   T   F   T   F
    {50, 60, 90},     // T   T   F   F   T
    {50, 70, 80},     // T   F   T   T   F
    {50, 70, 90},     // T   F   T   F   T
    {50, 80, 90},     // T   F   F   T   T
    {60, 70, 80},     // F   T   T   T   F
    {60, 70, 90},     // F   T   T   F   T
    {60, 80, 90},     // F   T   F   T   T
    {70, 80, 90}      // F   F   T   T   T
}
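
The T/F columns above can be read as a membership mask: every combination corresponds to one pattern with exactly three T's. Here is a minimal sketch of that reading, assuming fewer than 31 elements so that an int can hold the mask. The names MaskDemo and CombinationsByMask are illustrative and do not come from Eric's posts.

```csharp
using System;
using System.Collections.Generic;

static class MaskDemo
{
    // Yields every k-element selection of items by testing each n-bit mask.
    // Assumes items.Length < 31. Cost is O(2^n), so this is for illustration only.
    public static IEnumerable<List<T>> CombinationsByMask<T>(T[] items, int k)
    {
        int n = items.Length;
        for (int mask = 0; mask < (1 << n); mask++)
        {
            var picked = new List<T>();
            for (int bit = 0; bit < n; bit++)
            {
                // Bit set means "T" in the table: the element is included.
                if ((mask & (1 << bit)) != 0)
                    picked.Add(items[bit]);
            }
            if (picked.Count == k)
                yield return picked;
        }
    }

    static void Main()
    {
        var items = new[] { 50, 60, 70, 80, 90 };
        foreach (var c in CombinationsByMask(items, 3))
            Console.WriteLine("{ " + string.Join(", ", c) + " }");
    }
}
```

This yields the same 10 sets as the table, but in mask order rather than in the order listed, and it visits all 32 masks to find them. The recursive approaches avoid that wasted work.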

Eric's recursive implementation:

// Takes integers n and k, both non-negative.
// Produces all sets of exactly k elements consisting only of 
// integers from 0 through n - 1.
private static IEnumerable<TinySet> Combinations(int n, int k)
{
  // Base case: if k is zero then there can be only one set
  // regardless of the value of n: the empty set is the only set
  // with zero elements.
  if (k == 0)
  { 
    yield return TinySet.Empty;
    yield break;
  }

  // Base case: if n < k then there can be no set of exactly
  // k elements containing values from 0 to n - 1, because sets
  // do not contain repeated elements.

  if (n < k)
    yield break;

  // A set containing k elements where each is an integer from
  // 0 to n - 2 is also a set of k elements where each is an
  // integer from 0 to n - 1, so yield all of those.

  foreach(var r in Combinations(n-1, k))
    yield return r;

  // If we add n - 1 to all the sets of k - 1 elements where each
  // is an integer from 0 to n - 2, then we get a set of k elements
  // where each is an integer from 0 to n - 1.

  foreach(var r in Combinations(n-1, k-1))
    yield return r.Add(n-1);
}
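
TinySet is a type defined in Eric's series and is not shown here. As a rough sketch of the same recurrence, the version below uses a plain int[] as a stand-in for TinySet; the class name RecurrenceDemo and the array-copy step that replaces r.Add are assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

static class RecurrenceDemo
{
    // All k-element sets of integers drawn from 0 .. n - 1.
    public static IEnumerable<int[]> Combinations(int n, int k)
    {
        // Exactly one set has zero elements: the empty set.
        if (k == 0) { yield return new int[0]; yield break; }

        // Not enough distinct values to fill k slots.
        if (n < k) yield break;

        // Sets that do not contain n - 1.
        foreach (var r in Combinations(n - 1, k))
            yield return r;

        // Sets that do contain n - 1: extend each (k - 1)-element set.
        foreach (var r in Combinations(n - 1, k - 1))
        {
            var withLast = new int[r.Length + 1];
            r.CopyTo(withLast, 0);
            withLast[r.Length] = n - 1;
            yield return withLast;
        }
    }

    static void Main()
    {
        // Prints the six 2-element subsets of { 0, 1, 2, 3 }.
        foreach (var set in Combinations(4, 2))
            Console.WriteLine(string.Join(" ", set));
    }
}
```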

In your case, the code works like this:

   return k == 0
     // base case: there is exactly one combination of size 0, the empty one
     ? new[] {new T[0]}
     // for each element e together with its index i
     : elements.SelectMany((e, i) =>
                            elements
     // skip the first i + 1 elements (e and everything before it),
     // since combinations that use them have already been produced
                            .Skip(i + 1)
     // get all the combinations of size k - 1 from the remaining elements
                            .Combinations(k - 1)
     // prepend the current element to each of those combinations
                            .Select(c => (new[] {e}).Concat(c)));
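
Since the question also asks for a non-LINQ version, here is one possible sketch of the same recursion written with plain loops and yield. The start parameter is an assumption introduced here to play the role of Skip(i + 1), and the names NonLinqDemo and tail are illustrative.

```csharp
using System;
using System.Collections.Generic;

static class NonLinqDemo
{
    // Same recursion as the LINQ version, written with loops and yield.
    // 'start' replaces Skip(i + 1): only items at index start or later are used.
    public static IEnumerable<List<T>> Combinations<T>(
        IReadOnlyList<T> items, int k, int start = 0)
    {
        if (k == 0)
        {
            yield return new List<T>();   // the single empty combination
            yield break;
        }

        for (int i = start; i < items.Count; i++)
        {
            // Every combination of size k - 1 taken from the items after i...
            foreach (var tail in Combinations(items, k - 1, i + 1))
            {
                // ...gets items[i] prepended to it.
                var combo = new List<T>(k) { items[i] };
                combo.AddRange(tail);
                yield return combo;
            }
        }
    }

    static void Main()
    {
        var items = new List<int> { 16, 13, 2, 4, 100 };
        foreach (var c in Combinations(items, 2))
            Console.WriteLine("{ " + string.Join(", ", c) + " }");
    }
}
```

Indexing into the list instead of calling Skip also avoids re-enumerating the source sequence at every level of the recursion, which the LINQ version does.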

Written without recursion, the Combinations method is harder to read, so try to understand the recursion first:

Say we have a 5-element IEnumerable: { 16, 13, 2, 4, 100 }, and we need all the combinations of size 2 from it (the total number of resulting sets equals the binomial coefficient "5 choose 2" = 5! / (2! * 3!) = 10).

Your code will produce:

  1. For the element 16 we need all the combinations of size 1, starting from the second position:
  2. For the element 13 we need all the combinations of size 0, starting from the third position
  3. First result: { 16, 13 }
  4. Skip the 13. For the element 2 we need all the combinations of size 0, starting from the fourth position
  5. Second result: { 16, 2 }
  6. Skip the 13 and 2. For the element 4 we need all the combinations of size 0, starting from the fifth position
  7. Third result: { 16, 4 }
  8. Skip the 13, 2 and 4. For the element 100 we need all the combinations of size 0, starting from the sixth position (nothing is left there, but the size-0 case still yields the single empty combination)
  9. Fourth result: { 16, 100 }
  10. ... repeat all of the above starting from 13, 2 and 4:
    { 13, 2 }, { 13, 4 }, { 13, 100 }, { 2, 4 }, { 2, 100 }, { 4, 100 }
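
To check the trace above, the extension method from the question can be run on the same input. This is a small harness sketch; the class name CombinationsDemo is illustrative, and the method body is copied from the question unchanged.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class CombinationsDemo
{
    // The extension method from the question, unchanged.
    public static IEnumerable<IEnumerable<T>> Combinations<T>
        (this IEnumerable<T> elements, int k)
    {
        return k == 0 ? new[] { new T[0] }
                      : elements.SelectMany(
                            (e, i) =>
                              elements
                              .Skip(i + 1)
                              .Combinations(k - 1)
                              .Select(c => (new[] { e }).Concat(c)));
    }

    static void Main()
    {
        var source = new[] { 16, 13, 2, 4, 100 };
        var result = source.Combinations(2).ToList();

        // Prints the pairs in the order traced above, starting with { 16, 13 }.
        foreach (var c in result)
            Console.WriteLine("{ " + string.Join(", ", c) + " }");

        Console.WriteLine(result.Count);   // 10, matching 5! / (2! * 3!)
    }
}
```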

That gives all 10 combinations we need. The overload the code's author uses is Enumerable.SelectMany<TSource, TResult> Method (IEnumerable<TSource>, Func<TSource, Int32, IEnumerable<TResult>>):

selector
Type: System.Func<TSource, Int32, IEnumerable<TResult>>
A transform function to apply to each source element;
the second parameter of the function represents the index of the source element.
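
A small sketch of that indexed overload on its own, outside the recursion, may make the (e, i) lambda easier to read. The class and method names (SelectManyIndexDemo, Pairs) are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class SelectManyIndexDemo
{
    // Pairs each element with every element that comes after it.
    public static IEnumerable<string> Pairs(string[] letters)
    {
        // The lambda receives each element e and its zero-based index i,
        // so Skip(i + 1) leaves only the elements that follow e.
        return letters.SelectMany(
            (e, i) => letters.Skip(i + 1).Select(next => e + next));
    }

    static void Main()
    {
        var pairs = Pairs(new[] { "a", "b", "c" });
        Console.WriteLine(string.Join(", ", pairs));   // ab, ac, bc
    }
}
```

This is the k = 2 case of the combinations method with the recursion unrolled by hand: the index is what lets each element see only the elements after it.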

VMAtm answered Oct 21 '22 10:10