Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find out which combinations of numbers in a set add up to a given total

Tags:

I've been tasked with helping some accountants solve a common problem they have - given a list of transactions and a total deposit, which transactions are part of the deposit? For example, say I have this list of numbers:

1.00 2.50 3.75 8.00 

And I know that my total deposit is 10.50, I can easily see that it's made up of the 8.00 and 2.50 transaction. However, given a hundred transactions and a deposit in the millions, it quickly becomes much more difficult.

In testing a brute force solution (which takes way too long to be practical), I had two questions:

  1. With a list of about 60 numbers, it seems to find a dozen or more combinations for any total that's reasonable. I was expecting a single combination to satisfy my total, or maybe a few possibilities, but there always seem to be a ton of combinations. Is there a math principle that describes why this is? It seems that given a collection of random numbers of even a medium size, you can find a multiple combination that adds up to just about any total you want.

  2. I built a brute force solution for the problem, but it's clearly O(n!), and quickly grows out of control. Aside from the obvious shortcuts (exclude numbers larger than the total themselves), is there a way to shorten the time to calculate this?

Details on my current (super-slow) solution:

The list of detail amounts is sorted largest to smallest, and then the following process runs recursively:

  • Take the next item in the list and see if adding it to your running total makes your total match the target. If it does, set aside the current chain as a match. If it falls short of your target, add it to your running total, remove it from the list of detail amounts, and then call this process again

This way it excludes the larger numbers quickly, cutting the list down to only the numbers it needs to consider. However, it's still n! and larger lists never seem to finish, so I'm interested in any shortcuts I might be able to take to speed this up - I suspect that even cutting 1 number out of the list would cut the calculation time in half.

Thanks for your help!

like image 286
SqlRyan Avatar asked Oct 21 '10 16:10

SqlRyan


People also ask

Can Excel find a combinations that equal given sum?

Find cells combination that equal a given sum with Solver Add-in. If you are confused with above method, Excel contains a Solver Add-in feature, by using this add-in, you can also identify the numbers which total amount equals a given value.

How do you find the sum of all possible combinations?

The sum of all possible combinations of n distinct things is 2 n. C0 + nC1 + nC2 + . . . + nC n = 2 n.


2 Answers

This special case of the Knapsack problem is called Subset Sum.

like image 157
Falk Hüffner Avatar answered Oct 11 '22 14:10

Falk Hüffner


C# version

setup test:

using System; using System.Collections.Generic;  public class Program {     public static void Main(string[] args)     {     // subtotal list     List<double> totals = new List<double>(new double[] { 1, -1, 18, 23, 3.50, 8, 70, 99.50, 87, 22, 4, 4, 100.50, 120, 27, 101.50, 100.50 });      // get matches     List<double[]> results = Knapsack.MatchTotal(100.50, totals);      // print results     foreach (var result in results)     {         Console.WriteLine(string.Join(",", result));     }      Console.WriteLine("Done.");     Console.ReadKey();     } } 

code:

using System.Collections.Generic; using System.Linq;  public class Knapsack {     internal static List<double[]> MatchTotal(double theTotal, List<double> subTotals)     {     List<double[]> results = new List<double[]>();      while (subTotals.Contains(theTotal))     {         results.Add(new double[1] { theTotal });         subTotals.Remove(theTotal);     }      // if no subtotals were passed     // or all matched the Total     // return     if (subTotals.Count == 0)         return results;      subTotals.Sort();      double mostNegativeNumber = subTotals[0];     if (mostNegativeNumber > 0)         mostNegativeNumber = 0;      // if there aren't any negative values     // we can remove any values bigger than the total     if (mostNegativeNumber == 0)         subTotals.RemoveAll(d => d > theTotal);      // if there aren't any negative values     // and sum is less than the total no need to look further     if (mostNegativeNumber == 0 && subTotals.Sum() < theTotal)         return results;      // get the combinations for the remaining subTotals     // skip 1 since we already removed subTotals that match     for (int choose = 2; choose <= subTotals.Count; choose++)     {         // get combinations for each length         IEnumerable<IEnumerable<double>> combos = Combination.Combinations(subTotals.AsEnumerable(), choose);          // add combinations where the sum mathces the total to the result list         results.AddRange(from combo in combos                  where combo.Sum() == theTotal                  select combo.ToArray());     }      return results;     } }  public static class Combination {     public static IEnumerable<IEnumerable<T>> Combinations<T>(this IEnumerable<T> elements, int choose)     {     return choose == 0 ?                        // if choose = 0         new[] { new T[0] } :                    // return empty Type array         elements.SelectMany((element, i) =>     // else recursively iterate over array to create combinations         elements.Skip(i + 1).Combinations(choose - 1).Select(combo => (new[] { element }).Concat(combo)));     } } 

results:

100.5 100.5 -1,101.5 1,99.5 3.5,27,70 3.5,4,23,70 3.5,4,23,70 -1,1,3.5,27,70 1,3.5,4,22,70 1,3.5,4,22,70 1,3.5,8,18,70 -1,1,3.5,4,23,70 -1,1,3.5,4,23,70 1,3.5,4,4,18,70 -1,3.5,8,18,22,23,27 -1,3.5,4,4,18,22,23,27 Done. 

If subTotals are repeated, there will appear to be duplicate results (the desired effect). In reality, you will probably want to use the subTotal Tupled with some ID, so you can relate it back to your data.

like image 25
Dan Avatar answered Oct 11 '22 14:10

Dan