Subset sum problem

Tags:

I'm having a problem with counting which is continuation of this question. I am not really a math person so it's really hard for me to figure out this subset sum problem which was suggested as resolution.

I'm having 4 ArrayList in which I hold data: alId, alTransaction, alNumber, alPrice

Type | Transaction | Number | Price
8 | Buy | 95.00000000 | 305.00000000
8 | Buy | 126.00000000 | 305.00000000
8 | Buy | 93.00000000 | 306.00000000
8 | Transfer out | 221.00000000 | 305.00000000
8 | Transfer in | 221.00000000 | 305.00000000
8 | Sell | 93.00000000 | 360.00000000
8 | Sell | 95.00000000 | 360.00000000
8 | Sell | 126.00000000 | 360.00000000
8 | Buy | 276.00000000 | 380.00000000

In the end I'm trying to get what's left for customer and what's left I put into 3 array lists:
- alNewHowMuch (corresponds to alNumber),
- alNewPrice (corresponds to alPrice),
- alNewInID (corrseponds to alID)

Click to copy

        ArrayList alNewHowMuch = new ArrayList();
        ArrayList alNewPrice = new ArrayList();
        ArrayList alNewInID = new ArrayList();
        for (int i = 0; i < alTransaction.Count; i++) {
            string transaction = (string) alTransaction[i];
            string id = (string) alID[i];
            decimal price = (decimal) alPrice[i];
            decimal number = (decimal) alNumber[i];
            switch (transaction) {
                case "Transfer out":
                case "Sell":
                    int index = alNewHowMuch.IndexOf(number);
                    if (index != -1) {
                        alNewHowMuch.RemoveAt(index);
                        alNewPrice.RemoveAt(index);
                        alNewInID.RemoveAt(index);
                    } else {
                        ArrayList alTemp = new ArrayList();
                        decimal sum = 0;
                        for (int j = 0; j < alNewHowMuch.Count; j ++) {
                            string tempid = (string) alNewInID[j];
                            decimal tempPrice = (decimal) alNewPrice[j];
                            decimal tempNumbers = (decimal) alNewHowMuch[j];
                            if (id == tempid && tempPrice == price) {
                                alTemp.Add(j);
                                sum = sum + tempNumbers;
                            }
                        }
                        if (sum == number) {
                            for (int j = alTemp.Count - 1; j >= 0; j --) {
                                int tempIndex = (int) alTemp[j];
                                alNewHowMuch.RemoveAt(tempIndex);
                                alNewPrice.RemoveAt(tempIndex);
                                alNewInID.RemoveAt(tempIndex);
                            }
                        }
                    }
                    break;
                case "Transfer In":
                case "Buy":
                    alNewHowMuch.Add(number);
                    alNewPrice.Add(price);
                    alNewInID.Add(id);
                    break;
            }
        }

Basically I'm adding and removing things from Array depending on Transaction Type, Transaction ID and Numbers. I'm adding numbers to ArrayList like 156, 340 (when it is TransferIn or Buy) etc and then i remove them doing it like 156, 340 (when it's TransferOut, Sell). My solution works for that without a problem. The problem I have is that for some old data employees were entering sum's like 1500 instead of 500+400+100+500. How would I change it so that when there's Sell/TransferOut or Buy/Transfer In and there's no match inside ArrayList it should try to add multiple items from thatArrayList and find elements that combine into aggregate.

Inside my code I tried to resolve that problem with simple summing everything when there's no match (index == 1)

Click to copy

                    int index = alNewHowMuch.IndexOf(number);
                    if (index != -1) {
                        alNewHowMuch.RemoveAt(index);
                        alNewPrice.RemoveAt(index);
                        alNewInID.RemoveAt(index);
                    } else {
                        ArrayList alTemp = new ArrayList();
                        decimal sum = 0;
                        for (int j = 0; j < alNewHowMuch.Count; j ++) {
                            string tempid = (string) alNewInID[j];
                            decimal tempPrice = (decimal) alNewPrice[j];
                            decimal tempNumbers = (decimal) alNewHowMuch[j];
                            if (id == tempid && tempPrice == price) {
                                alTemp.Add(j);
                                sum = sum + tempNumbers;
                            }
                        }
                        if (sum == number) {
                            for (int j = alTemp.Count - 1; j >= 0; j --) {
                                int tempIndex = (int) alTemp[j];
                                alNewHowMuch.RemoveAt(tempIndex);
                                alNewPrice.RemoveAt(tempIndex);
                                alNewInID.RemoveAt(tempIndex);
                            }
                        }
                    }

But it only works if certain conditions are met, and fails for the rest.

Edit: Since some of you were so astonished (and blinded) by my polish variable names i translated all of them to english for simplicity and visiblity. Hopefully this will help me to get some help :-)

465

asked Apr 25 '10 13:04

MadBoy

2 Answers

Here is my algorithm. It runs in O(2^(n/2)) and solves SubsetSum(1000, list-of-1000-ones) in 20 milliseconds. See the comments at the end of IVlad's post.

Click to copy

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Diagnostics;

namespace SubsetSum
{
    class Program
    {
        static void Main(string[] args)
        {
            var ns = new List<int>();
            for (int i = 0; i < 1000; i++) ns.Add(1);
            var s1 = Stopwatch.StartNew();
            bool result = SubsetSum(ns, 1000);
            s1.Stop();
            Console.WriteLine(result);
            Console.WriteLine(s1.Elapsed);
            Console.Read();
        }

        static bool SubsetSum(ist<int> nums, int targetL)
        {
            var left = new List<int> { 0 };
            var right = new List<int> { 0 };
            foreach (var n in nums)
            {
                if (left.Count < right.Count) left = Insert(n, left);
                else right = Insert(n, right);
            }
            int lefti = 0, righti = right.Count - 1;
            while (lefti < left.Count && righti >= 0)
            {
                int s = left[lefti] + right[righti];
                if (s < target) lefti++;
                else if (s > target) righti--;
                else return true;
            }
            return false;
        }

        static List<int> Insert(int num, List<int> nums)
        {
            var result = new List<int>();
            int lefti = 0, left = nums[0]+num;
            for (var righti = 0; righti < nums.Count; righti++)
            {

                int right = nums[righti];
                while (left < right)
                {
                    result.Add(left);
                    left = nums[++lefti] + num;
                }
                if (right != left) result.Add(right);
            }
            while (lefti < nums.Count) result.Add(nums[lefti++] + num);
            return result;
        }
    }
}

And here is an improved version that prunes the sets:

Click to copy

static bool SubsetSum(List<int> nums, int target)
{
    var remainingSum = nums.Sum();
    var left = new List<int> { 0 };
    var right = new List<int> { 0 };
    foreach (var n in nums)
    {
        if (left.Count == 0 || right.Count == 0) return false;
        remainingSum -= n;
        if (left.Count < right.Count) left = Insert(n, left, target - remainingSum - right.Last(), target);
        else right = Insert(n, right, target - remainingSum - left.Last(), target);
    }
    int lefti = 0, righti = right.Count - 1;
    while (lefti < left.Count && righti >= 0)
    {
        int s = left[lefti] + right[righti];
        if (s < target) lefti++;
        else if (s > target) righti--;
        else return true;
    }
    return false;
}

static List<int> Insert(int num, List<int> nums, int min, int max)
{
    var result = new List<int>();
    int lefti = 0, left = nums[0]+num;
    for (var righti = 0; righti < nums.Count; righti++)
    {

        int right = nums[righti];
        while (left < right)
        {
            if (min <= left && left <= max) result.Add(left);
            left = nums[++lefti] + num;
        }
        if (right != left && min <= right && right <= max) result.Add(right);
    }
    while (lefti < nums.Count)
    {
        left = nums[lefti++] + num;
        if (min <= left && left <= max) result.Add(left);
    } 
    return result;
}

This last one solved the problem with 100000 ones in about 5 milliseconds (but this is a best case of the algorithm, with real world data it will be slower).

For your use this algorithm is probably fast enough (and I don't see any obvious improvements). If you enter 10,000 products with a random price between 0 and 20 and your goal is to sum to 500 this is solved in 0.04 seconds on my laptop.

Edit: I just read on Wikipedia that the best known algorithm is O(2^(n/2)*n). This one is O(2^(n/2)). Do I get the Turing Award?

answered Oct 03 '22 14:10

Jules

How you should do this depends on a number important things: how many numbers will you have and how big will they be? Also, as far as I understand, your data can change (add / remove numbers etc.), right?. How often do you need to make these queries?

I'll present two solutions. I suggest you use the second, as I suspect it's better for what you need and it's a lot easier to understand.

Solution 1 - dynamic programming

Let S[i] = true if we can make sum i and false otherwise.

Click to copy

S[0] = true // we can always make sum 0: just don't choose any number
S[i] = false for all i != 0
for each number i in your input
    for s = MaxSum downto i
        if ( S[s - i] == true )
            S[s] = true; // if we can make the sum s - i, we can also make the sum s by adding i to the sum s - i.

To get the actual numbers that make up your sum you should keep another vector P[i] = the last number that was used to make sum i. You would update this accordingly in the if condition above.

The time complexity of this is O(numberOfNumbers * maxSumOfAllNumbers), which is pretty bad, especially since you have to rerun this algorithm whenever your data changes. It's also slow for even one run as long as your numbers can be very big and you can have a lot of them. In fact, "a lot" is misleading. If you have 100 numbers and each number can be as big as 10 000, you will do roughly 100 * 10 000 = 1 000 000 operations each time your data changes.

It's a good solution to know, but not really useful in practice, or at least not in your case I think.

He's some C# for the approach I suggest:

Click to copy

   class Program
      {
        static void Main(string[] args)
        {
            List<int> testList = new List<int>();

            for (int i = 0; i < 1000; ++i)
            {
                testList.Add(1);
            }

            Console.WriteLine(SubsetSum.Find(testList, 1000));

            foreach (int index in SubsetSum.GetLastResult(1000))
            {
                Console.WriteLine(index);
            }
        }
    }

    static class SubsetSum
    {
        private static Dictionary<int, bool> memo;
        private static Dictionary<int, KeyValuePair<int, int>> prev;

        static SubsetSum()
        {
            memo = new Dictionary<int, bool>();
            prev = new Dictionary<int, KeyValuePair<int, int>>();
        }

        public static bool Find(List<int> inputArray, int sum)
        {
            memo.Clear();
            prev.Clear();

            memo[0] = true;
            prev[0] = new KeyValuePair<int,int>(-1, 0);

            for (int i = 0; i < inputArray.Count; ++i)
            {
                int num = inputArray[i];
                for (int s = sum; s >= num; --s)
                {
                    if (memo.ContainsKey(s - num) && memo[s - num] == true)
                    {
                        memo[s] = true;

                        if (!prev.ContainsKey(s))
                        {
                            prev[s] = new KeyValuePair<int,int>(i, num);
                        }
                    }
                }
            }

            return memo.ContainsKey(sum) && memo[sum];
        }

        public static IEnumerable<int> GetLastResult(int sum)
        {
            while (prev[sum].Key != -1)
            {
                yield return prev[sum].Key;
                sum -= prev[sum].Value;
            }
        }
    }

You should do some error checking perhaps, and maybe store the last sum in the class so as not to allow the possibility of calling GetLastResult with a different sum than the sum Find was last called with. Anyway, this is the idea.

Solution 2 - randomized algorithm

Now, this is easier. Keep two lists: usedNums and unusedNums. Also keep a variable usedSum that, at any point in time, contains the sum of all the numbers in the usedNums list.

Whenever you need to insert a number into your set, also add it to one of the two lists (doesn't matter which, but do it randomly so there's a relatively even distribution). Update usedSum accordingly.

Whenever you need to remove a number from your set, find out which of the two lists it's in. You can do this with a linear seach as long as you don't have a lot (this time a lot means over 10 000, maybe even 100 000 on a fast computer and assuming you don't do this operation often and in fast succession. Anyway, the linear search can be optimized if you need it to be.). Once you have found the number, remove it from the list. Update usedSum accordingly.

Whenever you need to find if there are numbers in your set that sum to a number S, use this algorithm:

Click to copy

while S != usedSum
    if S > usedSum // our current usedSum is too small
        move a random number from unusedNums to usedNums and update usedSum
    else // our current usedSum is too big
        move a random number from usedNums to unusedNums and update usedSum

At the end of the algorithm, the list usedNums will give you the numbers whose sum is S.

This algorithm should be good for what you need, I think. It handles changes to the dataset very well and works well for a high number count. It also doesn't depend on how big the numbers are, which is very useful if you have big numbers.

Please post if you have any questions.

answered Oct 03 '22 14:10

IVlad

Related questions
                            
                                Which Audio Codec should I use?
                            
                                AtomicReference in Java - necessary for setting a reference in a thread-safe environment?
                            
                                Detect silent install in .NET Custom Action
                            
                                Generate word combination array in c#
                            
                                ProcessCmdKey - wait for KeyUp?
                            
                                Modifying C# dictionary value
                            
                                how to tell in code if we are running from a testmethod .net test
                            
                                How do I terminate a .NET thread that has had its call stack corrupted?
                            
                                How to get DayNames from language only in .NET
                            
                                reporting services: use a custom assembly with a local (RDLC) report
                            
                                Dynamics of the using keyword
                            
                                Using MVVM, what to do when a control doesn't have a "Command" property?
                            
                                Is there something like SESSION in Windows application?
                            
                                In which cases it makes sense to use factory classes instead of static functions?
                            
                                Wcf exception handling
                            
                                Need to analyze C# code to find unreferenced classes
                            
                                Check if a table exists, and if not, create it
                            
                                How to update the progress bar in runtime using c#
                            
                                Problems with office automation in asp.net. I can use alternatives such as open-office, if I knew how
                            
                                Get groups of 4 elements from name value list using LINQ in C#

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Subset sum problem

Tags:

c#

algorithm

sql-server

.net-3.5

MadBoy

People also ask

2 Answers

Jules

IVlad

Recent Activity

Donate For Us