Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Fast solution to Subset sum

Consider this way of solving the Subset sum problem:

def subset_summing_to_zero (activities):
  subsets = {0: []}
  for (activity, cost) in activities.iteritems():
      old_subsets = subsets
      subsets = {}
      for (prev_sum, subset) in old_subsets.iteritems():
          subsets[prev_sum] = subset
          new_sum = prev_sum + cost
          new_subset = subset + [activity]
          if 0 == new_sum:
              new_subset.sort()
              return new_subset
          else:
              subsets[new_sum] = new_subset
  return []

I have it from here:

http://news.ycombinator.com/item?id=2267392

There is also a comment which says that it is possible to make it "more efficient".

How?

Also, are there any other ways to solve the problem which are at least as fast as the one above?

Edit

I'm interested in any kind of idea which would lead to speed-up. I found:

https://en.wikipedia.org/wiki/Subset_sum_problem#cite_note-Pisinger09-2

which mentions a linear time algorithm. But I don't have the paper, perhaps you, dear people, know how it works? An implementation perhaps? Completely different approach perhaps?

Edit 2

There is now a follow-up:
Fast solution to Subset sum algorithm by Pisinger

like image 654
Ecir Hana Avatar asked Mar 21 '12 17:03

Ecir Hana


People also ask

How do you solve subset sums?

The SUBSET-SUM problem involves determining whether or not a subset from a list of integers can sum to a target value. For example, consider the list of nums = [1, 2, 3, 4] . If the target = 7 , there are two subsets that achieve this sum: {3, 4} and {1, 2, 4} . If target = 11 , there are no solutions.

Is subset sum NP-hard?

SSP can also be regarded as an optimization problem: find a subset whose sum is at most T, and subject to that, as close as possible to T. It is NP-hard, but there are several algorithms that can solve it reasonably quickly in practice.

What is the time complexity of subset sum problem?

Time Complexity: O(N * sum) where N is the size of the array. Space Complexity: O(N * sum) where N is the size of the array.

Can we solve sum of subset problem using dynamic programming?

Subset Sum Problem | DP-25 Given a set of non-negative integers, and a value sum, determine if there is a subset of the given set with sum equal to given sum. Example: Input: set[] = {3, 34, 4, 12, 5, 2}, sum = 9 Output: True There is a subset (4, 5) with sum 9.


1 Answers

I respect the alacrity with which you're trying to solve this problem! Unfortunately, you're trying to solve a problem that's NP-complete, meaning that any further improvement that breaks the polynomial time barrier will prove that P = NP.

The implementation you pulled from Hacker News appears to be consistent with the pseudo-polytime dynamic programming solution, where any additional improvements must, by definition, progress the state of current research into this problem and all of its algorithmic isoforms. In other words: while a constant speedup is possible, you're very unlikely to see an algorithmic improvement to this solution to the problem in the context of this thread.

However, you can use an approximate algorithm if you require a polytime solution with a tolerable degree of error. In pseudocode blatantly stolen from Wikipedia, this would be:

initialize a list S to contain one element 0.
 for each i from 1 to N do
   let T be a list consisting of xi + y, for all y in S
   let U be the union of T and S
   sort U
   make S empty 
   let y be the smallest element of U 
   add y to S 
   for each element z of U in increasing order do
      //trim the list by eliminating numbers close to one another
      //and throw out elements greater than s
     if y + cs/N < z ≤ s, set y = z and add z to S 
 if S contains a number between (1 − c)s and s, output yes, otherwise no

Python implementation, preserving the original terms as closely as possible:

from bisect import bisect

def ssum(X,c,s):
    """ Simple impl. of the polytime approximate subset sum algorithm 
    Returns True if the subset exists within our given error; False otherwise 
    """
    S = [0]
    N = len(X)
    for xi in X:
        T = [xi + y for y in S]
        U = set().union(T,S)
        U = sorted(U) # Coercion to list
        S = []
        y = U[0]
        S.append(y)
        for z in U: 
            if y + (c*s)/N < z and z <= s:
                y = z
                S.append(z)
    if not c: # For zero error, check equivalence
        return S[bisect(S,s)-1] == s
    return bisect(S,(1-c)*s) != bisect(S,s)

... where X is your bag of terms, c is your precision (between 0 and 1), and s is the target sum.

For more details, see the Wikipedia article.

(Additional reference, further reading on CSTheory.SE)

like image 126
MrGomez Avatar answered Sep 28 '22 18:09

MrGomez