
Subset-Sum in Linear Time

This was a question on our Algorithms final exam. It's verbatim because the prof let us take a copy of the exam home.

  1. (20 points) Let I = {r1,r2,...,rn} be a set of n arbitrary positive integers and the values in I are distinct. I is not given in any sorted order. Suppose we want to find a subset I' of I such that the total sum of all elements in I' is exactly 100*ceil(n^.5) (each element of I can appear at most once in I'). Present an O(n) time algorithm for solving this problem.

As far as I can tell, it's basically the subset-sum problem, which is a special case of the knapsack problem ... both of which are NP-complete and so, in theory, impossible to solve in linear time?

So ... was this a trick question?


This SO post basically explains that a pseudo-polynomial (here linear) time solution is possible if the weights are bounded. But in the exam problem the weights aren't bounded, and either way, given the overall difficulty of the exam, I'd be shocked if the prof expected us to know or come up with an obscure dynamic programming algorithm.

cjhin asked Dec 17 '13 05:12




1 Answer

There are two things that make this problem possible:

  1. The input can be truncated to size O(sqrt(n)). All inputs are positive, so you can discard any number greater than the target 100*ceil(sqrt(n)), and all inputs are distinct, so at most 100*ceil(sqrt(n)) inputs remain that matter.
  2. The playing field has size O(sqrt(n)). Although there are 2^O(sqrt(n)) ways to combine the O(sqrt(n)) inputs that matter, you don't have to care about combinations that either leave the 0..100*ceil(sqrt(n)) range or redundantly hit a sum you can already reach.
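
To make the two observations concrete, here is a rough accounting of the work, assuming a table-based approach with one cell per reachable sum. The symbols T (the target) and m (the number of inputs that survive the truncation) are my own notation, not part of the exam question.

```latex
\begin{align*}
T &= 100\lceil \sqrt{n}\,\rceil \le 100(\sqrt{n} + 1), \\
m &= \bigl|\{\, r_i \in I : r_i \le T \,\}\bigr| \le T
  && \text{(distinct positive integers no larger than } T\text{)}, \\
\text{total work} &= \underbrace{O(n)}_{\text{one pass over } I}
  + \underbrace{m\,(T+1)}_{\text{table updates}}
  \le O(n) + T(T+1) = O(n).
\end{align*}
```

The last step holds because T(T+1) is at most 100(sqrt(n)+1) * (100(sqrt(n)+1)+1), which grows linearly in n.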

Basically, this problem screams dynamic programming with each input being checked against each part of the 'reached number' space somehow.

The solution ends up being a matter of ensuring numbers don't reach off of themselves (by scanning in the right direction), of only looking at each number once, and of giving ourselves enough information to reconstruct the solution afterwards.
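
One way to state this precisely is as a recurrence over sets of reachable sums. The notation is an assumption of mine for illustration: e_1, ..., e_m are the inputs in the order they are processed, T is the target 100*ceil(sqrt(n)), and R_k is the set of sums reachable using only the first k inputs.

```latex
\begin{align*}
R_0 &= \{0\}, \\
R_k &= R_{k-1} \cup \{\, s + e_k : s \in R_{k-1},\ s + e_k \le T \,\},
  \qquad k = 1, \dots, m.
\end{align*}
```

The target is reachable exactly when T is in R_m. Updating a single table in place from i = T down to i = e_k computes R_k from R_{k-1}, because cell i - e_k has a smaller index and has not been touched yet in pass k. For example, with e_1 = 3 and T = 6, an upward scan would mark 3 and then mark 6 = 3 + 3, using the same input twice; the downward scan looks at 6 first (3 is not yet marked) and only then marks 3.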

Here's some C# code that should solve the problem in the given time:

// requires: using System; using System.Collections.Generic;
int[] FindSubsetToImpliedTarget(int[] inputs) {
    var target = 100*(int)Math.Ceiling(Math.Sqrt(inputs.Length));

    // build up how-X-was-reached table:
    // reached[s] holds the element that first completed the sum s
    var reached = new int?[target+1];
    reached[0] = 0; // the empty set reaches 0
    foreach (var e in inputs) {
        // elements larger than target skip the loop, so they cost O(1) each
        // we go backwards to avoid reaching off of ourselves
        for (var i = target; i >= e; i--) {
            // keep only the first way to reach i; overwriting an entry could
            // make the back-tracking below use the same element twice
            if (!reached[i].HasValue && reached[i-e].HasValue) {
                reached[i] = e;
            }
        }
    }

    // was target even reached?
    if (!reached[target].HasValue) return null;

    // build result by back-tracking via the logged reached values
    var result = new List<int>();
    for (var i = target; i > 0; i -= reached[i].Value) {
        result.Add(reached[i].Value);
    }
    return result.ToArray();
}

I haven't actually tested the above code, so beware typos and off-by-ones.

Craig Gidney answered Oct 06 '22 00:10