
Sum-subset with a fixed subset size

The sum-subset problem states:

Given a set of integers, is there a non-empty subset whose sum is zero?

This problem is NP-complete in general. I'm curious if the complexity of this slight variant is known:

Given a set of integers, is there a subset of size k whose sum is zero?

For example, if k = 1, you can do a binary search to find the answer in O(log n), assuming the input is already sorted (otherwise a linear scan takes O(n)). If k = 2, then you can get it down to O(n log n) (e.g. see "Find a pair of elements from an array whose sum equals a given number"). If k = 3, then you can do O(n^2) (e.g. see "Finding three elements in an array whose sum is closest to a given number").
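As a minimal sketch of the k = 2 and k = 3 bounds mentioned above (the function names `has_zero_pair` and `has_zero_triple` are made up for illustration, and the input is assumed to be a list of distinct integers): sort once, then walk two pointers inward from both ends.

```python
def has_zero_pair(nums):
    """k = 2: O(n log n) for the sort, then an O(n) two-pointer scan."""
    a = sorted(nums)
    lo, hi = 0, len(a) - 1
    while lo < hi:
        s = a[lo] + a[hi]
        if s == 0:
            return True
        if s < 0:
            lo += 1   # sum too small: move to a larger low element
        else:
            hi -= 1   # sum too large: move to a smaller high element
    return False


def has_zero_triple(nums):
    """k = 3: fix one element, run the two-pointer scan on the rest, O(n^2)."""
    a = sorted(nums)
    n = len(a)
    for i in range(n - 2):
        lo, hi = i + 1, n - 1
        while lo < hi:
            s = a[i] + a[lo] + a[hi]
            if s == 0:
                return True
            if s < 0:
                lo += 1
            else:
                hi -= 1
    return False
```

For instance, `has_zero_pair([3, -1, 4, -3])` finds 3 + (-3), and `has_zero_triple([-5, 2, 3, 10])` finds -5 + 2 + 3.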

Is there a known bound that can be placed on this problem as a function of k?

As motivation, I was thinking about the question "How do you partition an array into 2 parts such that the two parts have equal average?" and trying to determine if it is actually NP-complete. The answer lies in whether or not there is a formula as described above.

Barring a general solution, I'd be very interested in knowing an optimal bound for k=4.

asked Jan 18 '12 19:01 by PengOne


2 Answers

For k=4, space complexity O(n), time complexity O(n^2 log n)

Sort the array. Starting from the 2 smallest and the 2 largest elements, generate all "lesser" sums of 2 elements (a[i] + a[j]) in non-decreasing order and all "greater" sums of 2 elements (a[k] + a[l]) in non-increasing order (here k and l are array indices, not the subset size). Move to the next lesser sum if the total is less than zero, move to the next greater sum if the total is greater than zero, and stop when the total is zero (success) or a[i] + a[j] > a[k] + a[l] (failure).

The trick is to iterate through all the indexes i and j in such a way that (a[i] + a[j]) never decreases, and through k and l in such a way that (a[k] + a[l]) never increases. A priority queue helps to do this:

  1. Put key = (a[i] + a[j]), value = (i = 0, j = 1) into the priority queue.
  2. Pop (sum, i, j) from the priority queue.
  3. Use sum in the above algorithm.
  4. Put (a[i+1] + a[j], i+1, j) and (a[i] + a[j+1], i, j+1) into the priority queue, but only if these index pairs were not already used. To keep track of used pairs, maintain an array holding the maximal used 'j' for each 'i'. It is enough to consider only values of 'j' that are greater than 'i'.
  5. Continue from step 2.
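The enumeration in steps 1 to 5 can be sketched with Python's `heapq`. Note that this is a simplified variant and not the exact push rule of step 4: instead of starting from (0, 1) and tracking the maximal used j per i, it seeds one heap entry per i and only ever advances j, which keeps the heap at no more than n - 1 entries. The name `pair_sums_ascending` is hypothetical, and the input is assumed to be sorted already.

```python
import heapq


def pair_sums_ascending(a):
    """Yield (a[i] + a[j], i, j) for all i < j, in non-decreasing order of sum.

    Assumes `a` is sorted ascending. The heap holds at most one entry
    per index i, so extra space is O(n); each pop/push costs O(log n).
    """
    n = len(a)
    # Seed: for each i, the smallest sum in "row" i is a[i] + a[i + 1].
    heap = [(a[i] + a[i + 1], i, i + 1) for i in range(n - 1)]
    heapq.heapify(heap)
    while heap:
        s, i, j = heapq.heappop(heap)
        yield s, i, j
        if j + 1 < n:
            # Next sum in the same row; it is >= s because a is sorted.
            heapq.heappush(heap, (a[i] + a[j + 1], i, j + 1))
```

The non-increasing stream of greater sums can be obtained symmetrically, for example by running the same generator on the negated, re-sorted array and negating the results.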

For k>4

If space complexity is limited to O(n), I cannot find anything better than using brute force for k-4 values and the above algorithm for the remaining 4 values. Time complexity O(n^(k-2) log n).

For very large k integer linear programming may give some improvement.

Update

If n is very large (on the same order as the maximum integer value), it is possible to implement an O(1) priority queue, improving the complexities to O(n^2) and O(n^(k-2)).

If n >= k * INT_MAX, a different algorithm with O(n) space complexity is possible. Precalculate a bitset for all possible sums of k/2 values, and use it to check sums of the other k/2 values. Time complexity is O(n^ceil(k/2)).

answered Nov 13 '22 14:11 by Evgeny Kluev


The problem of determining whether 0 in W + X + Y + Z = {w + x + y + z | w in W, x in X, y in Y, z in Z} is basically the same except for not having annoying degenerate cases (i.e., the problems are inter-reducible with minimal resources).

This problem (and thus the original for k = 4) has an O(n^2 log n)-time, O(n)-space algorithm. The O(n log n)-time algorithm for k = 2 (to determine whether 0 in A + B) accesses A in sorted order and B in reverse sorted order. Thus all we need is an O(n)-space iterator for A = W + X, which can be reused symmetrically for B = Y + Z. Let W = {w_1, ..., w_n} in sorted order. For all x in X, insert a key-value item (w_1 + x, (1, x)) into a priority queue. Repeatedly remove the min element (w_i + x, (i, x)) and, if i < n, insert (w_{i+1} + x, (i+1, x)).
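A minimal Python sketch of this four-set version follows, assuming W, X, Y, Z are plain lists of integers. The names `ascending_pair_sums` and `zero_in_sumset` are made up for illustration. Rather than writing a second, reverse-order iterator for Y + Z, the sketch reuses the ascending iterator on the negated sets, which is one possible way to get the symmetry described above. It does not handle the distinct-index requirement of the original single-set problem.

```python
import heapq


def ascending_pair_sums(W, X):
    """Yield every w + x (w in W, x in X) in non-decreasing order.

    The heap holds at most one entry per x, so extra space is O(len(X)).
    """
    W = sorted(W)
    if not W:
        return
    heap = [(W[0] + x, 0, x) for x in X]
    heapq.heapify(heap)
    while heap:
        s, i, x = heapq.heappop(heap)
        yield s
        if i + 1 < len(W):
            heapq.heappush(heap, (W[i + 1] + x, i + 1, x))


def zero_in_sumset(W, X, Y, Z):
    """Return True if some w + x + y + z equals 0."""
    lo = ascending_pair_sums(W, X)
    # Ascending values of -(y + z) correspond to descending y + z.
    hi = ascending_pair_sums([-y for y in Y], [-z for z in Z])
    a = next(lo, None)   # current w + x
    b = next(hi, None)   # current -(y + z)
    while a is not None and b is not None:
        if a == b:       # w + x == -(y + z), so the total is zero
            return True
        if a < b:        # total is negative: take a larger w + x
            a = next(lo, None)
        else:            # total is positive: take a smaller y + z
            b = next(hi, None)
    return False
```

With n elements per set, each iterator produces n^2 sums at O(log n) per step, which matches the stated O(n^2 log n) time and O(n) space.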

answered Nov 13 '22 12:11 by Gina