Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to figure out the nearest sum of two sub array in 2N array? Consider to find optimal one

Here's my problem.

There's one unsorted array with 2N elements. All these elements are positive integer. Q: How to split this array into two N array and the sum of two array must be most close to each other

One intuitive thought is,

  1. sort this array to a1< a2 < a3< ...< a2N and
  2. split them to two sub array a1 a3 a5 ... a(2N-1) and [a2,a4,...a2N] , then swith two number in each aray, and keep looping util we find the smallest one between the two array.

But in this way, we can't sure that we find the optimal one.

like image 376
Ivan Avatar asked Dec 01 '25 09:12

Ivan


2 Answers

This is the Partition Problem, and it's hard (i.e. NP-complete).

That said, if you need to implement this, then the greedy algorithm you suggest does a decent job. You can do a little better by ensuring that one list isn't always less than the other by taking 1 from the first, 2 from the second, 2 from the first, ..., 1 from the second:

A = [a1, a4, a5, a8, a9, ..., a(n-2), a(n-1)]
B = [a2, a3, a6, a7, ..., a(n-4), a(n-3), an]

This does not always give an optimal solution, but it does well enough for most cases. It also keeps out the bias for "going first".

like image 185
PengOne Avatar answered Dec 02 '25 21:12

PengOne


You should look up the subset sum problem. In your case you are shooting for a subset sum of S/2 where S is the the full set sum. There is a simple dynamic programming algorithm to do it in the case of integers, which is the best known. Unfortunately it's pseudo-polynomial time. This means that run time is a polynomial in the size of the elements. This makes it exponential time in the normal sense, but it works fine if the elements are not too large.

The subset sum dynamic program needs a little modification to enforce the requirement for exactly N elements e[i]. Let Q(i, s, n) be true iff there exists a subset consisting of elements selected from 1 to i summing to s and containing exactly n<=i elements.

Then

Q(i, s, n) = Q(i - 1, s, n) or Q(i - 1, s - e[i], n - 1)

The base case says that using no elements at all, the sum and the required number in the subset must both be zero, otherwise Q is false:

Q(0, 0, 0) = true, otherwise Q(0, _, _) is false.

To get the answer, compute the DP table for Q(2N, k, N), k = ceiling(S/2), ceiling(S/2)-1, ... until you find a true value.

Note that this problem is close enough to subset sum to make it NP hard. This means a true polynomial time algorithm (like your sorting proposal) will be an approximation of optimality. Of course that may well be just fine for realistic purposes.

like image 33
Gene Avatar answered Dec 02 '25 21:12

Gene