Getting the lowest possible sum from numbers' difference

Question

I have to find the lowest possible sum from numbers' difference.

Let's say I have 4 numbers: 1515, 1520, 1500 and 1535. The lowest sum of differences is 30, because 1535 - 1520 = 15 and 1515 - 1500 = 15, and 15 + 15 = 30. If I paired them like this instead: 1520 - 1515 = 5 and 1535 - 1500 = 35, the sum would be 40.

Hope you got it, if not, ask me.

Any ideas how to program this? I found this problem online and tried to translate it from my language into English. It sounds interesting. I can't brute-force it, because it would take ages to run. I don't need code, just ideas on how to program it, or a small fragment of code.

Thanks.

Edit: I didn't post everything... One more addition:

Let's say I have 8 possible numbers, but I have to take only 6 of them to make the smallest sum. For instance, with the numbers 1731, 1572, 2041, 1561, 1682, 1572, 1609, 1731, the smallest sum is 48, but here I have to take only 6 of the 8 numbers.

moinudin · Accepted Answer

Taking the edit into account:

Start by sorting the list. Then use a dynamic programming solution, with state (i, n) representing the minimum sum of n differences when considering only the first i numbers in the sorted sequence. Initial states: dp[*][0] = 0, everything else = infinity. Use two loops: an outer loop over i from 1 to N, and an inner loop over n from 1 to R (R = 3 in the example from your edit: 3 pairs of numbers, which means 6 individual numbers). With seq indexed from 1, the recurrence relation is dp[i][n] = min(dp[i-1][n], dp[i-2][n-1] + seq[i] - seq[i-1]), and the answer is dp[N][R].

You have to handle the boundary cases, which I've ignored (for example i = 1, where dp[i-2] does not exist), but the general idea should work. It runs in O(N log N + NR) time and uses O(NR) space.
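A minimal sketch of the recurrence above in Python, with the boundary cases filled in one possible way. The function name min_pair_sum and the 0-based list indexing are assumptions made for illustration, not part of the answer:

```python
def min_pair_sum(seq, r):
    """Minimum sum of differences over r disjoint pairs drawn from seq."""
    seq = sorted(seq)
    n = len(seq)
    inf = float("inf")
    # dp[i][k]: best sum using k pairs among the first i sorted numbers
    dp = [[inf] * (r + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = 0  # zero pairs always cost nothing
    for i in range(1, n + 1):
        for k in range(1, r + 1):
            dp[i][k] = dp[i - 1][k]  # leave the i-th number unused
            if i >= 2:
                # pair the i-th number with the (i-1)-th (0-based: i-1 and i-2)
                cand = dp[i - 2][k - 1] + seq[i - 1] - seq[i - 2]
                dp[i][k] = min(dp[i][k], cand)
    return dp[n][r]


print(min_pair_sum([1515, 1520, 1500, 1535], 2))  # 30
print(min_pair_sum([1731, 1572, 2041, 1561, 1682, 1572, 1609, 1731], 3))  # 48
```

In this sketch, asking for more pairs than the list can supply returns infinity rather than raising an error; that is a design choice of the sketch, not something the answer specifies.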

ShreevatsaR · Answer

The solution by marcog is a correct, non-recursive, polynomial-time solution to the problem — it's a pretty standard DP problem — but, just for completeness, here's a proof that it works, and actual code for the problem. [@marcog: Feel free to copy any part of this answer into your own if you wish; I'll then delete this.]

Proof

Let the list be x₁, …, x_N. Assume wlog that the list is sorted. We're trying to find K (disjoint) pairs of elements from the list, such that the sum of their differences is minimised.

Claim: An optimal solution always consists of the differences of consecutive elements.
Proof: Suppose you fix the subset of elements whose differences are taken. Then by the proof given by Jonas Kölker, the optimal solution for just this subset consists of differences of consecutive elements of the subset. Now suppose there is a solution corresponding to a subset that does not comprise pairs of consecutive elements of the list, i.e. the solution involves a difference x_j - x_i where j > i+1. Then we can replace x_j with x_{i+1} to get a difference that is no larger, since
x_i ≤ x_{i+1} ≤ x_j ⇒ x_{i+1} - x_i ≤ x_j - x_i.
(Needless to say, if x_{i+1} = x_j, then taking x_{i+1} is indistinguishable from taking x_j.) This proves the claim.
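As a sanity check on the claim (not a substitute for the proof), one could compare an exhaustive search over all disjoint pairings with a search restricted to sorted neighbours on small random inputs. This is only a sketch; all function names here are hypothetical, and both searches are exponential, so they are meant for tiny lists only:

```python
from itertools import combinations
import random


def brute_any_pairs(nums, k):
    """Try every way to pick k disjoint pairs (tiny inputs only)."""
    best = float("inf")

    def go(remaining, pairs_left, total):
        nonlocal best
        if pairs_left == 0:
            best = min(best, total)
            return
        if len(remaining) < 2 * pairs_left:
            return  # not enough numbers left
        first, rest = remaining[0], remaining[1:]
        go(rest, pairs_left, total)  # leave `first` unused
        for j in range(len(rest)):   # or pair it with a later element
            go(rest[:j] + rest[j + 1:], pairs_left - 1,
               total + abs(rest[j] - first))

    go(list(nums), k, 0)
    return best


def brute_consecutive_pairs(nums, k):
    """Only allow pairs (s[i], s[i+1]) of neighbours in sorted order."""
    s = sorted(nums)
    best = float("inf")
    for starts in combinations(range(len(s) - 1), k):
        # start indices must differ by at least 2 so pairs do not overlap
        if all(b - a >= 2 for a, b in zip(starts, starts[1:])):
            best = min(best, sum(s[i + 1] - s[i] for i in starts))
    return best


def check_claim(trials=200, seed=0):
    """Return True if both searches agree on every random trial."""
    rng = random.Random(seed)
    for _ in range(trials):
        nums = [rng.randint(0, 50) for _ in range(rng.randint(2, 7))]
        k = rng.randint(1, len(nums) // 2)
        if brute_any_pairs(nums, k) != brute_consecutive_pairs(nums, k):
            return False
    return True
```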

The rest is just routine dynamic programming stuff: the optimal solution using k pairs from the first n elements either doesn't use the nth element at all (in which case it's just the optimal solution using k pairs from the first n-1), or it uses the nth element, in which case it's the difference x_n - x_{n-1} plus the optimal solution using k-1 pairs from the first n-2.

The whole program runs in time O(N log N + NK), as marcog says. (Sorting + DP.)

Code

Here's a complete program. I was lazy with initializing arrays and wrote Python code using dicts; dict lookups are O(1) on average, so this costs only a small constant factor over using actual arrays.

'''
The minimum possible sum|x_i - x_j| using K pairs (2K numbers) from N numbers
'''
import sys
def ints(): return [int(s) for s in sys.stdin.readline().split()]

N, K = ints()
num = sorted(ints())

best = {}  # best[(k, n)] = minimum sum using k pairs out of num[0..n]
def b(k, n):
    # Stored value if present; otherwise 0 for k == 0 and infinity for k > 0
    if (k, n) in best: return best[(k, n)]
    if k == 0: return 0
    return float('inf')

for n in range(1, N):
    for k in range(1, K + 1):
        best[(k, n)] = min(b(k, n - 1),                            # Not using num[n]
                           b(k - 1, n - 2) + num[n] - num[n - 1])  # Using num[n]

print(b(K, N - 1))  # b() also covers K == 0 and N == 1, where nothing was stored

Test it:

Input
4 2
1515 1520 1500 1535
Output
30

Input
8 3
1731 1572 2041 1561 1682 1572 1609 1731
Output
48

Jonas Kölker · Answer

I assume the general problem is this: given a list of 2n integers, output a list of n pairs, such that the sum of |x - y| over all pairs (x, y) is as small as possible.

In that case, the idea would be:

sort the numbers
emit (numbers[2k], numbers[2k+1]) for k = 0, ..., n - 1.

This works. Proof:

Suppose you have x_1 < x_2 < x_3 < x_4 (possibly with other values between them) and output (x_1, x_3) and (x_2, x_4). Then

|x_4 - x_2| + |x_3 - x_1| = |x_4 - x_3| + |x_3 - x_2| + |x_3 - x_2| + |x_2 - x_1| >= |x_4 - x_3| + |x_2 - x_1|.

In other words, it's always better to output (x_1, x_2) and (x_3, x_4) because you don't redundantly cover the space between x_2 and x_3 twice. By induction, the smallest number of the 2n must be paired with the second smallest number; by induction on the rest of the list, pairing up smallest neighbours is always optimal, so the algorithm sketch I proposed is correct.
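The sort-and-pair idea above can be sketched in a few lines of Python. The function name pair_neighbours is hypothetical, and rejecting an odd-length list is an assumption of this sketch (the answer only considers lists of 2n integers):

```python
def pair_neighbours(numbers):
    """Pair up a list of 2n numbers as neighbours in sorted order."""
    if len(numbers) % 2 != 0:
        raise ValueError("expected an even number of values")
    s = sorted(numbers)
    # (s[0], s[1]), (s[2], s[3]), ...
    return [(s[i], s[i + 1]) for i in range(0, len(s), 2)]


pairs = pair_neighbours([1515, 1520, 1500, 1535])
total = sum(y - x for x, y in pairs)
print(pairs, total)  # [(1500, 1515), (1520, 1535)] 30
```

Note that this covers only the case where every number must be used; it does not handle the question's edit, where 6 of the 8 numbers are chosen.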

Nicolas · Answer

Order the list, then do the difference calculation.

EDIT: hi @hey

You can solve the problem using dynamic programming.

Say you have a list L of N integers and you must form k pairs (with 2*k <= N).

Build a function that finds the smallest difference within a list (it will be faster if the list is sorted) and call it smallest(list l).

Build another one that finds the same for two pairs (can be tricky, but doable) and call it smallest2(list l).

Let's define best(int i, list l) as the function that gives you the best result for i pairs within the list l.

The algorithm goes as follows:

best(1, L) = smallest(L)
best(2, L) = smallest2(L)
for i from 3 to k:
    compute min(
        stored_best(i-2) + smallest2(stored_remainder(i-2)),
        stored_best(i-1) + smallest(stored_remainder(i-1))
    ) and store as best(i)
    store the remainder as well for the chosen solution

Now, the problem is that once you have chosen a pair, the two ints that form its boundaries are reserved and can't be used to form a better solution. But by looking two levels back you can guarantee that you have allowed switching candidates.

(The switching work is done by smallest2)
