Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Smallest sum of difference between elements in two lists

Say we have two lists of the same length, ls1 and ls2. For example, we have

ls1 = [4, 6]
ls2 = [3, 5]

and for each element in ls1 we have to pair it with one element with one element in ls2, in a manner such that the total (absolute) difference between the element in ls1 and the element in ls2 is minimal. One element can only be matched once. In the example above, the optimal way is to match 4 is ls1 with 3 in ls2, and 5 in ls1 with 6 in ls2, which yields a total difference of

(4 - 3) + (6 - 5) = 2 

I need a program to return this minimum total difference between the elements in these two lists. The length of the lists is arbitrary, and so is the values of the elements in the lists (but they are all positive integers).

I currently know that using permutation to brute force the solution is an option, but what I need is a code that has optimal time and space complexity. I heard of the concept of dynamic programming, but I have no idea of how to implement it in my case. Thanks in advance for replies.

Ps. Here's my current brute force code using permutation, which is not efficient in runtime or memory usage:

from itertools import permutations

def minimum_total_difference(ls1, ls2):
    length = len(ls1)
    possibilities = list(permuations(ls1, length))
    out = 10**10
    for possibility in possibilities:
        out_ = 0
        for _ in range(length):
            diff = abs(possibility[_] - ls2[_])
            out += diff
        if out_ < out:
            out = out_
    return out
like image 734
Jingjie YANG Avatar asked Jan 04 '23 21:01

Jingjie YANG


1 Answers

One can prove that the optimal solution is to sort both lists and match their elements in sorted order.

A sketch of the proof:

  1. Let there be an inversion, that is a is matched to b, c is matched to d and a < c, b > d.

  2. We can "exchange" these elements: a->d, c->b. Now a < c, d < b. One can show that this operation never increases the answer (by considering all possible relative values of a, b, c, d)

  3. Thus, there is always an optimal matching where both lists are sorted.

Here's an efficient one-liner that implements this solution:

sum(abs(x - y) for x, y in zip(sorted(xs), sorted(ys)))
like image 194
kraskevich Avatar answered Jan 14 '23 07:01

kraskevich