I am given two arrays of natural numbers, A and B (both of the same length n), and I need to find the index k that minimizes the sum of A[i] * |B[i] - B[k]| for i = 0 to n-1. It's obviously easy to do in O(n^2): I just calculate the sum for every k between 0 and n-1, but I need a better run-time complexity.
Any ideas? Thanks!
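For reference, this is the O(n^2) approach I mean, as a small Python sketch (the function name is my own):

    def best_k_bruteforce(A, B):
        # O(n^2) baseline: score every candidate k and keep the best one.
        n = len(A)
        best_k, best_score = 0, float("inf")
        for k in range(n):
            score = sum(A[i] * abs(B[i] - B[k]) for i in range(n))
            if score < best_score:
                best_k, best_score = k, score
        return best_k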
You can do this in O(n log n) time by first sorting both arrays by the values in B, and then performing a single scan.
Once the arrays are sorted, B[i] >= B[k] for i >= k and B[i] <= B[k] for i <= k, so the sum can be rewritten as:
sum A[i] * |B[i] - B[k]| =   sum A[i] * (B[i] - B[k])  for i = k..n-1
                           + sum A[i] * (B[k] - B[i])  for i = 0..k-1

                         =   sum A[i] * B[i]           for i = k..n-1
                           - B[k] * sum A[i]           for i = k..n-1
                           + B[k] * sum A[i]           for i = 0..k-1
                           - sum A[i] * B[i]           for i = 0..k-1
You can precompute all of these prefix and suffix sums in O(n) time, which then lets you evaluate the target sum at every position k in O(1) (O(n) overall) and select the value of k that gives the best score.
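A minimal sketch of this approach in Python, assuming the arrays are plain lists (the function and variable names are mine):

    def best_k_prefix_sums(A, B):
        # O(n log n): sort by B, then score every k in O(1) using prefix sums.
        n = len(A)
        order = sorted(range(n), key=lambda i: B[i])   # permutation that sorts B
        a = [A[i] for i in order]
        b = [B[i] for i in order]

        # prefA[j]  = a[0] + ... + a[j-1]
        # prefAB[j] = a[0]*b[0] + ... + a[j-1]*b[j-1]
        prefA, prefAB = [0] * (n + 1), [0] * (n + 1)
        for j in range(n):
            prefA[j + 1] = prefA[j] + a[j]
            prefAB[j + 1] = prefAB[j] + a[j] * b[j]

        best_k, best_score = order[0], None
        for k in range(n):
            # right part: sum a[i]*b[i] - b[k]*sum a[i], over i = k..n-1
            right = (prefAB[n] - prefAB[k]) - b[k] * (prefA[n] - prefA[k])
            # left part: b[k]*sum a[i] - sum a[i]*b[i], over i = 0..k-1
            left = b[k] * prefA[k] - prefAB[k]
            score = right + left
            if best_score is None or score < best_score:
                best_k, best_score = order[k], score   # map back to original index
        return best_k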
I believe I can do this in O(n log n).
First, sort the B array, applying the same permutation to the A array (and remembering the permutation). This is the O(n log n) part. Since we sum over all i, applying the same permutation to the A and B arrays does not change the minimum.
With a sorted B array, the rest of the algorithm is actually O(n).
For each k, define an array Ck[i] = |B[i] - B[k]|
(Note: We will not actually construct Ck... We will just use it as a concept for easier reasoning.)
Observe that the quantity we are trying to minimize (over k) is the sum of A[i] * Ck[i]. Let's go ahead and give that a name:
Define: Sk = Σ A[i] * Ck[i]
Now, for any particular k, what does Ck look like?
Well, Ck[k] = 0, obviously.
More interestingly, since the B array is sorted, we can get rid of the absolute value signs: Ck[i] = B[k] - B[i] for i < k, and Ck[i] = B[i] - B[k] for i > k.
Let's define two more things.
Definition: Tk = Σ A[i] for 0 <= i < k
Definition: Uk = Σ A[i] for k < i < n
(That is, Tk is the sum of the first k elements of A. Uk is the sum of the last n-k-1 elements of A, i.e. everything after index k.)
The key observation: Given Sk, Tk, and Uk, we can compute Sk+1, Tk+1, and Uk+1 in constant time. How?
T and U are easy: Tk+1 = Tk + A[k], and Uk+1 = Uk - A[k+1].
The question is, how do we get from Sk to Sk+1?
Consider what happens to Ck when we go to Ck+1. We simply add B[k+1] - B[k] to every element of C from 0 to k, and we subtract the same amount from every element of C from k+1 to n-1 (prove this). That means we just need to add (Tk + A[k]) * (B[k+1] - B[k]), which is Tk+1 * (B[k+1] - B[k]), and subtract Uk * (B[k+1] - B[k]) to get from Sk to Sk+1.
Algebraically... The first k terms of Sk are just the sum, from 0 to k-1, of A[i] * (B[k] - B[i]).
The first k terms of Sk+1 are the sum, from 0 to k-1, of A[i] * (B[k+1] - B[i]).
The difference between these is the sum, from 0 to k-1, of A[i] * (B[k+1] - B[i]) - A[i] * (B[k] - B[i]). Factor out the A[i] terms and cancel the B[i] terms to get the sum, from 0 to k-1, of A[i] * (B[k+1] - B[k]), which is just Tk * (B[k+1] - B[k]).
Similarly, the last n-k-1 terms of Sk decrease by Uk * (B[k+1] - B[k]). Finally, the term at index k goes from 0 to A[k] * (B[k+1] - B[k]), which is where the extra A[k] in the update comes from.
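As a quick sanity check with made-up numbers (not from the question): take A = [2, 1, 3] and B = [1, 4, 6], already sorted. Directly, S0 = 2*0 + 1*3 + 3*5 = 18 and S1 = 2*3 + 1*0 + 3*2 = 12. The update, with T0 = 0, U0 = 1 + 3 = 4 and B[1] - B[0] = 3, gives S1 = 18 + (T0 + A[0])*3 - U0*3 = 18 + 6 - 12 = 12, as expected.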
Since we can compute S0, T0, and U0 in linear time, and we can go from Sk to Sk+1 in constant time, we can calculate all of the Sk in linear time. So do that, remember the smallest, and you are done.
Use the inverse of the sort permutation to map the winning k back to an index in the original arrays.
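A minimal sketch of the whole procedure in Python, under the definitions above (Tk sums i < k, Uk sums i > k); the function and variable names are mine:

    def best_k_incremental(A, B):
        # Sort by B, remembering the permutation: the O(n log n) part.
        n = len(A)
        order = sorted(range(n), key=lambda i: B[i])
        a = [A[i] for i in order]
        b = [B[i] for i in order]

        # S0, T0, U0 in linear time.
        s = sum(a[i] * (b[i] - b[0]) for i in range(n))
        t = 0                # T_k = a[0] + ... + a[k-1]
        u = sum(a) - a[0]    # U_k = a[k+1] + ... + a[n-1]
        best_k, best_score = 0, s

        # Go from S_k to S_{k+1} in constant time: O(n) overall.
        for k in range(n - 1):
            d = b[k + 1] - b[k]
            s += (t + a[k]) * d - u * d   # note the extra a[k] term
            t += a[k]
            u -= a[k + 1]
            if s < best_score:
                best_k, best_score = k + 1, s

        # Map the winning position back to an index in the original arrays.
        return order[best_k]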