Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How do I calculate the k nearest numbers to the median?

I have an array of n pairwise different elements and a number k with 1<=k<=n.

Now I am looking for an algorithm calculating the k numbers with the minimal absolute difference to the median of the number array. I need linear complexity (O(n)).

My approach:

I find the median:

  • I sort the number
  • I get the middle element or if the number of elements id even then the average of the two elements in the middle and round.

After that:

  • I take every number and find the absolute distance from the median. These results I save in a different array
  • I sort the newly obtained array.
  • I take the first k elements of the result array and I'm done.

I don't know if my solution is in O(n), also whether I'm right with this idea. Can someone verify that? Can someone show me how to solve it in O(n)?

like image 819
Rat565 Avatar asked Jan 14 '23 12:01

Rat565


1 Answers

You can solve your problem like that:

You can find the median in O(n), w.g. using the O(n) nth_element algorithm.

You loop through all elements substutiting each with a pair: <the absolute difference to the median>, <element's value>. Once more you do nth_element with n = k. after applying this algorithm you are guaranteed to have the k smallest elements in absolute difference first in the new array. You take their indices and DONE!

Your algorithm, on the other hand uses sorting, and this makes it O(nlogn).

EDIT: The requested example:

Let the array be [14, 6, 7, 8, 10, 13, 21, 16, 23].

  • After the step for finding the median it will be reordered to, say: [8, 7, 9, 10, 13, 16, 23, 14, 21], notice that the array is not sorted, but still the median (13) is exactly in the middle.
  • Now let's do the pair substitution that got you confused: we create a new array of pairs: [<abs(14-13), 14>, <abs(6-13), 6>, <abs(7-13), 7>, <abs(8-13), 8>, <abs(10-13), 10>, <abs(13-13), 13>, <abs(21-13), 21>, <abs(16-13), 16>, <abs(23-13), 23>. Thus we obtain the array: [<1, 14>, <7, 6>, <6, 7>, <5, 8>, <3, 10>, <0, 13>, <8, 21>, <3, 16>, <10, 23>
  • If e.g. k is 4 we make once more nth_element(using the first element of each pair for comparisons) and obtain: [<1, 14>, <3, 16>, <0, 13>, <3, 10>, <8, 21>, <7, 6>, <10, 23>, <6, 7>, <5, 8>] so the numbers you search for are the second elements of the first 4 pairs: 14, 16, 13 and 10
like image 97
Boris Strandjev Avatar answered Jan 28 '23 03:01

Boris Strandjev