Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Getting median out of frequency table (counting sort)

I can't understand the logic behind getMedian method. How exactly median is evaluated, what is the connection between count of elements and sum of elements? Appreciate if someone could explain it's logic.

  public static void main(String[] args) {
            Random r = new Random();
            int[] ar = r.ints(0, 100).limit(9).toArray();
            int k = ar.length;
            int[] count = getCounts(ar);
            double median = getMedian(count, k);
            System.out.println(median);
        }

        private static int[] getCounts(int[] ar) {
            int[] count = new int[100];
            for (int i = 0; i < ar.length; i++) {
                count[ar[i]]++;
            }
            return count;
        }

        private static double getMedian(int[] count, int d) {
            int sum = 0;
            for (int i = 0; i < count.length; i++) {
                sum += count[i];
                if (2 * sum < d)
                    continue;
                else if (2 * sum == d)
                    return (2 * i + 1) / 2.0;
                else
                    return i * 1.0;
            }
            return -1.0;
        }
like image 515
joe.d Avatar asked Aug 26 '17 02:08

joe.d


1 Answers

There is a relation because it is a frequency table. You are thinking it differently but let me give you an example.

1 1 1 3 3 4 4 4 5 5 5 5 if this is the array then the frequency table would be :-

1 3 4 5
- - - -
3 2 3 4

So this is median. So now I am adding every element count and asking us the question, where does the median lie? or where is that indexm which if I consider I will cover the middle element?

Now here I am checking if sum > d/2 then it's done. We found the median.else if it is less then I still have to traverse other elements to get to the middle of the array. And if it is sum==d/2 then we have found it but we have to send the correct position. And we simply send the one in the lower middle (happens in case like 1,1,1,1).

Walk through

1 1 1 3 3 4 4 4 5 5 5 5

Now I check if I traverse all set of 1's where I am? I covered 3 elements. But it's not the half of the total numbers(6).

Now add number of 3's. 5. This is also not.

Now I add number of 4's, So 8 elements I covered. So I covered more than half of number of elements. So median lies here.

More detailed explanation:

You are asked to find the median of an array of 10 integers.

[1 2 3 4 5 6 7 8 9]

Then median is in element at position floor(9/2)=4, which is 5 Right?

[1 1 2 2 3 3 4 4 5]

Where is the median element at position floor(9/2)=4, which is 3. Right?

So now think this,

 1  2 3 4 5
 2  2 2 2 1

Now you will try to find the floor(9/2) th element here starting from beginning. And that's why you need to find the sum of the frequencies and all.

Hope you get it?

Correct algorithm

What you need to do is :-

N = number of elements.
F[] = frequency array
so if N is odd
  find the element at floor(N/2)-th place and median is that element.
else
  find the element at floor((N-1)/2) and floor(N/2) th position and return their average.

Finding the element is simple:
Find( F[], p) // find the element at position p
{
  p=p+1
  for i in [0..|F|]
    cumulative+=F[i]
    if cumulative == p
      return this element.
    else cumulative >p
      return this element  
}
like image 159
user2736738 Avatar answered Oct 15 '22 12:10

user2736738