Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to calculate the maximum median in an array

This is an algorithm question:

Input is an array with non-duplicate positive integers. Find a continuous subarray(size > 1) which has the maximum median value.

Example: input: [100, 1, 99, 2, 1000], output should be the result of (1000 + 2) / 2 = 501

I can come up the brute force solution: try all lengths from 2 -> array size to find the maximum median. But it seems too slow. I also tried to use two pointer on this question but not sure when to move left and right pointer.

Anyone has a better idea to solve this question?

like image 962
CipherText Avatar asked Apr 21 '19 19:04

CipherText


People also ask

How do you maximize the median of an array?

Approach: In order to maximize the median of the resultant array, all the elements that need to be inserted must be greater than the maximum element from the array. After inserting these elements, the new size of the array will be size = N + K.

How do you find the median of an array?

To find median: First, simply sort the array. Then, check if the number of elements present in the array is even or odd. If odd, then simply return the mid value of the array. Else, the median is the average of the two middle values.


2 Answers

tl;dr - We can show that the answer must be of length 2 or 3, after which it's linear time to check all the possibilities.

Let's say the input is A and the smallest subarray with the biggest median is a. The biggest median is either a single element or the average of a pair of elements from a. Notice that every element in a bigger than the largest element of the median can only be next to elements less than the smallest element of the median (otherwise such a pair could be chosen as a subarray to form a bigger median).

If either end of a had a pair of elements that didn't include an element of the median, it could be eliminated from a without affecting the median, a contradiction.

If either end of a was smaller than the smallest element of the median, eliminating it would increase the median, a contradiction.

Thus each end of a is either an element of the median or larger than the largest element of the median (because it's larger than the smallest elt of the median and not equal to the largest elt of the median).

Thus each end of a is an element of the median because otherwise, we'd have an element larger than an element of the median adjacent to an elt of the median, forming a larger median.

If a is odd then it must be of length three, since any larger odd length could have 2 removed from the end of a farthest from the median without changing the median.

If a is even then it must be of length 2 because any larger even length bookended by the elements of the median with interior elements alternating between smaller and larger than the median must have one of the median elements adjacent to a larger element than the other elt of the median, forming a larger median.

This proof outline could use some editing, but regardless, the conclusion is that the smallest array containing the largest median must be of length 2 or 3.

Given that, check every such subarray in linear time. O(n).

like image 166
Dave Avatar answered Oct 25 '22 05:10

Dave


This is a Python implementation of an algorithm that solves the problem in O(n):

import random
import statistics

n = 50
numbers = random.sample(range(n),n)

max_m = 0;
max_a = [];

for i in range(2,3):
    for j in range(0,n-i+1):
        a = numbers[j:j+i]
        m = statistics.median(a)
        if m > max_m:
            max_m = m
            max_a = a

print(numbers)
print(max_m)
print(max_a)

This is a variation of the brute force algorithm (O(n^3)) that performs only the search for sub-arrays of length 2 or 3. The reason is that for every array of size n, there exists a sub-array that has the same or improved median. Applying this reasoning recursively, we can reduce the size of the sub-array to 2 or 3. Thus, by looking only at sub-arrays of size 2 or 3, we are guaranteed to obtain the sub-array with the maximum median.

The operation is the following: If, for a contiguous sub-array (at the beginning or at the end), at least half of the elements are lower than the median (or lower than both values forming the median, if this is the case), remove them to improve or at least preserve the median.

If in all sub-arrays there is always at least one more element above or equal to the median(s) than below, there will come a point where the size of the sub-array will be that of the median. In that case, it means that the complement will have more elements below the median, and thus, we can simply remove the complement and improve (or preserve) the median. Thus, we can always perform the operation. For n=3, it can happen that you need to remove 2 or 3 elements to perform the operation, which is not allowed. In this case, the result is the list itself.

like image 32
Gilles-Philippe Paillé Avatar answered Oct 25 '22 04:10

Gilles-Philippe Paillé