Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

given an array of sorted ints, find the most frequently occuring element in log(n)

It's easy to find the most frequently occurring element in O(n). Is there a faster algorithm (O(logn)) to do this? (given the array is sorted)

like image 312
One Two Three Avatar asked Oct 23 '17 17:10

One Two Three


People also ask

How do you find most frequent items in an array?

Solution Steps We initialize two variables: maxFreq to track the maximum frequency and mostFrequent to track the most frequent element. maxFreq = 0, mostFrequent = -1. Now we scan sorted array using a loop till i < n. Inside loop, we initialize variable countFreq to track the frequency count i.e. countFreq = 1.


3 Answers

It is impossible. Below is not a strict proof (strict lower-bound proofs are hard in general) but a sane reasoning.

Assume your array always looks like 1, 2, 3, 4, 5, 6, ..., n. Then you replace some number with occurrence of a previous number: 1, 2, 3, 3, 5, ... n. Now in the new array a[i] = i for all i except for one position.

In order to find the most frequent element you must examine all positions and check for irregularity there. Note that there is exactly one irregularity, and you can say nothing about its position if you look at any other element of the array. Thus this problem is not easier than finding a one in a boolean array of ones and zeroes, which requires linear time.

like image 132
Ivan Smirnov Avatar answered Oct 27 '22 10:10

Ivan Smirnov


Not O(logn) but if the number of distinct integers is less, you can solve it in O(mlogn) where m is the total number of distinct integers.

It must be noted that this solution will only be fruitful if m << n.

The idea is to start from index 0 and find the last occurrence of the same number in the sorted array, which can be done using binary search by skipping with delta d, as your interviewer said and increasing this delta every time, until you find the last number.

On finding that, you can have another variable maxCount which was initialized to 0 in the starting. Check if endIndex - startIndex > maxCount and if yes, replace maxCount with endIndex - startIndex. Now, repeat the same process starting from startIndex+1.

As @ivan has mentioned above, this will fail terribly and would give a O(n) solution if all the numbers are distinct.

like image 3
Parijat Purohit Avatar answered Oct 27 '22 11:10

Parijat Purohit


This Python code makes it in O(mlogn) time based on @Parijat's answer.

import bisect

def most_frequent_in_sorted(lst):
    most_frequent = None
    max_frequency = 0
    n = len(lst)
    idx = 0

    while idx < n:
        # Get leftmost index holding an element != lst[idx]
        next_leftmost_idx = bisect.bisect_right(lst, lst[idx])

        # Update most frequent element
        cur_frequency = next_leftmost_idx - idx
        if cur_frequency > max_frequency:
            most_frequent = lst[idx]
            max_frequency = cur_frequency

        # Update index to hold next different integer
        idx = next_leftmost_idx

    return most_frequent
like image 3
MROB Avatar answered Oct 27 '22 11:10

MROB