It's easy to find the most frequently occurring element in O(n). Is there a faster algorithm (O(logn)) to do this? (given the array is sorted)
Solution Steps We initialize two variables: maxFreq to track the maximum frequency and mostFrequent to track the most frequent element. maxFreq = 0, mostFrequent = -1. Now we scan sorted array using a loop till i < n. Inside loop, we initialize variable countFreq to track the frequency count i.e. countFreq = 1.
It is impossible. Below is not a strict proof (strict lower-bound proofs are hard in general) but a sane reasoning.
Assume your array always looks like 1, 2, 3, 4, 5, 6, ..., n
. Then you replace some number with occurrence of a previous number: 1, 2, 3, 3, 5, ... n
. Now in the new array a[i] = i
for all i
except for one position.
In order to find the most frequent element you must examine all positions and check for irregularity there. Note that there is exactly one irregularity, and you can say nothing about its position if you look at any other element of the array. Thus this problem is not easier than finding a one
in a boolean array of ones and zeroes, which requires linear time.
Not O(logn)
but if the number of distinct integers is less, you can solve it in O(mlogn)
where m is the total number of distinct integers.
It must be noted that this solution will only be fruitful if m << n
.
The idea is to start from index 0 and find the last occurrence of the same number in the sorted array, which can be done using binary search by skipping with delta d, as your interviewer said and increasing this delta every time, until you find the last number.
On finding that, you can have another variable maxCount
which was initialized to 0 in the starting. Check if endIndex - startIndex > maxCount
and if yes, replace maxCount
with endIndex - startIndex
. Now, repeat the same process starting from startIndex+1
.
As @ivan has mentioned above, this will fail terribly and would give a O(n) solution if all the numbers are distinct.
This Python code makes it in O(mlogn)
time based on @Parijat's answer.
import bisect
def most_frequent_in_sorted(lst):
most_frequent = None
max_frequency = 0
n = len(lst)
idx = 0
while idx < n:
# Get leftmost index holding an element != lst[idx]
next_leftmost_idx = bisect.bisect_right(lst, lst[idx])
# Update most frequent element
cur_frequency = next_leftmost_idx - idx
if cur_frequency > max_frequency:
most_frequent = lst[idx]
max_frequency = cur_frequency
# Update index to hold next different integer
idx = next_leftmost_idx
return most_frequent
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With