Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the fastest algorithm to find an element with highest frequency in an array

I have two input arrays X and Y. I want to return that element of array X which occurs with highest frequency in array Y.

The naive way of doing this requires that for each element x of array X, I linearly search array Y for its number of occurrences and then return that element x which has highest frequency. Here is the pseudo algorithm:

 max_frequency = 0
 max_x = -1             // -1 indicates no element found
 For each x in X
     frequency = 0
     For each y in Y
         if y == x
             frequency++
     End For
     If frequency > max_frequency
         max_frequency = frequency
         max_x = x
     End If
 End For
 return max_x

As there are two nested loops, time complexity for this algorithm would be O(n^2). Can I do this in O(nlogn) or faster ?

like image 810
nurabha Avatar asked Mar 23 '13 21:03

nurabha


2 Answers

Use a hash table mapping keys to counts. For each element in the array, do like counts[element] = counts[element] + 1 or your language's equivalent.

At the end, loop through the mappings in the hash table and find the max.

like image 157
cHao Avatar answered Sep 28 '22 01:09

cHao


Alternatively, if you can have additional data structures, you walk the array Y, for each number updating its frequency in a hash table. This takes O(N(Y) time. Then walk X finding which element in X has highest frequency. This takes O(N(X)) time. Overall: linear time, and since you have to look at each element of both X and Y in any implementation at least once (EDIT: This is not strictly speaking true in all cases/all implementations, as jwpat7 points out, though it true in the worst case), you can't do it any faster than that.

like image 23
angelatlarge Avatar answered Sep 28 '22 01:09

angelatlarge