Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find k most occurring elements in an integer array

Given an array A with possible duplicate entries, find the k entries that occur most frequently.

My approach:

Create a MinHeap of k most occurring elements ordered by the frequency. top element obviously being least occurring of rest of the elements. Create a HashMap to keep track of all element counts and whether or not they are in MinHeap.

When reading a new integer:

  • check if it is in HashMap: Increment the count in HashMap
  • also if it is check if it is in Heap :then Increment the count there also and heapify.
  • if not then compare with root element count and remove the root to add this if necessary. Then heapify.

In the end return MinHeap as desired output.

class Wrapper{
 boolean inHeap;
 int count;
}

This would take O(n+k) space and O(n log k) time complexity. Is there a better way to do space and/or time complexity wise.

like image 372
m0nish Avatar asked Jun 04 '14 01:06

m0nish


People also ask

How do you find the top k element in an array?

Method 1 (Use Bubble k times) 1) Modify Bubble Sort to run the outer loop at most k times. 2) Print the last k elements of the array obtained in step 1. Like Bubble sort, other sorting algorithms like Selection Sort can also be modified to get the k largest elements.

How do you count occurrences of an element in an array?

The for loop is one of the standard methods to loop through an array. It allows us to loop over each element of an array and compare it to the element we are looking for. That way, we can count the number of occurrences of that element in an array.


1 Answers

We can say the space complexity of your approach is O(n), since you can never use more than O(2n) = O(n) memory.


Skip the heap and just create the HashMap.

After you've created the HashMap, you can iterate through it and put all the elements in an array.

Then you can run a selection algorithm such as quickselect on the array to get k-th element, and the first k elements from there (the extension to extract the first k elements via quickselect is fairly trivial, or you can just iterating through again to get them).

Then you sort the k elements, if required.

The running time would be expected O(n) or O(n + k log k) if sorting is required.

The space complexity would be O(n).

like image 145
Bernhard Barker Avatar answered Nov 04 '22 01:11

Bernhard Barker