Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Counting unique element in large array

One of my colleagues was asked this question in an interview.

Given a huge array which stores unsigned int. Length of array is 100000000. Find the effective way to count the unique number of elements present in the array.

E.g arr = {2,34,5,6,7,2,2,5,1,34,5}

O/p: Count of 2 is 3, Count of 34 is 2 and so on.

What are effective algorithms to do this? I thought at first dictionary/hash would be one of the options, but since the array is very large it is inefficient. Is there any way to do this?

like image 488
sam_33 Avatar asked Feb 07 '11 19:02

sam_33


People also ask

How do you count unique values in an array?

To count the unique elements in an array, pass the array to the Set constructor and access the size property on the set. The Set object only stores unique values and automatically removes duplicates. The size property returns the number of values in the Set .


1 Answers

Heap sort is O(nlogn) and in-place. In-place is necessary when dealing with large data sets. Once sorted you can make one pass through the array tallying occurrences of each value. Because the array is sorted, once a value changes you know you've seen all occurrences of the previous value.

like image 148
Girish Rao Avatar answered Sep 18 '22 01:09

Girish Rao