One of my colleagues was asked this question in an interview.
Given a huge array of unsigned ints, of length 100,000,000, find an efficient way to count how many times each unique element occurs in the array.
E.g. arr = {2,34,5,6,7,2,2,5,1,34,5}
Output: Count of 2 is 3, Count of 34 is 2, and so on.
What are efficient algorithms for this? At first I thought a dictionary/hash would be one option, but since the array is very large it seems inefficient. Is there a better way to do this?
In JavaScript, to count the unique elements in an array, pass the array to the Set constructor and read the size property of the resulting set. A Set stores only unique values, so duplicates are dropped automatically, and size returns the number of values it holds. Note that this gives the number of distinct values, not how many times each value occurs.
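A minimal sketch of the Set approach, using the sample data from the question. The Uint32Array is only an assumption about how an array of unsigned ints might be held in JavaScript, and the Map-based tally is an illustrative addition (the dictionary/hash idea from the question) for when per-value counts are needed; the variable names are hypothetical.

```javascript
// Sample data from the question; Uint32Array is an assumed way to model unsigned ints.
const arr = Uint32Array.from([2, 34, 5, 6, 7, 2, 2, 5, 1, 34, 5]);

// A Set keeps one copy of each value, so its size is the number of distinct values.
const distinctCount = new Set(arr).size;
console.log(distinctCount); // 6

// Illustrative addition: a Map from value to occurrence count (hash-based tally).
const occurrences = new Map();
for (const value of arr) {
  occurrences.set(value, (occurrences.get(value) ?? 0) + 1);
}
console.log(occurrences.get(2));  // 3
console.log(occurrences.get(34)); // 2
```

Both structures need extra memory proportional to the number of distinct values, which is the trade-off against an in-place sort.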
Heap sort is O(n log n) and in-place. In-place sorting matters for large data sets because it needs only constant extra memory. Once the array is sorted, a single pass can tally the occurrences of each value: because equal values are adjacent, a change in value means you have seen every occurrence of the previous one.
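The sort-then-tally idea can be sketched as follows. This is one possible textbook heap sort rather than a tuned implementation; the function names (siftDown, heapSort, tallySorted) are hypothetical, and the Uint32Array is an assumed representation of the unsigned ints.

```javascript
// Restore the max-heap property for the subtree rooted at `root`,
// looking only at indices below `end`.
function siftDown(a, root, end) {
  while (true) {
    let child = 2 * root + 1; // left child
    if (child >= end) return; // no children: nothing to fix
    // Pick the larger of the two children, if a right child exists.
    if (child + 1 < end && a[child + 1] > a[child]) child += 1;
    if (a[root] >= a[child]) return; // heap property already holds
    const tmp = a[root];
    a[root] = a[child];
    a[child] = tmp;
    root = child;
  }
}

// Sort `a` in place in ascending order using O(1) extra memory.
function heapSort(a) {
  const n = a.length;
  // Build a max-heap bottom-up.
  for (let i = Math.floor(n / 2) - 1; i >= 0; i--) siftDown(a, i, n);
  // Repeatedly move the current maximum to the end of the unsorted region.
  for (let end = n - 1; end > 0; end--) {
    const tmp = a[0];
    a[0] = a[end];
    a[end] = tmp;
    siftDown(a, 0, end);
  }
  return a;
}

// One pass over a sorted array: each run of equal values yields [value, count].
function tallySorted(sorted) {
  const counts = [];
  let i = 0;
  while (i < sorted.length) {
    let j = i + 1;
    while (j < sorted.length && sorted[j] === sorted[i]) j++;
    counts.push([sorted[i], j - i]);
    i = j;
  }
  return counts;
}

const data = Uint32Array.from([2, 34, 5, 6, 7, 2, 2, 5, 1, 34, 5]);
const tallies = tallySorted(heapSort(data));
for (const [value, count] of tallies) {
  console.log(`Count of ${value} is ${count}`);
}
```

The sort itself uses no extra memory beyond a few variables, but this sketch collects the tallies in an array; for very large inputs the counts could instead be printed or streamed as each run ends.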