Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

algorithm - Sort an array with LogLogN distinct elements

This is not my school home work. This is my own home work and I am self-learning algorithms.

In Algorithm Design Manual, there is such an excise

4-25 Assume that the array A[1..n] only has numbers from {1, . . . , n^2} but that at most log log n of these numbers ever appear. Devise an algorithm that sorts A in substantially less than O(n log n).

I have two approaches:


The first approach:

Basically I want to do counting sort for this problem. I can first scan the whole array (O(N)) and put all distinct numbers into a loglogN size array (int[] K).

Then apply counting sort. However, when setting up the counting array (int[] C), I don't need to set its size as N^2, instead, I set the size as loglogN too.

But in this way, when counting the frequencies of each distinct number, I have to scan array K to get that element's index (O(NloglogN) and then update array C.


The second approach:

Again, I have to scan the whole array to get a distinct number array K with size loglogN.

Then I just do a kind of quicksort like, but the partition is based on median of K array (i.e., each time the pivot is an element of K array), recursively.

I think this approach will be best, with O(NlogloglogN).


Am I right? or there are better solutions?

Similar excises exist in Algorithm Design Manual, such as

4-22 Show that n positive integers in the range 1 to k can be sorted in O(n log k) time. The interesting case is when k << n.

4-23 We seek to sort a sequence S of n integers with many duplications, such that the number of distinct integers in S is O(log n). Give an O(n log log n) worst-case time algorithm to sort such sequences.

But basically for all these excises, my intuitive was always thinking of counting sort as we can know the range of the elements and the range is short enough comparing to the length of the whole array. But after more deeply thinking, I guess what the excises are looking for is the 2nd approach, right?

Thanks

like image 478
Jackson Tale Avatar asked Apr 01 '12 12:04

Jackson Tale


1 Answers

We can just create a hash map storing each element as key and its frequency as value.

Sort this map in log(n)*log(log(n)) time i.e (klogk) using any sorting algorithm.

Now scan the hash map and add elements to the new array frequency number of times. Like so:

total time = 2n+log(n)*log(log(n)) = O(n) 
like image 98
tranquil Avatar answered Nov 19 '22 16:11

tranquil