Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Sorting an array in O(n log(log n))

Tags:

I am attempting to solve the following assignment: I'm given an array of n elements. it is known that not all keys of the array are distinct, but it is given that we have k distinct elements (k<=n of course). the assignment is to do a stable sort of the array in O(n log(log n)) worst case while k=O(log n). I'm allowed to use O(n) extra memory.

My current solution is described below:


Create a hash table with chaining of size k that does the following: if the hash function tries to insert an element to a place that already has a value in it - it checks if they are equal - if they are it adds it to the list, if not it starts moving in the array until it finds a place that has that same value or an empty space(which ever comes first).

This way the lists in each place only contains elements with equal keys. the insertion to the hashtable is from start to finish in the original array so each list is stably sorted. Then sort the array in the hash table with mergeSort (for lists we treat the first element as just one and move it).

After we are done merge sorting we copy the elements back to the original array by order and whenever we meet a list we copy each element by order.

Here is what I'm not sure about:

Is it true to say that because the hash table is size k and we only have k different elements, uniform hashing promises that the amount of time the hash function will try to give different values the same place in the array is negligible and therefore it's build time complexity is O(n)?

Because if so it seems the algorithm's runtime is O(n + k log k) = O(n + log n*log(log n)). which is definitely better then O(n log k) which what was required.

like image 794
Nadav matityahu Avatar asked Jun 12 '18 17:06

Nadav matityahu


2 Answers

I think you're on the right track with the hash table, but I don't think you should insert the elements in the hash table, then copy them out again. You should use the hash table only to count the number of elements for each distinct value.

Next you compute the starting index for each distinct value, by traversing all values in order and adding the previous element's count to its start index:

  • start index for element i = start index for element i-1 + count for element i-1.

This step requires sorting the k elements in the hash table, which amounts to O(k log k) = O(log n log log n) operations, much less than the O(n) for steps 1 and 3.

Finally, you traverse your input array again, look it up in the table, and find the location in the output array for it. You copy the element, and also increase the start index for elements of its value, so that the next element will be copied after it.

If the comparison values for the items are consecutive integers (or integers in some small range), then you can use an array instead of a hash table for counting.

This is called counting sort.

like image 153
Cris Luengo Avatar answered Sep 28 '22 19:09

Cris Luengo


on the same note as here:

You create a binary tree, any tree: each node would be a list, of elements, with a distinct key.

now we iterate the array, and for each key we search for it in the tree, the search would take log(log(n)), as there is a maximum of log(n) distinct nodes in the tree. ( if the key doesnt exist we just add it as a node to the tree ).

so iterating the array would take O(n*log(log(n))) as we gave n elements.

finally, as this is a binary search tree, we can call in order, and would get the sorted order of the arrays.

and all that left is to combine them into single array. that would take O(n) time.

so we get O(n + nlog(log(n))=O(nlog(log(n)))

like image 37
Mtkel N Avatar answered Sep 28 '22 17:09

Mtkel N