Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What algorithm used to find the nth sorted subarray of an unordered array?

Tags:

c++

algorithm

I had this question recently in an interview and I failed, and now search for the answer.

  1. Let's say I have a big array of n integers, all differents.

  2. If this array was ordered, I could subdivide it in x smaller arrays, all of size y, except maybe the last one, which could be less. I could then extract the nth subarray and return it, already sorted.

Example : Array 4 2 5 1 6 3. If y=2 and I want the 2nd array, it would be 3 4.

Now what I did is simply sort the array and return the nth subarray, which takes O(n log n). But it was said to me that there exists a way to do it in O(n + y log y). I searched on internet and didn't find anything. Ideas ?

like image 253
Bene Tleilax Avatar asked Nov 27 '15 12:11

Bene Tleilax


2 Answers

The algorithm you are looking for is Selection Algorithm, which lets you find k-th order statistics in linear time. The algorithm is quite complex, but the standard C++ library conveniently provides an implementation of it.

The algorithm for finding k-th sorted interval that the interviewers had in mind went like this:

  • Find b=(k-1)*y-th order statistics in O(N)
  • Find e=k*y-th order statistics in O(N)
  • There will be y numbers between b and e. Store them in a separate array of size y. This operation takes O(N)
  • Sort the array of size y for O(y * log2y) cost.

The overall cost is O(N+N+N+y * log2y), i.e. O(N+y * log2y)

like image 101
Sergey Kalinichenko Avatar answered Sep 30 '22 06:09

Sergey Kalinichenko


You can combine std::nth_element and std::sort for this:

std::vector<int> vec = muchData();
// Fix those bound iterators as needed
auto lower = vec.begin() + k*y;
auto upper = lower + y;

// put right element at lower and partition vector by it
std::nth_element(vec.begin(), lower, vec.end());
// Same for upper, but don't mess up lower
std::nth_element(lower + 1, upper - 1, vec.end());
// Now sort the subarray
std::sort(lower, upper);

[lower, upper) is now the k-th sorted subarray of length y, with the desired complexity on average.

To be checked for special cases like y = 1 before real world use, but this is the general idea.

like image 38
Baum mit Augen Avatar answered Sep 30 '22 06:09

Baum mit Augen