Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

indices of the k largest elements in an unsorted length n array

I need to find the indices of the k largest elements of an unsorted, length n, array/vector in C++, with k < n. I have seen how to use nth_element() to find the k-th statistic, but I'm not sure if using this is the right choice for my problem as it seems like I would need to make k calls to nth_statistic, which I guess it would have complexity O(kn), which may be as good as it can get? Or is there a way to do this just in O(n)?

Implementing it without nth_element() seems like I will have to iterate over the whole array once, populating a list of indices of the largest elements at each step.

Is there anything in the standard C++ library that makes this a one-liner or any clever way to implement this myself in just a couple lines? In my particular case, k = 3, and n = 6, so efficiency isn't a huge concern, but it would be nice to find a clean and efficient way to do this for arbitrary k and n.

It looks like Mark the top N elements of an unsorted array is probably the closest posting I can find on SO, the postings there are in Python and PHP.

like image 512
Luke Peterson Avatar asked Feb 15 '13 20:02

Luke Peterson


People also ask

What is k th largest element in an array?

It can be clearly observed that Kth largest element is the same as (N – K)th smallest element, where N is the size of the given array. Therefore, we can apply the Kth smallest approach to this problem. Firstly, a pivot element must be chosen, similar to what we do in quicksort.

How do you find the kth element from an unsorted array?

Approach: Using Max Heap Observe the following algorithm. Step 1: Using the first k elements of the input array (a[0], …, a[k - 1], create a Max-Heap. Step 2: Compare each element that is coming after the k'th element (a[k] to a[n - 1]) with the root element of the max-heap.

How do you find the kth largest element in unsorted array using heap?

Using Max Heap log(n)) by using a max-heap. The idea is to simply construct a max-heap of size n and insert all the array elements [0…n-1] into it. Then pop first k-1 elements from it. Now k'th largest element will reside at the root of the max-heap.


2 Answers

Here is my implementation that does what I want and I think is reasonably efficient:

#include <queue>
#include <vector>
// maxindices.cc
// compile with:
// g++ -std=c++11 maxindices.cc -o maxindices
int main()
{
  std::vector<double> test = {0.2, 1.0, 0.01, 3.0, 0.002, -1.0, -20};
  std::priority_queue<std::pair<double, int>> q;
  for (int i = 0; i < test.size(); ++i) {
    q.push(std::pair<double, int>(test[i], i));
  }
  int k = 3; // number of indices we need
  for (int i = 0; i < k; ++i) {
    int ki = q.top().second;
    std::cout << "index[" << i << "] = " << ki << std::endl;
    q.pop();
  }
}

which gives output:

index[0] = 3
index[1] = 1
index[2] = 0
like image 51
Luke Peterson Avatar answered Oct 24 '22 19:10

Luke Peterson


This should be an improved version of @hazelnusse which is executed in O(nlogk) instead of O(nlogn)

#include <queue>
#include <iostream>
#include <vector>
// maxindices.cc
// compile with:
// g++ -std=c++11 maxindices.cc -o maxindices
int main()
{
  std::vector<double> test = {2, 8, 7, 5, 9, 3, 6, 1, 10, 4};
  std::priority_queue< std::pair<double, int>, std::vector< std::pair<double, int> >, std::greater <std::pair<double, int> > > q;
    int k = 5; // number of indices we need
  for (int i = 0; i < test.size(); ++i) {
    if(q.size()<k)
        q.push(std::pair<double, int>(test[i], i));
    else if(q.top().first < test[i]){
        q.pop();
        q.push(std::pair<double, int>(test[i], i));
    }
  }
  k = q.size();
  std::vector<int> res(k);
  for (int i = 0; i < k; ++i) {
    res[k - i - 1] = q.top().second;
    q.pop();
  }
  for (int i = 0; i < k; ++i) {
    std::cout<< res[i] <<std::endl;
  }
}

8 4 1 2 6

like image 32
justHelloWorld Avatar answered Oct 24 '22 18:10

justHelloWorld