Using C++, and hopefully the standard library, I want to sort a sequence of samples in ascending order, but I also want to remember the original indexes of the new samples. For example, I have a set, or vector, or matrix of samples <code>A : [5, 2, 1, 4, 3]</code>. I want to sort these to be <code>B : [1,2,3,4,5]</code>, but I also want to remember the original indexes of the values, so I can get another set which would be: <code>C : [2, 1, 4, 3, 0 ]</code> - which corresponds to the index of each element in 'B', in the original 'A'. For example, in Matlab you can do: <pre class="prettyprint"><code> [a,b]=sort([5, 8, 7]) a = 5 7 8 b = 1 3 2 </code></pre> Can anyone see a good way to do this?

You could sort std::pair instead of just ints - first int is original data, second int is original index. Then supply a comparator that only sorts on the first int. Example: <pre class="prettyprint"><code>Your problem instance: v = [5 7 8] New problem instance: v_prime = [<5,0>, <8,1>, <7,2>] </code></pre> Sort the new problem instance using a comparator like: <pre class="prettyprint"><code>typedef std::pair<int,int> mypair; bool comparator ( const mypair& l, const mypair& r) { return l.first < r.first; } // forgetting the syntax here but intent is clear enough </code></pre> The result of std::sort on v_prime, using that comparator, should be: <pre class="prettyprint"><code>v_prime = [<5,0>, <7,2>, <8,1>] </code></pre> You can peel out the indices by walking the vector, grabbing .second from each std::pair.

<pre class="prettyprint"><code>vector<pair<int,int> >a; for (i = 0 ;i < n ; i++) { // filling the original array cin >> k; a.push_back (make_pair (k,i)); // k = value, i = original index } sort (a.begin(),a.end()); for (i = 0 ; i < n ; i++){ cout << a[i].first << " " << a[i].second << "\n"; } </code></pre> Now <code>a</code> contains both both our values and their respective indices in the sorted. <code>a[i].first = value</code> at <code>i</code>'th. <code>a[i].second = idx</code> in initial array.

I came across this question, and figured out sorting the iterators directly would be a way to sort the values and keep track of indices; There is no need to define an extra container of <code>pair</code>s of ( value, index ) which is helpful when the values are large objects; The iterators provides the access to both the value and the index: <pre class="prettyprint"><code>/* * a function object that allows to compare * the iterators by the value they point to */ template < class RAIter, class Compare > class IterSortComp { public: IterSortComp ( Compare comp ): m_comp ( comp ) { } inline bool operator( ) ( const RAIter & i, const RAIter & j ) const { return m_comp ( * i, * j ); } private: const Compare m_comp; }; template <class INIter, class RAIter, class Compare> void itersort ( INIter first, INIter last, std::vector < RAIter > & idx, Compare comp ) { idx.resize ( std::distance ( first, last ) ); for ( typename std::vector < RAIter >::iterator j = idx.begin( ); first != last; ++ j, ++ first ) * j = first; std::sort ( idx.begin( ), idx.end( ), IterSortComp< RAIter, Compare > ( comp ) ); } </code></pre> as for the usage example: <pre class="prettyprint"><code>std::vector < int > A ( n ); // populate A with some random values std::generate ( A.begin( ), A.end( ), rand ); std::vector < std::vector < int >::const_iterator > idx; itersort ( A.begin( ), A.end( ), idx, std::less < int > ( ) ); </code></pre> now, for example, the 5th smallest element in the sorted vector would have value <code>**idx[ 5 ]</code> and its index in the original vector would be <code>distance( A.begin( ), *idx[ 5 ] )</code> or simply <code>*idx[ 5 ] - A.begin( )</code>.

Consider using <code>std::multimap</code> as suggested by @Ulrich Eckhardt. Just that the code could be made even simpler. Given <pre class="prettyprint"><code>std::vector<int> a = {5, 2, 1, 4, 3}; // a: 5 2 1 4 3 </code></pre> To sort in the mean time of insertion <pre class="prettyprint"><code>std::multimap<int, std::size_t> mm; for (std::size_t i = 0; i != a.size(); ++i) mm.insert({a[i], i}); </code></pre> To retrieve values and original indices <pre class="prettyprint"><code>std::vector<int> b; std::vector<std::size_t> c; for (const auto & kv : mm) { b.push_back(kv.first); // b: 1 2 3 4 5 c.push_back(kv.second); // c: 2 1 4 3 0 } </code></pre> The reason to prefer a <code>std::multimap</code> to a <code>std::map</code> is to allow equal values in original vectors. Also please note that, unlike for <code>std::map</code>, <code>operator[]</code> is not defined for <code>std::multimap</code>.

There is another way to solve this, using a map: <pre class="prettyprint"><code>vector<double> v = {...}; // input data map<double, unsigned> m; // mapping from value to its index for (auto it = v.begin(); it != v.end(); ++it) m[*it] = it - v.begin(); </code></pre> This will eradicate non-unique elements though. If that's not acceptable, use a multimap: <pre class="prettyprint"><code>vector<double> v = {...}; // input data multimap<double, unsigned> m; // mapping from value to its index for (auto it = v.begin(); it != v.end(); ++it) m.insert(make_pair(*it, it - v.begin())); </code></pre> In order to output the indices, iterate over the map or multimap: <pre class="prettyprint"><code>for (auto it = m.begin(); it != m.end(); ++it) cout << it->second << endl; </code></pre>

Beautiful solution by @Lukasz Wiklendt! Although in my case I needed something more generic so I modified it a bit: <pre class="prettyprint"><code>template <class RAIter, class Compare> vector<size_t> argSort(RAIter first, RAIter last, Compare comp) { vector<size_t> idx(last-first); iota(idx.begin(), idx.end(), 0); auto idxComp = [&first,comp](size_t i1, size_t i2) { return comp(first[i1], first[i2]); }; sort(idx.begin(), idx.end(), idxComp); return idx; } </code></pre> Example: Find indices sorting a vector of strings by length, except for the first element which is a dummy. <pre class="prettyprint"><code>vector<string> test = {"dummy", "a", "abc", "ab"}; auto comp = [](const string &a, const string& b) { return a.length() > b.length(); }; const auto& beginIt = test.begin() + 1; vector<size_t> ind = argSort(beginIt, test.end(), comp); for(auto i : ind) cout << beginIt[i] << endl; </code></pre> prints: <pre class="prettyprint"><code>abc ab a </code></pre>

Make a <code>std::pair</code> in function then sort pair : generic version : <pre class="prettyprint"><code>template< class RandomAccessIterator,class Compare > auto sort2(RandomAccessIterator begin,RandomAccessIterator end,Compare cmp) -> std::vector<std::pair<std::uint32_t,RandomAccessIterator>> { using valueType=typename std::iterator_traits<RandomAccessIterator>::value_type; using Pair=std::pair<std::uint32_t,RandomAccessIterator>; std::vector<Pair> index_pair; index_pair.reserve(std::distance(begin,end)); for(uint32_t idx=0;begin!=end;++begin,++idx){ index_pair.push_back(Pair(idx,begin)); } std::sort( index_pair.begin(),index_pair.end(),[&](const Pair& lhs,const Pair& rhs){ return cmp(*lhs.second,*rhs.second); }); return index_pair; } </code></pre> ideone

C++ sorting and keeping track of indexes

Tags:

c++

sorting

indexing

stl

Using C++, and hopefully the standard library, I want to sort a sequence of samples in ascending order, but I also want to remember the original indexes of the new samples.

For example, I have a set, or vector, or matrix of samples A : [5, 2, 1, 4, 3]. I want to sort these to be B : [1,2,3,4,5], but I also want to remember the original indexes of the values, so I can get another set which would be: C : [2, 1, 4, 3, 0 ] - which corresponds to the index of each element in 'B', in the original 'A'.

For example, in Matlab you can do:

 [a,b]=sort([5, 8, 7])
 a = 5 7 8
 b = 1 3 2

Can anyone see a good way to do this?

931

asked Oct 16 '09 11:10

Mingus

13 Answers

Using C++ 11 lambdas:

#include <iostream>
#include <vector>
#include <numeric>      // std::iota
#include <algorithm>    // std::sort, std::stable_sort

using namespace std;

template <typename T>
vector<size_t> sort_indexes(const vector<T> &v) {

  // initialize original index locations
  vector<size_t> idx(v.size());
  iota(idx.begin(), idx.end(), 0);

  // sort indexes based on comparing values in v
  // using std::stable_sort instead of std::sort
  // to avoid unnecessary index re-orderings
  // when v contains elements of equal values 
  stable_sort(idx.begin(), idx.end(),
       [&v](size_t i1, size_t i2) {return v[i1] < v[i2];});

  return idx;
}

Now you can use the returned index vector in iterations such as

for (auto i: sort_indexes(v)) {
  cout << v[i] << endl;
}

You can also choose to supply your original index vector, sort function, comparator, or automatically reorder v in the sort_indexes function using an extra vector.

150

answered Oct 05 '22 10:10

Lukasz Wiklendt

You could sort std::pair instead of just ints - first int is original data, second int is original index. Then supply a comparator that only sorts on the first int. Example:

Your problem instance: v = [5 7 8]
New problem instance: v_prime = [<5,0>, <8,1>, <7,2>]

Sort the new problem instance using a comparator like:

typedef std::pair<int,int> mypair;
bool comparator ( const mypair& l, const mypair& r)
   { return l.first < r.first; }
// forgetting the syntax here but intent is clear enough

The result of std::sort on v_prime, using that comparator, should be:

v_prime = [<5,0>, <7,2>, <8,1>]

You can peel out the indices by walking the vector, grabbing .second from each std::pair.

answered Oct 05 '22 10:10

RAC

Suppose Given vector is

A=[2,4,3]

Create a new vector

V=[0,1,2] // indicating positions

Sort V and while sorting instead of comparing elements of V , compare corresponding elements of A

 //Assume A is a given vector with N elements
 vector<int> V(N);
 std::iota(V.begin(),V.end(),0); //Initializing
 sort( V.begin(),V.end(), [&](int i,int j){return A[i]<A[j];} );

answered Oct 05 '22 08:10

MysticForce

vector<pair<int,int> >a;

for (i = 0 ;i < n ; i++) {
    // filling the original array
    cin >> k;
    a.push_back (make_pair (k,i)); // k = value, i = original index
}

sort (a.begin(),a.end());

for (i = 0 ; i < n ; i++){
    cout << a[i].first << " " << a[i].second << "\n";
}

Now a contains both both our values and their respective indices in the sorted.

a[i].first = value at i'th.

a[i].second = idx in initial array.

answered Oct 05 '22 09:10

Aditya Aswal

I wrote generic version of index sort.

template <class RAIter, class Compare>
void argsort(RAIter iterBegin, RAIter iterEnd, Compare comp, 
    std::vector<size_t>& indexes) {

    std::vector< std::pair<size_t,RAIter> > pv ;
    pv.reserve(iterEnd - iterBegin) ;

    RAIter iter ;
    size_t k ;
    for (iter = iterBegin, k = 0 ; iter != iterEnd ; iter++, k++) {
        pv.push_back( std::pair<int,RAIter>(k,iter) ) ;
    }

    std::sort(pv.begin(), pv.end(), 
        [&comp](const std::pair<size_t,RAIter>& a, const std::pair<size_t,RAIter>& b) -> bool 
        { return comp(*a.second, *b.second) ; }) ;

    indexes.resize(pv.size()) ;
    std::transform(pv.begin(), pv.end(), indexes.begin(), 
        [](const std::pair<size_t,RAIter>& a) -> size_t { return a.first ; }) ;
}

Usage is the same as that of std::sort except for an index container to receive sorted indexes. testing:

int a[] = { 3, 1, 0, 4 } ;
std::vector<size_t> indexes ;
argsort(a, a + sizeof(a) / sizeof(a[0]), std::less<int>(), indexes) ;
for (size_t i : indexes) printf("%d\n", int(i)) ;

you should get 2 1 0 3. for the compilers without c++0x support, replace the lamba expression as a class template:

template <class RAIter, class Compare> 
class PairComp {
public:
  Compare comp ;
  PairComp(Compare comp_) : comp(comp_) {}
  bool operator() (const std::pair<size_t,RAIter>& a, 
    const std::pair<size_t,RAIter>& b) const { return comp(*a.second, *b.second) ; }        
} ;

and rewrite std::sort as

std::sort(pv.begin(), pv.end(), PairComp(comp)()) ;

answered Oct 05 '22 09:10

hkyi

I came across this question, and figured out sorting the iterators directly would be a way to sort the values and keep track of indices; There is no need to define an extra container of pairs of ( value, index ) which is helpful when the values are large objects; The iterators provides the access to both the value and the index:

/*
 * a function object that allows to compare
 * the iterators by the value they point to
 */
template < class RAIter, class Compare >
class IterSortComp
{
    public:
        IterSortComp ( Compare comp ): m_comp ( comp ) { }
        inline bool operator( ) ( const RAIter & i, const RAIter & j ) const
        {
            return m_comp ( * i, * j );
        }
    private:
        const Compare m_comp;
};

template <class INIter, class RAIter, class Compare>
void itersort ( INIter first, INIter last, std::vector < RAIter > & idx, Compare comp )
{ 
    idx.resize ( std::distance ( first, last ) );
    for ( typename std::vector < RAIter >::iterator j = idx.begin( ); first != last; ++ j, ++ first )
        * j = first;

    std::sort ( idx.begin( ), idx.end( ), IterSortComp< RAIter, Compare > ( comp ) );
}

as for the usage example:

std::vector < int > A ( n );

// populate A with some random values
std::generate ( A.begin( ), A.end( ), rand );

std::vector < std::vector < int >::const_iterator > idx;
itersort ( A.begin( ), A.end( ), idx, std::less < int > ( ) );

now, for example, the 5th smallest element in the sorted vector would have value **idx[ 5 ] and its index in the original vector would be distance( A.begin( ), *idx[ 5 ] ) or simply *idx[ 5 ] - A.begin( ).

answered Oct 05 '22 08:10

behzad.nouri

Consider using std::multimap as suggested by @Ulrich Eckhardt. Just that the code could be made even simpler.

Given

std::vector<int> a = {5, 2, 1, 4, 3};  // a: 5 2 1 4 3

To sort in the mean time of insertion

std::multimap<int, std::size_t> mm;
for (std::size_t i = 0; i != a.size(); ++i)
    mm.insert({a[i], i});

To retrieve values and original indices

std::vector<int> b;
std::vector<std::size_t> c;
for (const auto & kv : mm) {
    b.push_back(kv.first);             // b: 1 2 3 4 5
    c.push_back(kv.second);            // c: 2 1 4 3 0
}

The reason to prefer a std::multimap to a std::map is to allow equal values in original vectors. Also please note that, unlike for std::map, operator[] is not defined for std::multimap.

answered Oct 05 '22 09:10

aafulei

There is another way to solve this, using a map:

vector<double> v = {...}; // input data
map<double, unsigned> m; // mapping from value to its index
for (auto it = v.begin(); it != v.end(); ++it)
    m[*it] = it - v.begin();

This will eradicate non-unique elements though. If that's not acceptable, use a multimap:

vector<double> v = {...}; // input data
multimap<double, unsigned> m; // mapping from value to its index
for (auto it = v.begin(); it != v.end(); ++it)
    m.insert(make_pair(*it, it - v.begin()));

In order to output the indices, iterate over the map or multimap:

for (auto it = m.begin(); it != m.end(); ++it)
    cout << it->second << endl;

answered Oct 05 '22 10:10

Ulrich Eckhardt

Beautiful solution by @Lukasz Wiklendt! Although in my case I needed something more generic so I modified it a bit:

template <class RAIter, class Compare>
vector<size_t> argSort(RAIter first, RAIter last, Compare comp) {

  vector<size_t> idx(last-first);
  iota(idx.begin(), idx.end(), 0);

  auto idxComp = [&first,comp](size_t i1, size_t i2) {
      return comp(first[i1], first[i2]);
  };

  sort(idx.begin(), idx.end(), idxComp);

  return idx;
}

Example: Find indices sorting a vector of strings by length, except for the first element which is a dummy.

vector<string> test = {"dummy", "a", "abc", "ab"};

auto comp = [](const string &a, const string& b) {
    return a.length() > b.length();
};

const auto& beginIt = test.begin() + 1;
vector<size_t> ind = argSort(beginIt, test.end(), comp);

for(auto i : ind)
    cout << beginIt[i] << endl;

prints:

abc
ab
a

answered Oct 05 '22 08:10

sigvaldm

Make a std::pair in function then sort pair :

generic version :

template< class RandomAccessIterator,class Compare >
auto sort2(RandomAccessIterator begin,RandomAccessIterator end,Compare cmp) ->
   std::vector<std::pair<std::uint32_t,RandomAccessIterator>>
{
    using valueType=typename std::iterator_traits<RandomAccessIterator>::value_type;
    using Pair=std::pair<std::uint32_t,RandomAccessIterator>;

    std::vector<Pair> index_pair;
    index_pair.reserve(std::distance(begin,end));

    for(uint32_t idx=0;begin!=end;++begin,++idx){
        index_pair.push_back(Pair(idx,begin));
    }

    std::sort( index_pair.begin(),index_pair.end(),[&](const Pair& lhs,const Pair& rhs){
          return cmp(*lhs.second,*rhs.second);
    });

    return index_pair;
}

ideone

answered Oct 05 '22 10:10

uchar

Well, my solution uses residue technique. We can place the values under sorting in the upper 2 bytes and the indices of the elements - in the lower 2 bytes:

int myints[] = {32,71,12,45,26,80,53,33};

for (int i = 0; i < 8; i++)
   myints[i] = myints[i]*(1 << 16) + i;

Then sort the array myints as usual:

std::vector<int> myvector(myints, myints+8);
sort(myvector.begin(), myvector.begin()+8, std::less<int>());

After that you can access the elements' indices via residuum. The following code prints the indices of the values sorted in the ascending order:

for (std::vector<int>::iterator it = myvector.begin(); it != myvector.end(); ++it)
   std::cout << ' ' << (*it)%(1 << 16);

Of course, this technique works only for the relatively small values in the original array myints (i.e. those which can fit into upper 2 bytes of int). But it has additional benefit of distinguishing identical values of myints: their indices will be printed in the right order.

answered Oct 05 '22 08:10

Macmep

If it's possible, you can build the position array using find function, and then sort the array.

Or maybe you can use a map where the key would be the element, and the values a list of its position in the upcoming arrays (A, B and C)

It depends on later uses of those arrays.

answered Oct 05 '22 08:10

HyLian

I recently stepped upon the elegant projection feature of C++20 <ranges> and it allows to write shorter/clearer code:

std::vector<std::size_t> B(std::size(A));
std::iota(begin(B), end(B), 0);
std::ranges::sort(B, {}, [&](std::size_t i){ return A[i]; });

{} refers to the usual std::less<std::size_t>. So as you can see we define a function to call on each element before any comparaison. This projection feature is actually quite powerful since this function can be, as here, a lambda or it can even be a method, or a member value. For instance:

struct Item {
    float price;
    float weight;
    float efficiency() const { return price / weight; }
};

int main() {
    std::vector<Item> items{{7, 9}, {3, 4}, {5, 3}, {9, 7}};
    std::ranges::sort(items, std::greater<>(), &Item::efficiency);
    // now items are sorted by their efficiency in decreasing order:
    // items = {{5, 3}, {9, 7}, {7, 9}, {3, 4}}
}

If we wanted to sort by increasing price:

std::ranges::sort(items, {}, &Item::price);

Don't define operator< or use lambdas, use a projection!

answered Oct 05 '22 08:10

Arthur R.

Related questions
                            
                                round() for float in C++
                            
                                Is "delete this" allowed in C++?
                            
                                Count character occurrences in a string in C++
                            
                                C++ "virtual" keyword for functions in derived classes. Is it necessary?
                            
                                The static keyword and its various uses in C++
                            
                                Why should I prefer to use member initialization lists?
                            
                                (-2147483648> 0) returns true in C++?
                            
                                g++ undefined reference to typeinfo
                            
                                What is the best open XML parser for C++? [duplicate]
                            
                                Significance of ios_base::sync_with_stdio(false); cin.tie(NULL);
                            
                                Why does volatile exist?
                            
                                What is the difference between "long", "long long", "long int", and "long long int" in C++?
                            
                                Can lambda functions be templated?
                            
                                Is the 'override' keyword just a check for a overridden virtual method?
                            
                                What does "default" mean after a class' function declaration?
                            
                                Function passed as template argument
                            
                                Remove last character from C++ string
                            
                                Why can I not push_back a unique_ptr into a vector?
                            
                                What is "rvalue reference for *this"?
                            
                                What are forward declarations in C++?

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

C++ sorting and keeping track of indexes

Tags:

c++

sorting

indexing

stl

Mingus

People also ask

13 Answers

Lukasz Wiklendt

RAC

MysticForce

Aditya Aswal

hkyi

behzad.nouri

aafulei

Ulrich Eckhardt

sigvaldm

uchar

Macmep

HyLian

Arthur R.

Recent Activity

Donate For Us