in-place coalescing like elements in a std::vector

Question

I have an array of pairs, for example:

X = {{A, 1}, {B, 2}, {C, 1}, {A, 3}, {C, 4}}

I would like to produce an array:

Y = (x, n) such that n = sum i for (x, i) in X

so in the example above, we'd have:

Y = {{A, 4}, {B, 2}, {C, 5}}

The code I currently have is:

#include <vector>
#include <algorithm>
#include <iostream>

using namespace std;

int main() {

    char A = 'A';
    char B = 'B';
    char C = 'C'; 

    vector< pair<char, int> > X = {{A, 1}, {B, 2}, {C, 1}, {A, 3}, {C, 4}};

    // Sort by first element of the pair
    sort(begin(X), end(X), [](auto a, auto b) { return a.first < b.first; });

    // Could this be better? Is there an existing STL algorithm that will
    // do this in-place?
    vector< pair<char, int> > Y;
    for(auto p : X) {
        if(Y.empty() || Y.back().first != p.first) {
            Y.push_back(p);
        } else {
            Y.back().second += p.second;
        }
    }

    cout << "Y:";
    for (auto p : Y) {
       cout << '{' << p.first << ' ' << p.second << '}';
    }
    cout << '
';

}

Could this code be made more succinct? (Without changing the type of the underlying container)

I'm looking to try to eliminate the raw loop, by replacing with an algorithm in the standard library, but I don't see one that would quite fit.

I'd want some variant of std::unique which takes not only a predicate for whether two elements are equivalent, but also a function which defines how to combine them. It might look like:

coalesce(begin(X), end(X), [](auto a, auto b){ return a.first == b.first; }, [](auto a, auto b) { return {a.first, a.second+b.second} });

FWIW, here's an implementation of coalesce which seems to work:

template<class ForwardIt, class BinaryPredicate, class BinaryFunction>
ForwardIt coalesce(ForwardIt first, ForwardIt last, BinaryPredicate p, BinaryFunction f)
{
    if (first == last)
        return last;

    ForwardIt result = first;
    while (++first != last) {
        if(p(*result, *first)) {
            *result = f(*result, *first);
        } else {
            ++result;
            *result = *first;
        }
    }
    return ++result;
}

And the code becomes:

    vector< pair<char, int> > X = {{A, 1}, {B, 2}, {C, 1}, {A, 3}, {C, 4}};

    // Sort by first element of the pair
    sort(begin(X), end(X), [](auto a, auto b) { return a.first < b.first; });

    // Easier to understand the intent!
    auto e = coalesce(begin(X), end(X),
                      [](auto a, auto b) { return a.first == b.first; },
                      [](auto a, auto b) { return pair<char, int>{a.first, a.second+b.second}; });

    for_each(begin(X), e, [](auto p) {
        cout << '{' << p.first << ' ' << p.second << '}';
    });
    cout << '
';

NOTE: I'm quite familiar with map, etc, and don't want to use it.

AndyG · Accepted Answer

Hm, an approach that uses no other containers, with no raw loops (or std::for_each) might combine std::sort with std::partial_sum

std::partial_sum is for computing prefix sums, or rather a generic way to combine adjacent elements. After our initial sort, we can use std::partial_sum to combine elements with the same key:

std::vector< std::pair<char, int> > Y;
std::vector< std::pair<char, int> > Y(X.size());
std::partial_sum(X.begin(), X.end(),  Y.rbegin(), [](const auto& lhs, const auto& rhs)
{
    if (lhs.first != rhs.first)
        return rhs;
    return std::make_pair(lhs.first, lhs.second + rhs.second);
});

Notice that we iterate backwards in Y. This is intentional for the next step, which I'll elaborate shortly.

This gets us part of the way there. Now we have a Y that looks like this:

Y:{C 5}{C 1}{B 2}{A 4}{A 1}

Now our task is to remove the duplicates, which we can do with std::unique:

Y.erase(std::unique(Y.begin(), Y.end(), 
   [](const auto& lhs, const auto& rhs){
      return lhs.first == rhs.first;}), Y.end());

We needed to use partial_sum over the reversed range, because std::unique "Eliminates all but the first element from every consecutive group of equivalent elements", and we needed the final partial_sum to appear first.

The total algorithm is O(N log N) on account of the sorting. Memory usage is O(N).

in-place coalescing like elements in a std::vector

Tags:

c++

algorithm

std

Taylor

1 Answers

Demo

AndyG

Recent Activity

Donate For Us

in-place coalescing like elements in a std::vector

Tags:

c++

algorithm

std

Taylor

1 Answers

Demo

AndyG

Related questions

Recent Activity

Donate For Us