Can someone come up with a clean (and fast) solution to the following problem? Given a sequence of

struct Value {
    int index = 0;
    int cost = 0;
};

each index should be contained only once in the sequence, and the cost of each duplicate index should be accumulated. The basic solution I came up with sorts the sequence; when equal entries are detected in the BinaryPredicate passed to std::sort, the cost is summed into the lhs and the cost of rhs is set to 0. Then follows a remove_if which removes the 0-cost values. See here for an example:
#include <cstdlib>
#include <vector>
#include <algorithm>
#include <iostream>
struct Value
{
    int index = 0;
    int cost = 0;
};

// generate a bunch of random values in a vector
// values will have indices and costs in the range [0..9]
std::vector<Value> generator()
{
    std::vector<Value> v(20);
    std::generate(v.begin(), v.end(), []() { return Value{std::rand() % 10, std::rand() % 10}; });
    return v;
}

void print(const std::vector<Value> &values)
{
    for (auto v : values)
        std::cout << "{i=" << v.index << ", c=" << v.cost << "}, ";
    std::cout << "\n";
}

void merge(std::vector<Value> &values)
{
    // sort values and merge costs
    std::sort(values.begin(), values.end(), [](auto &lhs, auto &rhs) {
        if (lhs.index == rhs.index) {
            lhs.cost += rhs.cost;
            rhs.cost = 0;
        }
        return lhs.index < rhs.index;
    });
    // remove entries with empty cost
    auto it = std::remove_if(values.begin(), values.end(), [](const auto &v) { return v.cost == 0; });
    values.erase(it, values.end());
}

int main()
{
    auto v = generator();
    std::cout << "generated values: ";
    print(v);
    merge(v);
    std::cout << "merged values: ";
    print(v);
}
Live on Compiler Explorer
Thing is: while the example above produces the correct results, it is, from what I can tell, not conforming to the C++ standard. A BinaryPredicate "shall not apply any non-constant function through the dereferenced iterators" (http://eel.is/c++draft/algorithms.requirements#8.sentence-4), and Compare is a BinaryPredicate (http://eel.is/c++draft/alg.sorting#general-2.sentence-1). Since my comparator modifies both lhs and rhs through the dereferenced iterators, it violates this requirement.
Does this mean that my only option is to roll a custom inplace_unique_reduce or similar, or is there maybe an alternative elegant approach to this problem? I would prefer not having to write my own non-trivial algorithm for this.
Thanks
Assuming you are ok with additional allocations, I would use std::map (or std::unordered_map):
auto merge_entries(std::vector<Value>& original_values) {
    auto values = std::map<int, int>();
    for (const auto [index, cost] : original_values) {
        values[index] += cost;
    }

    const auto end_of_merged_values = std::transform(
        values.cbegin(), values.cend(), original_values.begin(),
        [](const auto entry) {
            return Value{entry.first, entry.second};
        }
    );
    original_values.erase(end_of_merged_values, original_values.end());
}
Apart from one for() loop (which could be substituted with std::for_each, although in my opinion such a change would introduce unnecessary boilerplate, resulting in harder-to-read code; see the sketch below), this solution uses only the STL.
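For illustration, here is roughly what that std::for_each replacement for the loop inside merge_entries might look like, using the same names as above:

// equivalent to the range-for above: accumulate each cost into the map
std::for_each(original_values.cbegin(), original_values.cend(),
              [&values](const Value& v) { values[v.index] += v.cost; });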
We first merge all the entries using the map and then overwrite some elements so that our original std::vector holds the merged entries. What's super convenient is the fact that std::transform returns an iterator pointing to the end of the inserted range. Why is that beneficial for us? Because, apart from the unlikely scenario where no merging occurs, we end up with fewer elements than were originally passed in. Using that iterator we can erase the rest of the vector (the non-overwritten elements), keeping everything in a clean, STL-like style.
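For completeness, a minimal usage sketch (the sample values are made up for illustration; this assumes the Value struct and merge_entries from above are in scope, plus <vector>, <map>, <algorithm> and <iostream>):

int main() {
    auto values = std::vector<Value>{{1, 1}, {2, 2}, {2, 7}, {0, 5}, {1, 2}};
    merge_entries(values);
    for (const auto [index, cost] : values) {
        std::cout << index << ' ' << cost << '\n';
    }
    // prints, in key order because std::map iterates sorted:
    // 0 5
    // 1 3
    // 2 9
}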
Assuming you are not ok with additional allocations, but you are ok with strengthening your iterator requirements (to bidirectional), I would use std::partial_sum and std::unique:
template <class BiDirIt, class BinaryPredicateCompare, class BinaryOpReduce>
auto inplace_unique_reduce(
    BiDirIt first, BiDirIt last,
    BinaryPredicateCompare cmp,
    BinaryOpReduce reduce
) {
    std::partial_sum(
        std::make_reverse_iterator(last), std::make_reverse_iterator(first),
        std::make_reverse_iterator(last),
        [cmp, reduce](auto acc, const auto& elem) {
            if (cmp(acc, elem)) {
                return reduce(acc, elem);
            } else {
                acc = elem;
            }
            return acc;
        }
    );
    return std::unique(first, last, cmp);
}
used like so:
auto values = std::vector<Value>{
    {1, 1}, {2, 2}, {2, 7}, {0, 5},
    {3, 3}, {1, 2}, {3, 10}
};

auto comparator = [](const auto& lhs, const auto& rhs) {
    return lhs.index == rhs.index;
};
auto reducer = [](const auto& lhs, const auto& rhs) {
    return Value{lhs.index, lhs.cost + rhs.cost};
};

auto to_remove = inplace_unique_reduce(
    values.begin(), values.end(),
    comparator,
    reducer
);
values.erase(to_remove, values.end());

for (const auto [index, cost] : values) {
    std::cout << index << ' ' << cost << '\n';
}
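If I traced it correctly, this prints:

1 1
2 9
0 5
3 3
1 2
3 10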
Just like your original answer, this will not merge non-adjacent elements (note the two index-1 entries and the two index-3 entries); to do that, you either have to sort them by index first or use something like the map from the first part of my answer.
The std::make_reverse_iterator calls are necessary because std::partial_sum accumulates the merged element into the rightmost element of a given group of consecutive, equivalent elements, while std::unique preserves only the first element of such a group. Because of this, you want to merge the elements in the reverse of the order in which you will be std::unique-ing them.
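To make this concrete, here is (to the best of my tracing) what the sample vector from above looks like at each stage:

// forward order, before the call:
// {1,1} {2,2} {2,7} {0,5} {3,3} {1,2} {3,10}
// after the reversed std::partial_sum pass, the first element of each
// run of equal indices carries the accumulated cost:
// {1,1} {2,9} {2,7} {0,5} {3,3} {1,2} {3,10}
// after std::unique, only that first element of each run survives:
// {1,1} {2,9} {0,5} {3,3} {1,2} {3,10}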
You raised some concerns about situations where copying or moving is expensive. In such cases you are either left with a custom solution that takes your unique constraints into consideration, or you ease those constraints. Here we move-assign merged entries, but that's it for the potential bottlenecks. If your move assignment operator is expensive, I fear no standard solution will work for you and you will have to roll your own, like in your answer.