Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Shortest way to obtain a sublist of a sorted list of values lying in a certain interval

Tags:

c++

stl

Today I was asking myself what might be the shortest possible code to obtain all values in a sorted vector std::vector<double>, which are greater or equal to a and smaller or equal to b.

My first approach was something like the following:

#include <vector>
#include <algorithm>
#include <iterator>
#include <iostream>

// Returns all values in sortedValues being greater equal start and smaller equal end;
std::vector<double> cutValues(const std::vector<double>& sortedValues, double start, double end) {
    std::vector<double> ret;

    auto startIter=std::lower_bound(sortedValues.begin(), sortedValues.end(), start);
    auto stopIter = std::upper_bound(sortedValues.begin(), sortedValues.end(), end);
    std::copy(startIter, stopIter, std::back_inserter(ret));
    return ret;
}

int main(int argc, char **args) {
    {
        auto ret = cutValues({ 0.1,0.2,0.3 }, 0.1, 0.3);
        std::copy(ret.begin(), ret.end(), std::ostream_iterator<double>(std::cout, ","));
        std::cout << std::endl;
    }
    {
        auto ret = cutValues({ 0.12,0.2,0.31 }, 0.1, 0.3);
        std::copy(ret.begin(), ret.end(), std::ostream_iterator<double>(std::cout, ","));
        std::cout << std::endl;
    }
    {
        auto ret = cutValues({ 0.1,0.2,0.3 }, 0.2, 0.2);
        std::copy(ret.begin(), ret.end(), std::ostream_iterator<double>(std::cout, ","));
        std::cout << std::endl;
    }
}

My second thought was something very simple like the following:

std::vector<double> cutValues2(const std::vector<double>& sortedValues, double start, double end) {
    std::vector<double> ret;
    std::copy_if(sortedValues.begin(), sortedValues.end(), std::back_inserter(ret), [&start, &end](auto v) { return v >= start && v <= end; });
    return ret;
}

But considering the case of only removing small portions from a very large vector it might have some efficiency issues.

Now I'm asking myself, if there there are even better ways to do it?

like image 562
Aleph0 Avatar asked Aug 18 '17 07:08

Aleph0


3 Answers

A bit changed first version:

std::vector<double> cutValues(const std::vector<double>& sortedValues, double start, double end) {
    auto startIter = std::lower_bound(sortedValues.begin(), sortedValues.end(), start);
    auto stopIter = std::upper_bound(startIter, sortedValues.end(), end);
    return std::vector<double>(startIter, stopIter);
}
like image 55
DAle Avatar answered Nov 09 '22 16:11

DAle


Not shorter, but generally more efficient: use std::lower_bound to find the start of the "interesting" interval, and go on copying as long as the elements are less than your maximum value; that way you are locating quickly (O(log N)) the start even for big vectors and you don't waste time doing a binary search again looking for the end of the interval - you are going to find it anyway during the copy loop.

The possibly extra compares are virtually free anyway, as double compares themselves are extremely cheap, the branch is well predicted (it's always the same until the end of the copy) and depends on data already in cache; compare with an extra binary search, which "jumps around" in memory (worse cache locality) and has less predictable branches.

std::vector<double> cutValues(const std::vector<double>& sortedValues, double minv, double maxv) {
    std::vector<double> ret;
    auto end = sortedValues.end();
    for(auto it = std::lower_bound(sortedValues.begin(), end, minv);
        it != end && *it <= max; ++it) {
        ret.push_back(*it);
    }
    return ret;
}

This can be rewritten in a similar way to your cutValues2 using STL algorithms, but I'm not sure it's much of an improvement.

std::vector<double> cutValues(const std::vector<double>& sortedValues, double minv, double maxv) {
    std::vector<double> ret;
    std::copy_if(
        std::lower_bound(sortedValues.begin(), sortedValues.end(), minv),
        sortedValues.end(),
        std::back_inserter(ret),
        [=](double v) { return v <= maxv; });
    return ret;
}
like image 41
Matteo Italia Avatar answered Nov 09 '22 16:11

Matteo Italia


You could write your function STL-style and make it work with any container of any type (and it will be more elegant imho).

template <typename ForwardIt, typename T>
auto bounded_range(ForwardIt first, ForwardIt last, T lb, T ub) {
    return std::pair{std::lower_bound(first, last, lb), std::upper_bound(first, last, ub)};
}
std::vector<double> v1{0.1, 0.1, 0.2, 0.3, 0.3, 0.5, 0.7, 0.9};

auto [beg, end] = bounded_range(std::begin(v), std::end(v), 0.3, 0.5);
std::vector<double> v2{beg, end};

// Or using 'std::make_from_tuple'.
auto v3 = std::make_from_tuple<std::vector<double>>(bounded_range(std::begin(v), std::end(v), 0.3, 0.5));

Note: examples uses C++17 template argument deduction for class templates, and structured bindings.

like image 28
Felix Glas Avatar answered Nov 09 '22 15:11

Felix Glas