Why does C++ ranges "transform -> filter" call transform twice for values that match the filter's predicate?

Tags: c++, c++20

Consider the following code, which uses the ranges library (from C++20):

#include <iostream>
#include <ranges>
#include <vector>

int main() {
    std::vector<int> inputs{1, 2, 3, 4, 5, 6};

    // Logs each input so we can see when the ranges library invokes the transform.
    auto square_it = [](auto i) {
        std::cout << i << std::endl;
        return i * i;
    };

    auto results = inputs
        | std::views::transform(square_it)
        | std::views::filter([](auto i) { return i % 3 == 0; });

    for(auto r : results) {
        // std::cout << r << std::endl;
    }
}

The std::cout statement in square_it logs each call that the ranges library makes to it. This code prints:

1
2
3
3
4
5
6
6

The question is: why are values that match the filter's predicate printed twice?

I saw this code in a presentation at CppCon 2020, where the presenter explained why this happens. According to him, filter iterates until its predicate is satisfied (and of course it needs to call transform each time). Then filter stops and is ready to be iterated on. After that, the actual iteration starts and a value is read from filter, which then calls transform a second time for the same input.

It is not clear to me why this is necessary. Since ranges::views compute values lazily and every view operation pulls data from the one before it, why can't filter just pass the value to whoever is after it in the pipeline as soon as it finds a match?

darcamo asked Oct 04 '20 20:10



1 Answer

why can't filter just pass the value to whoever is after it in the pipeline as soon as it finds a match?

Because in the iterator model, positioning and accessing are distinct operations. You position an iterator with ++; you access an iterator with *. These are two distinct expressions, which are evaluated at two distinct times, resulting in two distinct function calls that yield two distinct values (++ yields an iterator, * yields a reference).

A filtering iterator, in order to perform its iteration operation, must access the values of its underlying iterator. But that access cannot be communicated to the caller of ++ because that caller only asked to position the iterator, not to get its value. The result of positioning an iterator is a new iterator value, not the value stored in that iterated position.

So there's nobody to return it to.
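To make the split between ++ and * concrete, here is a simplified, hand-rolled model of a filtering iterator over a transformed sequence. It is only a sketch: FilteredTransform, trace_calls and the log vector are hypothetical names, and the real std::ranges::filter_view is more general. The transform records every input it is given, standing in for the question's logging lambda.

```cpp
#include <cstddef>
#include <vector>

// Simplified model with hypothetical names; not the real std::ranges::filter_view.
class FilteredTransform {
public:
    FilteredTransform(const std::vector<int>& inputs, std::vector<int>& log)
        : inputs_(inputs), log_(log) { skip_to_match(); }

    bool at_end() const { return pos_ == inputs_.size(); }

    // Access (*): the value computed while positioning was not kept,
    // so the transform has to run again here.
    int operator*() const { return transform(inputs_[pos_]); }

    // Positioning (++): yields the iterator itself, never the element.
    FilteredTransform& operator++() {
        ++pos_;
        skip_to_match();
        return *this;
    }

private:
    // Stand-in for the logging transform: records each input it sees.
    int transform(int i) const { log_.push_back(i); return i * i; }

    static bool predicate(int v) { return v % 3 == 0; }

    // Must evaluate the transform to test the predicate, then discards the result.
    void skip_to_match() {
        while (pos_ != inputs_.size() && !predicate(transform(inputs_[pos_]))) ++pos_;
    }

    const std::vector<int>& inputs_;
    std::vector<int>& log_;
    std::size_t pos_ = 0;
};

// Iterates the whole sequence and returns the inputs passed to the transform.
std::vector<int> trace_calls(const std::vector<int>& inputs) {
    std::vector<int> log;
    for (FilteredTransform it(inputs, log); !it.at_end(); ++it) {
        int r = *it;
        (void)r;
    }
    return log;
}
```

Calling trace_calls({1, 2, 3, 4, 5, 6}) yields the log 1, 2, 3, 3, 4, 5, 6, 6, the same sequence the question reports: skip_to_match has no caller to hand the matching value to, so operator* recomputes it.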

You can't really delay positioning until after accessing because a user might reposition the iterator multiple times. I mean, you could implement it that way in theory by storing the number of such increments/decrements. But this increases the complexity of the iterator's implementation. Especially since resolving such delayed positioning can happen through something as simple as testing against another iterator or sentinel, which is supposed to be an O(1) operation.

This is simply a limitation of the model of iterators as having both position and value. The iterator model was designed as an abstraction of pointers, where iteration and access are distinct operations, so it inherited this mechanism. Alternative models exist where iteration and access are bundled together, but they're not how standard library iteration works.
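For contrast, one such alternative model can be sketched with a single next() operation that advances and yields in the same call. This is an illustration under assumed names (PullFilteredTransform, trace_pull_calls), not a standard library facility; it needs C++17 for std::optional.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical pull-style source (not a standard library type): next() advances
// and yields in a single call, so a matching value can be handed straight back.
class PullFilteredTransform {
public:
    PullFilteredTransform(const std::vector<int>& inputs, std::vector<int>& log)
        : inputs_(inputs), log_(log) {}

    // Returns the next transformed value that passes the predicate,
    // or std::nullopt once the inputs are exhausted.
    std::optional<int> next() {
        while (pos_ != inputs_.size()) {
            const int value = transform(inputs_[pos_++]);
            if (value % 3 == 0) return value;
        }
        return std::nullopt;
    }

private:
    // Records each input it sees, like the logging transform in the question.
    int transform(int i) { log_.push_back(i); return i * i; }

    const std::vector<int>& inputs_;
    std::vector<int>& log_;
    std::size_t pos_ = 0;
};

// Drains the source and returns the inputs passed to the transform.
std::vector<int> trace_pull_calls(const std::vector<int>& inputs) {
    std::vector<int> log;
    PullFilteredTransform source(inputs, log);
    while (auto value = source.next()) {
        (void)value;
    }
    return log;
}
```

Here trace_pull_calls({1, 2, 3, 4, 5, 6}) logs each input exactly once (1 through 6). The trade-off is that such a source is single-pass and offers no separate position to compare, copy, or dereference repeatedly, which is what the iterator model provides.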

Nicol Bolas answered Oct 06 '22 01:10