Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

C++ range-v3 library: 'take'-ing first 3 perfect numbers works and halts; 'take'-ing first 4 doesn't stop after 4

As I understand it, the range-v3 library's view operations (requires C++17 currently, but to become an official part of the STL in C++20) provides chainable STL-like algorithms that are lazily evaluated. As an experiment, I created the following code to evaluate the first 4 perfect numbers:

#include <iostream>
#include <range/v3/all.hpp>

using namespace std;
int main(int argc, char *argv[]) {
    auto perfects = ranges::view::ints(1)
                    | ranges::view::filter([] (int x) {
                        int psum = 0;
                        for (int y = 1; y < x; ++y) {
                            if (x % y == 0) psum += y;
                        }
                        return x == psum;})
                    | ranges::view::take(3);
    std::cout << "PERFECT NUMBERS:" << std::endl;
    for (int z : perfects) {
        std::cout << z << std::endl;
    } 
    std::cout << "DONE." << std::endl;
}

The code starts with a possible infinite range of numbers (ranges::view::ints(1)), but because the view algorithm ends with ranges::view::take(3) it should halt after finding the first three numbers passing the filter algorithm (a brute-force algorithm to filter out perfect numbers, intentionally not that efficient). Since the first three perfect numbers --- 6, 28, and 496 --- are fairly small, I expect this code to quickly find these, print "DONE." and terminate. And that's exactly what happens:

coliru -- taking 3 perfect numbers works just fine

However, let's say I want to print the first 4 perfect numbers, which are still all fairly small --- 6, 28, 496, and 8128. After printing 8128 the program does not halt and eventually has to be terminated; presumably it is vainly trying to compute the fifth perfect number, 33550336, which is beyond the ability of this brute-force algorithm to efficiently find.

coliru -- taking 4 perfect numbers tries to take 5+

This seems inconsistent to me. I would have understood if both tests had failed (concluding that I had misunderstood the lazy evaluation of range-v3's view algorithms), but the fact that take(3) succeeds and halts while take(4) does not seems like a bug to me, unless I'm misunderstanding things.

I've tried this with several compilers on wandbox and it seems to be persistent (tried clang 6.0.1 and 7.0.0, g++ 8.1.0 and 8.2.0). At least on my local computer, where I found the issue originally, version 0.3.6 of range-v3 is being used, but I'm not sure about coliru and wandbox.

wandbox link

like image 526
jwimberley Avatar asked Nov 16 '18 16:11

jwimberley


2 Answers

A take view that contains n elements has n + 1 valid iterator values: n that correspond to elements in the range, and the n + 1st past-the-end iterator. It is intended that iterating over the take view necessarily forms each of those n + 1 iterators - indeed, it's useful to extract the underlying iterator value adapted by the take view's end iterator to perform additional computations.

take_view doesn't know that the range it's adapting is a filter, or that your filter predicate is inordinately expensive - it simply assumes that your predicate is O(1) as is necessary for it to provide O(1) iterator operations. (Although we did forget to make that complexity requirement explicit in C++20.) This case is a very good example of why we have complexity requirements: if the iterators of the range being adapted don't meet the Standard's O(1) complexity requirements, the view can't meet its complexity guarantees and reasoning about performance becomes impossible.

like image 57
Casey Avatar answered Nov 19 '22 08:11

Casey


Apology:

I'm (partly) answering my own question because I think I've learned what is going on here, mechanically, and because the extra detail won't fit into a comment. I'm not sure of the etiquette, so if this would be better as an edit to the question --- there's still the open question of why the library is designed this way --- please suggest that in the comments I'll happily move it there.

Filtering until finding an end iterator

I don't understand the internals of range-v3 in great detail so I might not have terminology exactly right. In short, there is no inconsistent behavior here. When a call to ranges::view::take follows a call to ranges::view::filter (or ranges::view::remove_if), the resulting view object must set an end iterator at some point during the iteration to break out of the for-loop. If I'd thought about it, I would have imagined that the ranged-based for loop still expands to something like

for (auto it = std::begin(perfects); it != std::end(perfects); ++it) {
    ...
}

(which, btw, behaves identically in my examples) and that after it has found the required number of elements, at the beginning of the subsequent operator++ call on it, there would hbe special logic to make the result equal to std::end(perfects), so that the loop exits without doing any additional work. But instead, and this makes some sense from an implementation standpoint, the end iterator actually corresponds to the next element returned by the filter/remove_if view. The filter predicate keeps on looping over ranges::view::ints(1) until it finds one for which the predicate returns true; presumably this becomes the end iterator, since it is not printed in the ranged for loop.

An easy demonstration of this is provided by the following code. Here, there are two configurable integers n and m, and the predicate function in filter returns true for x <= n, false for n < x < n+m, and true for x >= m:

#include <iostream>
#include <range/v3/all.hpp>

using namespace std;
int main(int,char**) {
    int n = 5;
    int m = 3;
    auto perfects = ranges::view::ints(1)
                    | ranges::view::filter([&n,&m] (int x) {
                        std::cout << "Checking " << x << "... ";
                        if (x <= n) {
                            return true;
                        } else if (x <= n + m) {
                            std::cout << std::endl;
                            return false;
                        }
                        return true;})
                    | ranges::view::take(n);
    std::cout << "First " << n << " numbers:" << std::endl;
    for (int z : perfects) {
        std::cout << " take it!" << std::endl;
    }
    std::cout << "DONE." << std::endl;
}

You can run this code for different values of n and m here: wandbox. The default output is as follows:

First 5 numbers:
Checking 1...  take it!
Checking 2...  take it!
Checking 3...  take it!
Checking 4...  take it!
Checking 5...  take it!
Checking 6... 
Checking 7... 
Checking 8... 
Checking 9... DONE.

(I didn't rename the variable perfects; clearly it is not a set of perfect numbers anymore). Even after taking the first n successes, the lambda predicate is called until it returns true. Since the integer that returns true, 9, is not printed, it must be the std::end(perfects) that breaks the ranged for-loop.

The remaining mystery to me is why it does this. It's not what I would have expected; it could lead to unexpected behavior (e.g. if lambda function body isn't pure and alters captured objects) and it could have big performance implications, as demonstrated by the original example, which would have to perform roughly 10^15 modulo operations before reaching the integer 33550336.

like image 5
jwimberley Avatar answered Nov 19 '22 09:11

jwimberley