Why does GCC -O3 cause infinite std::distance with filter iterators over a std::deque?

Question

After much pain and misery, I've tracked down some very odd behaviour where std::distance never returns when given a range of boost::filter_iterators over a std::deque. It appears the problem is unique to GCC (6.1+) with -O3 optimisations. Here is an example demonstrating the offending behaviour:

#include <string>
#include <deque>
#include <iterator>
#include <iostream>

#include <boost/iterator/filter_iterator.hpp>

struct Foo
{
    std::string bar, s = "";
    char a = '\0';
};

int main()
{
    const std::deque<Foo> foos(14, {""});
    const std::string test {};
    const auto p = [test] (const auto& foo) { return foo.bar == test; };
    using boost::make_filter_iterator;
    const auto begin = make_filter_iterator(p, std::cbegin(foos), std::cend(foos));
    const auto end   = make_filter_iterator(p, std::cend(foos), std::cend(foos));
    std::cout << std::distance(begin, end) << std::endl;
}

Some observations:

GCC with optimisations -O2 or less returns as expected.
Clang (3.8) returns the correct answer with any optimisation level.
Changing std::deque to std::vector or std::list results in expected behaviour.
The 14 is critical; anything less and the problem disappears.
The sizeof(Foo) is important; removing s or a makes the problem go away.
Capturing test by reference, or just comparing to a constant expression (e.g. foo.bar == " ") results in normal behaviour.
There are no compiler warnings (with -Wall -Wextra -pedantic).
Valgrind reports no errors.
Use fsanitize=undefined and the problem goes away.

What's going on?

João Amaral · Accepted Answer

I think that these findings below may be useful for both improve the bug report and also for using in your code to workaround the problem.

By debugging the optimized output and playing with optimization flags and with minor code changes I've reached the conclusion about the specific optimization flags that are causing the error.

The set of options are:

-O -fno-auto-inc-dec -fno-branch-count-reg -fno-combine-stack-adjustments -fno-compare-elim -fno-cprop-registers -fno-dce -fno-defer-pop -fno-delayed-branch -fno-dse -fno-forward-propagate -fno-guess-branch-probability -fno-if-conversion2 -fno-if-conversion -fno-inline-functions-called-once -fno-ipa-pure-const -fno-ipa-profile -fno-ipa-reference -fno-merge-constants -fno-move-loop-invariants -fno-reorder-blocks -fno-shrink-wrap -fno-split-wide-types -fno-ssa-backprop -fno-ssa-phiopt -fno-tree-bit-ccp -fno-tree-ccp -fno-tree-ch -fno-tree-coalesce-vars -fno-tree-phiprop -fno-tree-sink -fno-tree-slsr -fno-tree-dse -fno-tree-forwprop -fno-tree-fre -fno-unit-at-a-time -fno-tree-ter -fno-tree-sra -fno-tree-copy-prop -fstrict-aliasing -ftree-slp-vectorize -std=c++14

Sorry for that long set but what I actually wanted was something like: -O0 -ftree-copy-prop -ftree-pta -ftree-dce -fstrict-aliasing -ftree-slp-vectorize (I've tried with -Og also) plus the magic steps of O1...

Note that just -O3 -f-no-tree-slp-vectorize will already fix the behavior, but by using the full options I've sent the debugging is almost easy...

Furthermore, it looks like the existence of the operator ==(string, string) is generating the confusion in the compiler.

If you examine the code pasted bellow where all the commented by #if 0 code, when activated works in the place of the original code, you may find the problem where I didn't.

Note that the ==() operator is not even called because the foo.a != '\0' is always true in the test. Therefore it looks like it's existence is making the compiler to generate the bad code.

Note also that any of the commented code inside the loop also changing the behavior to the expected one and that's why I suspected of the vectorization flags for starters.

#include <string>
#include <deque>
#include <iterator>
#include <iostream>

#include <boost/iterator/filter_iterator.hpp>
#include <string.h>

struct Foo
{
    std::string bar, s = "";
    char a = 'n';
};

std::ostream& operator<<(std::ostream& os, const Foo& f)
{
    os << f.bar << '/' << f.a;
    return os;
}

int main()
{
    std::deque<Foo> foos(14, {"abc"});
    const std::string test {"abc"};
    Foo other;
    other.bar = "last"; other.a = 'l';
    foos.push_back(other);
    other.bar = "first";
    other.a = 'f';
    foos.push_front(other);
    //
#if 0
    const auto p = [test] (const auto& foo) { return foo.a != '\0'; };
#elif 0
    const auto p = [test] (const auto& foo) {
        bool  rc =  (foo.a != '\0');
        if (!rc)
            rc = (foo.bar == std::string(test));
        return rc;
    };
#elif 1
    const auto p = [test] (const auto& foo) {
        bool  rc =  (foo.a != '\0');
        if (!rc)
            rc = (foo.bar == test);
        return rc;
    };
#endif
    using boost::make_filter_iterator;
    const auto begin = make_filter_iterator(p, std::cbegin(foos), std::cend(foos));
    const auto end   = make_filter_iterator(p, std::cend(foos), std::cend(foos));
    std::cout << std::distance(end, end) << std::endl;
    std::cout << std::distance(begin, begin) << std::endl;
    std::cout << std::distance(std::cbegin(foos), std::cend(foos)) << std::endl;

    auto __first = begin;
    auto __last = end;

    int __n = 0;
    //std::cout << __last << std::endl;
    //std::deque<char> trace;
    //Foo trace[21];
    const int max = foos.size();
    char trace[max+5]; memset(trace, 'c', sizeof(trace));

    std::cout << max << std::endl;
    std::cout << *__last << std::endl;

    while (__first != __last)
    {
        trace[__n] = (*__first).a;
        //trace[__n] = (*__first);
        //trace.push_back((*__first).a);
        //std::cout << *__first << std::endl;
        ++__n;
        ++__first;
        if (__n > max + 5)
            break;
        //std::cout << __n << std::endl;
        //std::cout << (__first != __last) << std::endl;
    }

    for (auto f: trace)
        std::cout << f  << std::endl;
    std::cout << "Tadaaaaa: " <<  __n << std::endl;

    //std::cout << std::distance(begin, end) << std::endl;

}

Daniel · Answer

This behaviour was due to a GCC bug caused by bad vectorisation optimisation. A fix has now been issued, which should appear in GCC 6.3.

For those stuck with GCC 5.4 - 6.2, the compiler option -fno-tree-slp-vectorize will 'fix' the issue.

Why does GCC -O3 cause infinite std::distance with filter iterators over a std::deque?

Tags:

c++

gcc

c++11

deque

filter-iterator

Daniel

2 Answers

João Amaral

Daniel

Recent Activity

Donate For Us

Why does GCC -O3 cause infinite std::distance with filter iterators over a std::deque?

Tags:

c++

gcc

c++11

deque

filter-iterator

Daniel

2 Answers

João Amaral

Daniel

Related questions

Recent Activity

Donate For Us