After much pain and misery, I've tracked down some very odd behaviour where std::distance
never returns when given a range of boost::filter_iterator
s over a std::deque
. It appears the problem is unique to GCC (6.1+) with -O3
optimisations. Here is an example demonstrating the offending behaviour:
#include <string>
#include <deque>
#include <iterator>
#include <iostream>
#include <boost/iterator/filter_iterator.hpp>
struct Foo
{
std::string bar, s = "";
char a = '\0';
};
int main()
{
const std::deque<Foo> foos(14, {""});
const std::string test {};
const auto p = [test] (const auto& foo) { return foo.bar == test; };
using boost::make_filter_iterator;
const auto begin = make_filter_iterator(p, std::cbegin(foos), std::cend(foos));
const auto end = make_filter_iterator(p, std::cend(foos), std::cend(foos));
std::cout << std::distance(begin, end) << std::endl;
}
Some observations:
-O2
or less returns as expected.std::deque
to std::vector
or std::list
results in expected behaviour.14
is critical; anything less and the problem disappears.sizeof(Foo)
is important; removing s
or a
makes the problem go away.test
by reference, or just comparing to a constant expression (e.g. foo.bar == " "
) results in normal behaviour. -Wall -Wextra -pedantic
).fsanitize=undefined
and the problem goes away.What's going on?
I think that these findings below may be useful for both improve the bug report and also for using in your code to workaround the problem.
By debugging the optimized output and playing with optimization flags and with minor code changes I've reached the conclusion about the specific optimization flags that are causing the error.
The set of options are:
-O -fno-auto-inc-dec -fno-branch-count-reg -fno-combine-stack-adjustments -fno-compare-elim -fno-cprop-registers -fno-dce -fno-defer-pop -fno-delayed-branch -fno-dse -fno-forward-propagate -fno-guess-branch-probability -fno-if-conversion2 -fno-if-conversion -fno-inline-functions-called-once -fno-ipa-pure-const -fno-ipa-profile -fno-ipa-reference -fno-merge-constants -fno-move-loop-invariants -fno-reorder-blocks -fno-shrink-wrap -fno-split-wide-types -fno-ssa-backprop -fno-ssa-phiopt -fno-tree-bit-ccp -fno-tree-ccp -fno-tree-ch -fno-tree-coalesce-vars -fno-tree-phiprop -fno-tree-sink -fno-tree-slsr -fno-tree-dse -fno-tree-forwprop -fno-tree-fre -fno-unit-at-a-time -fno-tree-ter -fno-tree-sra -fno-tree-copy-prop -fstrict-aliasing -ftree-slp-vectorize -std=c++14
Sorry for that long set but what I actually wanted was something like: -O0 -ftree-copy-prop -ftree-pta -ftree-dce -fstrict-aliasing -ftree-slp-vectorize
(I've tried with -Og also) plus the magic steps of O1...
Note that just -O3 -f-no-tree-slp-vectorize
will already fix the behavior, but by using the full options I've sent the debugging is almost easy...
Furthermore, it looks like the existence of the operator ==(string, string)
is generating the confusion in the compiler.
If you examine the code pasted bellow where all the commented by #if 0 code, when activated works in the place of the original code, you may find the problem where I didn't.
Note that the ==()
operator is not even called because the foo.a != '\0'
is always true in the test. Therefore it looks like it's existence is making the compiler to generate the bad code.
Note also that any of the commented code inside the loop also changing the behavior to the expected one and that's why I suspected of the vectorization flags for starters.
#include <string>
#include <deque>
#include <iterator>
#include <iostream>
#include <boost/iterator/filter_iterator.hpp>
#include <string.h>
struct Foo
{
std::string bar, s = "";
char a = 'n';
};
std::ostream& operator<<(std::ostream& os, const Foo& f)
{
os << f.bar << '/' << f.a;
return os;
}
int main()
{
std::deque<Foo> foos(14, {"abc"});
const std::string test {"abc"};
Foo other;
other.bar = "last"; other.a = 'l';
foos.push_back(other);
other.bar = "first";
other.a = 'f';
foos.push_front(other);
//
#if 0
const auto p = [test] (const auto& foo) { return foo.a != '\0'; };
#elif 0
const auto p = [test] (const auto& foo) {
bool rc = (foo.a != '\0');
if (!rc)
rc = (foo.bar == std::string(test));
return rc;
};
#elif 1
const auto p = [test] (const auto& foo) {
bool rc = (foo.a != '\0');
if (!rc)
rc = (foo.bar == test);
return rc;
};
#endif
using boost::make_filter_iterator;
const auto begin = make_filter_iterator(p, std::cbegin(foos), std::cend(foos));
const auto end = make_filter_iterator(p, std::cend(foos), std::cend(foos));
std::cout << std::distance(end, end) << std::endl;
std::cout << std::distance(begin, begin) << std::endl;
std::cout << std::distance(std::cbegin(foos), std::cend(foos)) << std::endl;
auto __first = begin;
auto __last = end;
int __n = 0;
//std::cout << __last << std::endl;
//std::deque<char> trace;
//Foo trace[21];
const int max = foos.size();
char trace[max+5]; memset(trace, 'c', sizeof(trace));
std::cout << max << std::endl;
std::cout << *__last << std::endl;
while (__first != __last)
{
trace[__n] = (*__first).a;
//trace[__n] = (*__first);
//trace.push_back((*__first).a);
//std::cout << *__first << std::endl;
++__n;
++__first;
if (__n > max + 5)
break;
//std::cout << __n << std::endl;
//std::cout << (__first != __last) << std::endl;
}
for (auto f: trace)
std::cout << f << std::endl;
std::cout << "Tadaaaaa: " << __n << std::endl;
//std::cout << std::distance(begin, end) << std::endl;
}
This behaviour was due to a GCC bug caused by bad vectorisation optimisation. A fix has now been issued, which should appear in GCC 6.3.
For those stuck with GCC 5.4 - 6.2, the compiler option -fno-tree-slp-vectorize
will 'fix' the issue.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With