In the following code (gcc 10.2.1), the call to regex_match returns 'no match', which I believe is correct. sm.size() returns 0, but when iterating from sm.begin() to sm.end(), it finds 3 occurrences (all empty strings).
If this is correct, what do these 3 finds mean? But since size() == 0, shouldn't begin() == end()?
Edit: Based on comments, I added the ready flag to the output.
#include <cassert>
#include <iostream>
#include <regex>
#include <string>

int main()
{
    std::string input("4321");
    std::regex rg("^([0-9])");
    std::smatch sm;

    // regex_match must match the whole input, so this fails for "4321".
    bool found = std::regex_match(input, sm, rg);
    assert((sm.size() == 0) == sm.empty());

    std::cout << "ready: " << sm.ready() << ", found: " <<
        found << ", size: " << sm.size() << std::endl;

    // With size() == 0 this loop should not run at all.
    for (auto it = sm.begin(); it != sm.end(); ++it)
    {
        std::cout << "iterate '" << *it << "'\n";
    }
}
output:
ready: 1, found: 0, size: 0
iterate ''
iterate ''
iterate ''
In GCC's implementation of match_results, the prefix, suffix, and unmatched string are stored at the end of the sequence managed by the match_results object (which is implemented as a private std::vector base class). Those extra elements should not be visible when iterating from begin() to end(), but the end() function is returning the wrong position. It's returning an iterator to the end of the vector, after the three extra elements. It should be returning an iterator just before those, which would be equal to begin().
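To make the layout described above concrete, here is a minimal sketch of the same idea. It is a simplified model, not GCC's actual code: the class name MatchResultsModel, the helpers set_failed_match and set_match, and the member storage_ are all hypothetical. It assumes the three bookkeeping entries are appended after the sub-matches, and it shows how testing the public empty() versus the underlying container's empty() changes what end() returns after a failed match.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified model of the layout described above.
// The vector holds the visible sub-matches followed by three hidden
// bookkeeping entries (unmatched, prefix, suffix).
class MatchResultsModel {
public:
    using const_iterator = std::vector<std::string>::const_iterator;

    // Simulate a failed match: no sub-matches, only the three hidden entries.
    void set_failed_match() { storage_.assign(3, std::string()); }

    // Simulate a successful match with n visible sub-matches.
    void set_match(std::size_t n) { storage_.assign(n + 3, std::string()); }

    // Visible size excludes the three hidden entries.
    std::size_t size() const { return storage_.empty() ? 0 : storage_.size() - 3; }
    bool empty() const { return size() == 0; }

    const_iterator begin() const { return storage_.begin(); }

    // Buggy variant: the public empty() is true after a failed match,
    // so nothing is subtracted and the hidden entries become visible.
    const_iterator end_buggy() const { return storage_.end() - (empty() ? 0 : 3); }

    // Fixed variant: only skip the subtraction when the storage itself is empty.
    const_iterator end_fixed() const { return storage_.end() - (storage_.empty() ? 0 : 3); }

private:
    std::vector<std::string> storage_;
};
```

In this model, both variants agree for a default-constructed object and for a successful match; they differ only after a failed match, which is the case shown in the question.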
This is a bug, obviously. I'll fix it.
The fix is:
const_iterator
end() const noexcept
- { return _Base_type::end() - (empty() ? 0 : 3); }
+ { return _Base_type::end() - (_Base_type::empty() ? 0 : 3); }
🤦
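Until a compiler with the fix is available, one possible way to sidestep the problem is to avoid end() entirely and walk the sub-matches by index, only when the match succeeded. The helper name collect_submatches below is hypothetical, and this is only a sketch of that idea, relying on size() and operator[] rather than the iterator range.

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns the sub-matches of a full match, or an
// empty vector if the input does not match. It never calls sm.end(),
// so it does not depend on the position that end() returns.
std::vector<std::string> collect_submatches(const std::string& input,
                                            const std::regex& re)
{
    std::smatch sm;
    std::vector<std::string> out;
    if (std::regex_match(input, sm, re)) {
        for (std::size_t i = 0; i < sm.size(); ++i) {
            out.push_back(sm[i].str());
        }
    }
    return out;
}
```

For the question's input "4321" and pattern "^([0-9])", this returns an empty vector, because regex_match requires the whole string to match.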