regex_token_iterator *it++ bug?

Tags:

c++

regex

c++11

The following code:

#include <iostream>
#include <regex>
#include <string>

using namespace std;

int main(int argc, char *argv[]) 
{
    regex reg("/");
    string s = "Split/Values/Separated/By/Slashes";
    sregex_token_iterator it{std::begin(s), std::end(s), reg, -1};
    sregex_token_iterator end;

    while(it != end)
    {
        cout << *it++ << endl;
    }

    return 0;
}

should output:

Split
Values
Separated
By
Slashes

but it outputs this:

Values
Separated
By

Slashes

The problem seems to be *it++: if I write cout << *it << endl; ++it; instead, it works correctly.

When I switch from the standard C++11 regex to Boost.Regex, *it++ also works correctly.

I have checked the <regex> header, and the operator++(int) function looks fine to me.

My clang version is:

Apple LLVM version 5.0 (clang-500.2.79) (based on LLVM 3.3svn) 
Target: x86_64-apple-darwin13.0.0
Thread model: posix

Has anyone else run into this problem?

Is there a bug in clang?

user2763477 asked Dec 02 '13 08:12


1 Answer

I found that it's a libc++ implementation bug.

Open the <regex> header and insert the following two std::cout lines into operator++(int):

    regex_token_iterator operator++(int)
    {
        regex_token_iterator __t(*this);
        std::cout << "test---" << *__t << "---test" << endl; // value of the copy before the increment
        ++(*this);
        std::cout << "test---" << *__t << "---test" << endl; // value of the copy after the increment
        return __t;
    }

You will find that the value of *__t changes after ++(*this)!

Digging further, you will find that *__t is implemented by returning the internal value_type pointer _result, and _result points to &_position->prefix(), which is the address of the match_results' _prefix object. The address of this object never changes, but its content does.

user534498 answered Oct 06 '22 12:10