
How do I loop through results from std::regex_search?

Tags: c++, regex, c++11, stl

After calling std::regex_search, I'm only able to get the first string result from the std::smatch for some reason:

Expression.assign("rel=\"nofollow\">(.*?)</a>");
if (std::regex_search(Tables, Match, Expression))
{
    for (std::size_t i = 1; i < Match.size(); ++i)
        std::cout << Match[i].str() << std::endl;
}

So I tried to do it another way - with an iterator:

const std::sregex_token_iterator End;
Expression.assign("rel=\"nofollow\">(.*?)</a>");
for (std::sregex_token_iterator i(Tables.begin(), Tables.end(), Expression); i != End; ++i)
{
    std::cout << *i << std::endl;
}

This does go through every match, but it gives me the whole matching string instead of just the capture I was after. Surely there must be a better way than running another std::regex_search on each iterator element inside the loop?

Thanks in advance.

Nop asked Sep 04 '11 19:09


2 Answers

The std::regex_token_iterator constructor takes an optional fourth argument specifying which submatch is returned on each iteration. It defaults to 0, which in C++ (as in many other regex flavors) means "the whole match". To get the first captured submatch instead, simply pass 1 to the constructor:

const std::sregex_token_iterator End;
Expression.assign("rel=\"nofollow\">(.*?)</a>");
for (std::sregex_token_iterator i(Tables.begin(), Tables.end(), Expression, 1); i != End; ++i)
{
    std::cout << *i << std::endl; // *i only yields the captured part
}
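
For reference, here is a minimal self-contained sketch of the same idea. The helper name first_captures and its parameters are made up for illustration and are not part of the question's code; it only assumes a C++11 standard library.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: collects capture group 1 of every match found in `text`.
std::vector<std::string> first_captures(const std::string& text, const std::regex& re)
{
    std::vector<std::string> out;
    const std::sregex_token_iterator end;
    // The fourth argument (1) selects the first capture group
    // instead of the whole match (0, the default).
    for (std::sregex_token_iterator it(text.begin(), text.end(), re, 1); it != end; ++it)
        out.push_back(it->str());
    return out;
}
```

Called with the question's expression, e.g. first_captures(Tables, Expression), this should return one string per link, containing only the text between the tags. Note that the regex is passed as a named object: the iterator constructors do not accept a temporary std::regex.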
JohannesD answered Nov 08 '22 09:11


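std::regex_search searches for the regex just once. It does not produce a list of matches, but a list of submatches for that single match: Match[0] is the whole match and Match[1] onward are the parenthesized groups. Your expression has one group, so Match.size() is 2 and the loop only prints Match[1], the text inside the first link tag.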
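
If you do want to stay with std::regex_search, one possible approach is to call it repeatedly and restart each search where the previous match ended. This is only a sketch: the function name captures_by_repeated_search is invented for illustration, and the loop assumes the expression can never match an empty string (true for the expression in the question, which contains literal text).

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: repeats std::regex_search, advancing past each match.
// Assumes `re` never matches an empty string, otherwise the loop would not advance.
std::vector<std::string> captures_by_repeated_search(const std::string& text, const std::regex& re)
{
    std::vector<std::string> out;
    std::smatch m;
    std::string::const_iterator begin = text.cbegin();
    while (std::regex_search(begin, text.cend(), m, re)) {
        out.push_back(m[1].str());   // capture group 1 of the current match
        begin = m.suffix().first;    // resume right after the current match
    }
    return out;
}
```

The iterator classes do this bookkeeping for you (including the empty-match case), so they are usually the simpler choice.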

As for the second snippet, it does visit every match. If you iterate with std::sregex_iterator instead of std::sregex_token_iterator, each dereference gives you a match_results object again, so you use the [] operator to pick the capture:

const std::sregex_iterator End;
Expression.assign("rel=\"nofollow\">(.*?)</a>");
for (std::sregex_iterator i(Tables.begin(), Tables.end(), Expression); i != End; ++i)
{
    std::cout << (*i)[1] << std::endl; // first submatch, same as above.
}
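
Because each element is a full match_results object, the whole match and every capture group are available in the same iteration. A small sketch of that, with the helper name matches_with_captures made up for illustration:

```cpp
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: for every match, returns (whole match, capture group 1).
std::vector<std::pair<std::string, std::string>>
matches_with_captures(const std::string& text, const std::regex& re)
{
    std::vector<std::pair<std::string, std::string>> out;
    const std::sregex_iterator end;
    for (std::sregex_iterator it(text.begin(), text.end(), re); it != end; ++it) {
        const std::smatch& m = *it;  // one match_results per match
        out.emplace_back(m[0].str(), m[1].str());
    }
    return out;
}
```

This is the form to reach for when the expression has several groups and you need more than one of them per match; if you only ever need a single group, the token iterator with a submatch index is shorter.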
Diego Sevilla answered Nov 08 '22 08:11