
C++11 Regex Find Capture Group Identifier

Tags: c++, regex, c++11

I've looked at a number of sources on C++11's new regex library, but most of them focus on the syntax or on basic usage of functions such as regex_match and regex_search. These articles helped me get started with the library, but I'm having a hard time finding details on capture groups.

What I'm trying to accomplish is to find out which capture group a match belongs to. So far, I've found only one way to do this.

#include <iostream>
#include <string>
#include <regex>

int main(int argc, char** argv)
{
    std::string input = "+12 -12 -13 90 qwerty";
    std::regex pattern("([+-]?[[:digit:]]+)|([[:alpha:]]+)");

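    // The fourth argument (1) selects which submatch to keep: capture group 1 only.
    auto iter_begin = std::sregex_token_iterator(input.begin(), input.end(), pattern, 1);
    // A default-constructed iterator serves as the end-of-sequence marker.
    auto iter_end = std::sregex_token_iterator();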

    for (auto it = iter_begin; it != iter_end; ++it)
    {
        std::ssub_match match = *it;
        std::cout << "Match: " << match.str() << " [" << match.length() << "]" << std::endl;
    }

    std::cout << std::endl << "Done matching..." << std::endl;
    std::string temp;
    std::getline(std::cin, temp);

    return 0;
}

By changing the fourth argument of the std::sregex_token_iterator constructor, I can control which submatch it keeps; the others are thrown away. So, to find out which capture group a match belongs to, I can iterate over the capture groups, constructing one iterator per group, and check which matches are kept for each group.

However, this is undesirable: unless some caching happens behind the scenes, I would expect each newly constructed std::sregex_token_iterator to pass over the input and find the matches again (someone please correct me if this is wrong, but it is the best conclusion I could come to).

Is there any better way of finding the capture group(s) a match belongs to? Or is iterating over the submatches the best course of action?

Asked by TheCodeBroski on Feb 02 '13 at 17:02




1 Answer

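Use regex_iterator instead. Dereferencing it gives you the match_results for each match, which holds all of the sub_matches. Each sub_match has a matched member that is true only if its capturing group took part in the match, so you can check which capturing group the match belongs to in a single pass over the input.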

Answered by nhahtdh on Oct 07 '22 at 15:10