How can I access all matches of a repeated capture group, not just the last one?

Tags: c++, regex, boost

My code is:

#include <boost/regex.hpp>
#include <iostream>
using namespace std;

boost::cmatch matches;
boost::regex_match("alpha beta", matches, boost::regex("([a-z])+"));
cout << "found: " << matches.size() << endl;

It prints found: 2, which means that only ONE occurrence is found. How do I instruct it to find THREE occurrences? Thanks!

yegor256 asked Oct 15 '22 05:10


2 Answers

You should not call matches.size() before verifying that something was matched, i.e. your code should look rather like this:

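#include <boost/regex.hpp>
#include <iostream>
using namespace std;

boost::cmatch matches;
// regex_match succeeds only if the pattern matches the entire input
if (boost::regex_match("alpha beta", matches, boost::regex("([a-z])+")))
    cout << "found: " << matches.size() << endl;
else
    cout << "nothing found" << endl;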

The output is "nothing found" because regex_match succeeds only when the pattern matches the whole string, and the space in "alpha beta" is not matched by [a-z]. What you probably want is regex_search, which looks for a match anywhere in the string. The code below should work better for you:

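#include <boost/regex.hpp>
#include <iostream>
using namespace std;

boost::cmatch matches;
// regex_search succeeds if the pattern matches any substring of the input
if (boost::regex_search("alpha beta", matches, boost::regex("([a-z])+")))
    cout << "found: " << matches.size() << endl;
else
    cout << "nothing found" << endl;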

But it will print only "found: 2": matches[0] holds "alpha" (the whole match) and matches[1] holds "a", the last letter of "alpha", because a repeated group keeps only its last capture.

To get the whole word in the group, change the pattern to ([a-z]+) and call regex_search repeatedly, as you did in your own answer.
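The difference between the two patterns can be checked in isolation. The sketch below is only an illustration under stated assumptions: it swaps in the standard <regex> header (C++11) so that it compiles without Boost, relying on the fact that Boost.Regex provides regex, smatch and regex_search under the same names in the boost:: namespace. The helper first_match_and_group is a hypothetical name, not part of either library.

```cpp
#include <regex>
#include <string>
#include <utility>

// Hypothetical helper (not a library function): runs one search and returns
// {whole match, capture group 1}. Both strings are empty if nothing matched.
std::pair<std::string, std::string> first_match_and_group(const std::string& text,
                                                          const std::string& pattern) {
    std::smatch m;
    const std::regex re(pattern);
    if (!std::regex_search(text, m, re)) {
        return {"", ""};
    }
    // m[0] is the whole match; m[1] is whatever group 1 captured last.
    return {m[0].str(), m[1].str()};
}
```

With "([a-z])+" on "alpha beta" the group is overwritten on every repetition, so the helper should return "alpha" and "a". With "([a-z]+)" the repetition sits inside the group, so both values should be "alpha".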

Sorry to reply 2 years late, but if someone googles their way here as I did, maybe this will still be useful to them.

David L. answered Oct 18 '22 13:10


This is what I've found so far:

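#include <boost/regex.hpp>
#include <iostream>
#include <string>
using namespace std;

string text = "alpha beta";
string::const_iterator begin = text.begin();
string::const_iterator end = text.end();
boost::match_results<string::const_iterator> what;
const boost::regex word("([a-z]+)");  // compile the pattern once, outside the loop
while (boost::regex_search(begin, end, what, word)) {
    cout << string(what[1].first, what[1].second) << endl;  // the word captured by group 1
    begin = what[0].second;  // continue right after the current match
}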

And it works as expected. Maybe someone knows a better solution?
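One possible alternative to the manual loop is a regex iterator, which advances past each match by itself. This is a minimal sketch, not a definitive answer: it assumes a C++11 compiler and uses std::sregex_iterator from the standard <regex> header so that it builds without Boost (Boost.Regex offers boost::sregex_iterator with the same usage). The function name all_words is made up for the example.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (name chosen for this example): collects capture
// group 1 of every non-overlapping match of ([a-z]+) in the text.
std::vector<std::string> all_words(const std::string& text) {
    static const std::regex word("([a-z]+)");  // compiled once
    std::vector<std::string> words;
    // A default-constructed sregex_iterator acts as the end-of-sequence marker.
    for (std::sregex_iterator it(text.begin(), text.end(), word), end; it != end; ++it) {
        words.push_back((*it)[1].str());
    }
    return words;
}
```

Calling all_words("alpha beta") should give a vector holding "alpha" and "beta". The iterator only stores iterators into the string, so the string passed in has to stay alive for as long as the loop runs.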

yegor256 answered Oct 18 '22 15:10