Find index of first match using C++ regex

Tags: c++, regex, split

I'm trying to write a split function in C++ using regexes. So far I've come up with this:

vector<string> split(string s, regex r)
{
    vector<string> splits;
    while (regex_search(s, r)) 
    {
        int split_on = // index of regex match
        splits.push_back(s.substr(0, split_on));
        s = s.substr(split_on + 1);
    }
    splits.push_back(s);
    return splits;
}

What I want to know is how to fill in the commented line.

Maurdekye asked Jan 13 '15 17:01


1 Answer

You'll need a little more than that; see the comments in the code below. The main trick is to use a match object (here std::smatch, because you're matching on a std::string) to remember where you matched, not just that you did:

#include <regex>
#include <string>
#include <vector>

using namespace std;

vector<string> split(string s, const regex& r)
{
  vector<string> splits;
  smatch m; // <-- need a match object

  while (regex_search(s, m, r))  // <-- use it here to get the match
  {
    if (m.length() == 0) break;   // an empty match would never advance, so stop instead of looping forever
    auto split_on = m.position(); // <-- use the match position
    splits.push_back(s.substr(0, split_on));
    s = s.substr(split_on + m.length()); // <-- also, skip the whole match
  }

  if(!s.empty()) {
    splits.push_back(s); // and there may be one last token at the end
  }

  return splits;
}

This can be used like so:

auto v = split("foo1bar2baz345qux", std::regex("[0-9]+"));

and will give you "foo", "bar", "baz", "qux".
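
As an alternative to the hand-written loop, the standard library's std::sregex_token_iterator can do the same kind of split when given the submatch index -1. The following is only a minimal sketch of that approach, not the code from the answer above; the function name split_tokens is made up for this illustration, and its handling of leading or trailing delimiters may differ from the loop version.

```cpp
#include <iterator>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper name, introduced only for this sketch.
std::vector<std::string> split_tokens(const std::string& s, const std::regex& r)
{
    // Submatch index -1 selects the text between matches instead of the matches themselves.
    std::sregex_token_iterator first(s.begin(), s.end(), r, -1);
    std::sregex_token_iterator last; // a default-constructed iterator marks the end of the sequence

    // Each sub_match converts to std::string while the vector is being filled.
    return std::vector<std::string>(first, last);
}
```

Calling split_tokens("foo1bar2baz345qux", std::regex("[0-9]+")) yields the same four tokens as the example above.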

std::smatch is the specialization of std::match_results for std::string iterators; the reference documentation for std::match_results describes its members, including position() and length().
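
To make the role of the match object more concrete, here is a small sketch that runs a single regex_search and copies out what std::smatch reports about the first match. The MatchInfo struct and the first_match function are hypothetical names used only for this illustration; the member functions called on the match object (position, length, str, prefix, suffix) are the standard ones.

```cpp
#include <regex>
#include <string>

// Hypothetical struct, used only to carry the results out of the function.
struct MatchInfo {
    bool found;
    std::smatch::difference_type position; // offset of the match from the start of the input
    std::smatch::difference_type length;   // number of characters matched
    std::string text;                      // the matched characters
    std::string before;                    // everything before the match
    std::string after;                     // everything after the match
};

// Hypothetical helper: describe the first match of r in s.
MatchInfo first_match(const std::string& s, const std::regex& r)
{
    std::smatch m;
    if (!std::regex_search(s, m, r)) {
        return {false, -1, 0, "", s, ""}; // no match: report the whole input as "before"
    }
    // Copy everything out while s (and therefore m) is still valid.
    return {true, m.position(), m.length(), m.str(),
            m.prefix().str(), m.suffix().str()};
}
```

For the input "foo1bar2baz" and the pattern "[0-9]+", the first match is "1" at position 3 with length 1, with "foo" before it and "bar2baz" after it. The split loop relies on exactly these two numbers, position and length, to cut the string.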

Wintermute answered Sep 22 '22 01:09