I'm trying to write a split function in C++ using regexes. So far I've come up with this:
vector<string> split(string s, regex r)
{
    vector<string> splits;
    while (regex_search(s, r))
    {
        int split_on = // index of regex match
        splits.push_back(s.substr(0, split_on));
        s = s.substr(split_on + 1);
    }
    splits.push_back(s);
    return splits;
}
What I want to know is how to fill in the commented line.
You'll need a little more than that, but see the comments in the code below. The main trick is to use a match object, here std::smatch because you're matching on a std::string, to remember where you matched (not just that you did):
vector<string> split(string s, regex r)
{
    vector<string> splits;
    smatch m; // <-- need a match object
    while (regex_search(s, m, r)) // <-- use it here to get the match
    {
        int split_on = m.position(); // <-- use the match position
        splits.push_back(s.substr(0, split_on));
        s = s.substr(split_on + m.length()); // <-- also, skip the whole match
    }
    if (!s.empty()) {
        splits.push_back(s); // and there may be one last token at the end
    }
    return splits;
}
This can be used like so:
auto v = split("foo1bar2baz345qux", std::regex("[0-9]+"));
and will give you "foo", "bar", "baz", "qux".
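For comparison, the standard library can also do the splitting itself: std::sregex_token_iterator with a submatch index of -1 iterates over the pieces between matches. The sketch below is an alternative to the loop above, not a restatement of it. The name split_tokens is made up for this example, and unlike the loop it keeps empty tokens (for instance when the input starts with a delimiter).

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper name; splits s on every match of r.
std::vector<std::string> split_tokens(const std::string& s, const std::regex& r)
{
    // Submatch index -1 selects the text between matches, not the matches themselves.
    std::sregex_token_iterator first(s.begin(), s.end(), r, -1);
    std::sregex_token_iterator last; // a default-constructed iterator marks the end
    return std::vector<std::string>(first, last);
}
```

Called as split_tokens("foo1bar2baz345qux", std::regex("[0-9]+")), this should produce the same four tokens as the loop version.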
std::smatch is a specialization of std::match_results (namely std::match_results<std::string::const_iterator>), for which reference documentation exists.
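To make the role of the match object more concrete, here is a small sketch of the information a std::smatch carries after a successful std::regex_search. The MatchInfo struct and the first_match function are names invented for this illustration; position(), length(), str(), prefix() and suffix() are the std::match_results members being demonstrated.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical struct for this example: what a single search tells us.
struct MatchInfo {
    bool found = false;
    std::ptrdiff_t position = 0; // offset of the match within the input
    std::ptrdiff_t length = 0;   // number of characters matched
    std::string text;            // the matched text itself
    std::string before;          // everything before the match
    std::string after;           // everything after the match
};

// Hypothetical helper: reports the first match of r in s, if any.
MatchInfo first_match(const std::string& s, const std::regex& r)
{
    MatchInfo info;
    std::smatch m;
    if (std::regex_search(s, m, r)) {
        info.found = true;
        info.position = m.position();
        info.length = m.length();
        info.text = m.str();
        info.before = m.prefix().str();
        info.after = m.suffix().str();
    }
    return info;
}
```

The split function above only needs position() and length(), but prefix() and suffix() would be another way to get at the same two substrings.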