I'm trying to extract submatches from a text file using Boost.Regex. Currently I only get the first valid line, and I get the full line rather than just the valid email address. I tried using the iterator and submatches, but without success. Here is the current code:
if (Myfile.is_open()) {
    boost::regex pattern("^[_a-z0-9-]+(\\.[_a-z0-9-]+)*@[a-z0-9-]+(\\.[a-z0-9-]+)*(\\.[a-z]{2,4})$");
    while (getline(Myfile, line)) {
        string::const_iterator start = line.begin();
        string::const_iterator end = line.end();
        boost::sregex_token_iterator i(start, end, pattern);
        boost::sregex_token_iterator j;
        while (i != j) {
            cout << *i++ << endl;
        }
        Myfile.close();
    }
}
Use boost::smatch with boost::regex_search. Each parenthesized group in the pattern is then available as result[n].
boost::regex pattern("what(ever) ...");
boost::smatch result;
// s is the std::string being searched
if (boost::regex_search(s, result, pattern)) {
    // result[1] holds the first parenthesized group
    string submatch(result[1].first, result[1].second);
    // Do whatever ...
}
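Applied to the question's case, the same idea might look like the sketch below. It is not the asker's code: it uses std::regex (the standard-library counterpart, so it compiles without Boost; boost::regex, boost::smatch and boost::regex_search have the same shape), it reads from any std::istream instead of Myfile, and the pattern is a simplified, hypothetical stand-in rather than a real email validator. The helper names extractAddresses and extractAddressesFromText are made up for illustration.

```cpp
#include <istream>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collects the first address-like token on each line.
// The pattern is a deliberately simplified stand-in, not an email validator.
std::vector<std::string> extractAddresses(std::istream& in) {
    // No ^ or $ anchors, so the pattern may match anywhere inside a line.
    static const std::regex pattern("([a-z0-9_.]+)@([a-z0-9.]+)");
    std::vector<std::string> found;
    std::string line;
    std::smatch result;
    while (std::getline(in, line)) {
        // regex_search looks for the pattern inside the line;
        // lines without a match are simply skipped.
        if (std::regex_search(line, result, pattern)) {
            // result[0] is only the matched text, not the whole line.
            found.push_back(result[0].str());
        }
    }
    return found;
}

// Convenience wrapper so the sketch can be tried on an in-memory string.
std::vector<std::string> extractAddressesFromText(const std::string& text) {
    std::istringstream in(text);
    return extractAddresses(in);
}
```

The stream is left open until every line has been read, and result[1] and result[2] would give the parts before and after the @ if those are needed separately.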
const string pattern = "(abc)(def)";
const string target = "abcdef";
boost::regex regexPattern(pattern, boost::regex::extended);
boost::smatch what;
bool isMatchFound = boost::regex_match(target, what, regexPattern);
if (isMatchFound)
{
    for (unsigned int i = 0; i < what.size(); i++)
    {
        cout << "WHAT " << i << " " << what[i] << endl;
    }
}
The output is:
WHAT 0 abcdef
WHAT 1 abc
WHAT 2 def
Boost captures submatches with parentheses, and submatch 0 is always the full matched string. regex_match has to match the entire input against the pattern; if you are trying to match a substring, use regex_search instead.
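A small sketch of that difference, using the same (abc)(def) pattern. As an assumption for portability it is written against std::regex rather than Boost (std::regex_match and std::regex_search mirror the Boost functions of the same name), and the helper names matchesWhole and firstGroupInside are hypothetical.

```cpp
#include <regex>
#include <string>

// True only when the whole target is matched by (abc)(def).
bool matchesWhole(const std::string& target) {
    static const std::regex pattern("(abc)(def)");
    return std::regex_match(target, pattern);
}

// Returns the first parenthesized group of the first match found
// anywhere in target, or an empty string when there is no match.
std::string firstGroupInside(const std::string& target) {
    static const std::regex pattern("(abc)(def)");
    std::smatch what;
    if (std::regex_search(target, what, pattern)) {
        return what[1].str();  // what[0] would be the full match "abcdef"
    }
    return std::string();
}
```

With these, matchesWhole("abcdef") is true and matchesWhole("xxabcdefyy") is false, while firstGroupInside("xxabcdefyy") still finds "abc".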
The example above uses the POSIX extended regex syntax, which is selected with the boost::regex::extended parameter. Omitting that parameter switches to the default Perl-style regex syntax. Other syntax options are also available.