Trying to get C++ regex string capture to work. I have tried all four combinations of Windows vs. Linux, Boost vs. native C++11. The sample code is:
#include <string>
#include <iostream>
#include <boost/regex.hpp>
//#include <regex>
using namespace std;
using namespace boost;
int main(int argc, char** argv)
{
smatch sm1;
regex_search(string("abhelloworld.jpg"), sm1, regex("(.*)jpg"));
cout << sm1[1] << endl;
smatch sm2;
regex_search(string("hell.g"), sm2, regex("(.*)g"));
cout << sm2[1] << endl;
}
The closest that works is g++ (4.7) with Boost (1.51.0). There, the first cout outputs the expected abhelloworld. but nothing from the second cout.
g++ 4.7 with -std=gnu++11 and <regex> instead of <boost/regex.hpp> produces no output.
Visual Studio 2012 using native <regex> yields an exception regarding incompatible string iterators.
Visual Studio 2008 with Boost 1.51.0 and <boost/regex.hpp> yields an exception regarding "Standard C++ Libraries Invalid argument".
Are these bugs in C++ regex, or am I doing something wrong?
Are these bugs in C++ regex, or am I doing something wrong?
At the time of your posting, gcc didn't support <regex>, as noted in the other answer (it does now). As for the other problems, the issue is that you are passing temporary string objects. Change your code to the following:
smatch sm1;
string s1("abhelloworld.jpg");
regex_search(s1, sm1, regex("(.*)jpg"));
cout << sm1[1] << endl;
smatch sm2;
string s2("hell.g");
regex_search(s2, sm2, regex("(.*)g"));
cout << sm2[1] << endl;
Your original example compiles because regex_search takes a const reference, which temporary objects can bind to. However, smatch only stores iterators into the string that was searched, and your temporary no longer exists by the time you read the match. The solution is to not pass temporaries.
If you look in the C++ standard at [§ 28.11.3/5], you will find the following:
Returns: The result of regex_search(s.begin(), s.end(), m, e, flags).
What this means is that internally only iterators into the string you pass in are used. If you pass in a temporary, the match results hold iterators into that temporary, and those become invalid once it is destroyed at the end of the statement. The temporary itself is not stored anywhere.
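To make the lifetime rule concrete, here is a minimal sketch using std::regex. The helper name first_capture is hypothetical (it is not part of the standard library or Boost). It copies the captured text out of the smatch while the searched string is still alive, so nothing refers to a destroyed string afterwards.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (the name is an assumption for illustration).
// Returns the text of the first capture group, or an empty string
// when the pattern does not match.
std::string first_capture(const std::string& text, const std::string& pattern)
{
    const std::regex re(pattern);
    std::smatch m;  // holds iterators into `text`, not a copy of it

    // `text` is a named reference here, so it outlives the search
    // and the iterators stored in `m` stay valid inside this function.
    if (!std::regex_search(text, m, re)) {
        return std::string();
    }

    // Copy the characters out before returning; the returned string
    // owns its data and does not depend on `text` any more.
    return m[1].str();
}
```

With the strings from the question, first_capture("abhelloworld.jpg", "(.*)jpg") should give abhelloworld. (including the dot, since .* is greedy) and first_capture("hell.g", "(.*)g") should give hell. in the same way. As a side note, since C++14 the standard library declares the regex_search overload that takes an rvalue std::string as deleted, so the original code is rejected at compile time there instead of failing at run time.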