I have a function that tries to match a given string against a given regex pattern. If the string does not match, the function should build an error message that includes the regex pattern that failed and the content of the string. Something like this:
bool validate_content(const std::string & str, const std::regex & pattern, std::vector<std::string> & errors)
{
    if ( false == std::regex_match(str, pattern) )
    {
        std::stringstream error_str;
        // error_str << "Pattern match failure: " << pattern << ", content: " << str;
        errors.push_back(error_str.str());
        return false;
    }
    return true;
}
However, as you can see, the commented-out line presents a challenge: is it possible to recover the original pattern from the regex object?
The obvious workaround is to pass the original pattern string instead of, or alongside, the regex object and use that. But I would prefer to avoid the extra work of either recreating the regex object every time this function is called (paying the cost of reparsing the pattern on each call) or passing the pattern string along with the regex object (prone to typos and errors unless I provide a wrapper that does it for me, which is less convenient).
I'm using GCC 4.9.2 on Ubuntu 14.04.
boost::basic_regex objects have a str() function which returns a (copy of) the character string used to construct the regular expression. (They also provide begin() and end() interfaces which return iterators to the character sequence, as well as a mechanism for introspecting capture subexpressions.)
These interfaces were in the initial TR1 regex standardization proposal, but were removed in 2003, after the adoption of n1499: Simplifying Interfaces in basic_regex, from which I quote:
basic_regex Should Not Keep a Copy of its Initializer
The basic_regex template has a member function str which returns a string object that holds the text used to initialize the basic_regex object… While it might occasionally be useful to look at the initializer string, we ought to apply the rule that you don't pay for it if you don't use it. Just as fstream objects don't carry around the file name that they were opened with, basic_regex objects should not carry around their initializer text. If someone needs to keep track of that text they can write a class that holds the text and the basic_regex object.
According to the standard N4431 §28.8/2 Class template basic_regex [re.regex] (Emphasis mine):
Objects of type specialization of basic_regex are responsible for converting the sequence of charT objects to an internal representation. It is not specified what form this representation takes, nor how it is accessed by algorithms that operate on regular expressions. [ Note: Implementations will typically declare some function templates as friends of basic_regex to achieve this — end note ]
Thus, the basic_regex object is not required to keep the original character sequence internally.
Consequently, you must store the sequence of characters yourself when you create the regex. For example:
struct RegexPattern {
    std::string pattern;
    std::regex reg;
};
...
bool validate_content(const std::string & str, const RegexPattern & pattern, std::vector<std::string> & errors) {
    if(false == std::regex_match(str, pattern.reg)) {
        std::stringstream error_str;
        error_str << "Pattern match failure: " << pattern.pattern << ", content: " << str;
        errors.push_back(error_str.str());
        return false;
    }
    return true;
}
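The question worries that passing the pattern text and the regex object separately invites typos. One way to reduce that risk is to build both from a single string in a constructor. The following is only a sketch of that idea: the class name PatternRegex and its accessors str() and regex() are hypothetical names chosen for illustration, not part of the standard library.

```cpp
#include <regex>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical wrapper (not a standard type): compiles the regex once and
// keeps the source text next to it, so the two cannot drift apart.
class PatternRegex {
public:
    explicit PatternRegex(std::string pattern)
        : pattern_(std::move(pattern)), regex_(pattern_) {}

    const std::string & str() const { return pattern_; }
    const std::regex & regex() const { return regex_; }

private:
    std::string pattern_;  // declared first, so it is initialized before regex_
    std::regex regex_;
};

bool validate_content(const std::string & str, const PatternRegex & pattern, std::vector<std::string> & errors)
{
    if (!std::regex_match(str, pattern.regex()))
    {
        std::stringstream error_str;
        error_str << "Pattern match failure: " << pattern.str() << ", content: " << str;
        errors.push_back(error_str.str());
        return false;
    }
    return true;
}
```

A caller would construct the object once, for example PatternRegex digits("[0-9]+");, and reuse it across calls, so the pattern is parsed a single time and the text in the error message always matches the compiled regex.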
Another solution, proposed by @Praetorian, is arguably more elegant but probably somewhat less efficient (I haven't benchmarked the two versions, so I'm not sure). It keeps the pattern string, passes it as an input argument to validate_content, and creates the regex object internally, as shown below:
bool validate_content(const std::string & str, const std::string & pattern, std::vector<std::string> & errors) {
    std::regex reg(pattern);
    if(false == std::regex_match(str, reg)) {
        std::stringstream error_str;
        error_str << "Pattern match failure: " << pattern << ", content: " << str;
        errors.push_back(error_str.str());
        return false;
    }
    return true;
}
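If the string-based signature is attractive but reparsing the pattern on every call is a concern, one possible middle ground is to cache compiled regexes keyed by their pattern text. This is a rough, unbenchmarked sketch under stated assumptions: cached_regex and validate_content_cached are hypothetical names, the cache grows without bound, and it is not thread-safe as written.

```cpp
#include <map>
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns a compiled regex for `pattern`, compiling it
// only the first time that pattern text is seen. Not thread-safe.
const std::regex & cached_regex(const std::string & pattern)
{
    static std::map<std::string, std::regex> cache;
    auto it = cache.find(pattern);
    if (it == cache.end())
    {
        it = cache.emplace(pattern, std::regex(pattern)).first;
    }
    return it->second;  // references into std::map stay valid after later inserts
}

bool validate_content_cached(const std::string & str, const std::string & pattern, std::vector<std::string> & errors)
{
    if (!std::regex_match(str, cached_regex(pattern)))
    {
        std::stringstream error_str;
        error_str << "Pattern match failure: " << pattern << ", content: " << str;
        errors.push_back(error_str.str());
        return false;
    }
    return true;
}
```

Whether the map lookup is actually cheaper than recompiling depends on the patterns and the call frequency, so it would be worth measuring before adopting this.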