I tried the following regex:
#include <iostream>
#include <regex>
#include <string>

const static char * regex_string = "([a-zA-Z0-9]+).*";
void find_first(const std::string str);
int main(int argc, char ** argv)
{
    find_first("0s7fg9078dfg09d78fg097dsfg7sdg\r\nfdfgdfg");
}
void find_first(const std::string str)
{
    std::cout << str << std::endl;
    std::regex rgx(regex_string);
    std::smatch matcher;
    // regex_match succeeds only if the pattern matches the entire string
    if(std::regex_match(str, matcher, rgx))
    {
        std::cout << "Found : " << matcher.str(0) << std::endl;
    } else {
        std::cout << "Not found" << std::endl;
    }
}
I expected the regex to match and the group to be found, but it was not. Why? How can I match a line break in a C++ regex? In Java it works fine.
In most regex flavors the dot matches any character other than a newline. This also holds for the ECMAScript syntax that std::regex uses by default, where the dot is described as:
. (not newline): any character except line terminators (LF, CR, LS, PS).
0s7fg9078dfg09d78fg097dsfg7sdg\r\nfdfgdfg
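[a-zA-Z0-9]+ matches up to the \r. The following .* cannot consume \r or \n, so it stops there. Because std::regex_match only succeeds when the whole string is matched, the result is "Not found".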
In many regex flavors there is a dotall flag available to make the dot also match newlines.
If not, there are workarounds in different languages, such as [^] (not nothing) or [\S\s] (any whitespace or non-whitespace character together in one class), which results in any character including \n:
regex_string = "([a-zA-Z0-9]+)[\\S\\s]*";
Or use optional line breaks: ([a-zA-Z0-9]+).*(?:\\r?\\n.*)*
or ([a-zA-Z0-9]+)(?:.|\\r?\\n)*
See your updated demo
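To compare the workarounds above against the original pattern, here is a minimal sketch. The helper name first_group is made up for illustration and is not part of the question's code; it returns the first capture group when std::regex_match succeeds and an empty string otherwise.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the question): returns capture group 1 if the
// whole input matches the pattern, or an empty string if it does not.
std::string first_group(const std::string& input, const std::string& pattern)
{
    const std::regex rgx(pattern);  // default grammar: ECMAScript
    std::smatch m;
    if (std::regex_match(input, m, rgx)) {
        return m.str(1);
    }
    return "";
}
```

Called with the question's input "0s7fg9078dfg09d78fg097dsfg7sdg\r\nfdfgdfg", the original pattern "([a-zA-Z0-9]+).*" should yield an empty string, while "([a-zA-Z0-9]+)[\\S\\s]*", "([a-zA-Z0-9]+).*(?:\\r?\\n.*)*" and "([a-zA-Z0-9]+)(?:.|\\r?\\n)*" should each yield the leading alphanumeric run.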
Update - Another idea worth mentioning: std::regex::extended, which selects the POSIX extended (ERE) grammar. POSIX defines the dot there as follows:
A <period> ( '.' ), when used outside a bracket expression, is an ERE that shall match any character in the supported character set except NUL.
std::regex rgx(regex_string, std::regex::extended);
See this demo at tio.run
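As a rough sketch of the std::regex::extended idea, the helper below compiles the question's unchanged pattern with the POSIX extended grammar. The function name first_group_extended is hypothetical. How the dot treats line breaks in this mode follows the POSIX rule quoted above, so it is worth verifying with your own standard library.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: same pattern as in the question, but compiled with the
// POSIX extended grammar instead of the default ECMAScript grammar.
std::string first_group_extended(const std::string& input)
{
    const std::regex rgx("([a-zA-Z0-9]+).*", std::regex::extended);
    std::smatch m;
    // Returns capture group 1 on a full match, or an empty string otherwise.
    return std::regex_match(input, m, rgx) ? m.str(1) : std::string();
}
```

With the question's input, this should report a match, because the dot in this grammar is allowed to consume \r and \n.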
You may try const static char * regex_string = "((.|\r\n)*)";
I hope it helps.