I can't get the $ (dollar-sign) to work as documented in C++11 regular expressions. This is with ECMAScript syntax (the default).
Example (regex.cc):
#include <iostream>
#include <regex>

int main() {
    if ( std::regex_search("one\ntwo", std::regex{"one$"}) ) {
        std::cout << "Should match, doesn't." << std::endl;
    }
    if ( std::regex_search("one\ntwo", std::regex{"two$"},
                           std::regex_constants::match_not_eol) ) {
        std::cout << "Shouldn't match, does." << std::endl;
    }
    return 0;
}
Expected output: Should match, doesn't.
Actual output: Shouldn't match, does.
From http://www.cplusplus.com/reference/regex/ECMAScript/:
$
- End of line - Either it is the end of the target sequence, or precedes a line terminator.
From http://www.cplusplus.com/reference/regex/regex_search/:
match_not_eol
- Not End-Of-Line - The last character is not considered an end of line ("$" does not match).
Tested with Clang 3.3 and 3.4 on FreeBSD 10:
clang++ -std=c++11 -stdlib=libc++ -o regex regex.cc && ./regex
What am I missing?
Looks like you stumbled on LWG issue 2343.
To quote:
If Multiline is true, $ matches just before LineTerminator.
If Multiline is false, $ does not match just before LineTerminator.
[...]
Multiline of the existing implementations are as follows:
Multiline=false:
libstdc++ r206594
libc++ r199174
Multiline=true:
Visual Studio Express 2013
boost 1.55
Note: with the current SVN version of libc++, your first test actually IS matched, so it looks like this LWG issue is going to be resolved in favor of Multiline=true.
The second issue (match_not_eol being ignored) looks like a fairly straightforward implementation bug. Boost.Regex doesn't match that test case.