I need a regex that captures an argument between parentheses. The blanks before and after the argument should not be captured. For example, "( ab & c )" should return "ab & c". The argument can be enclosed in single quotes if leading or trailing blanks are needed. So, "( ' ab & c ' )" should return " ab & c ".
wstring String = L"( ' ab & c ' )";
wsmatch Matches;
regex_match( String, Matches, wregex(L"\\(\\s*(?:'(.+)'|(.+?))\\s*\\)") );
wcout << L"<" + Matches[1].str() + L"> " + L"<" + Matches[2].str() + L">" + L"\n";
// Results in "<> < ' ab & c '>", not OK
It seems that the second alternative matched, but it also took the space in front of the first quote! That space should have been consumed by the \s* after the opening parenthesis.
Removing the second alternative:
regex_match( String, Matches, wregex(L"\\(\\s*(?:'(.+)')\\s*\\)") );
wcout << L"<" + Matches[1].str() + L">" + L"\n";
// Results in "< ab & c >", OK
Making it a capturing group of alternatives:
regex_match( String, Matches, wregex(L"\\(\\s*('(.+)'|(.+?))\\s*\\)") );
wcout << L"<" + Matches[1].str() + L"> " + L"<" + Matches[2].str() + L"> " + L"<" + Matches[3].str() + L">" + L"\n";
// Results in "<' ab & c '> < ab & c > <>", OK
Am I overlooking anything?
Here is my suggestion, which merges the two alternatives into one:
wstring String = L"( ' ab & c ' )";
wsmatch Matches;
regex_match( String, Matches, wregex(L"\\(\\s*(')?([^']+)\\1\\s*\\)") );
wcout << L"<" + Matches[2].str() + L"> " + L"\n";
The \(\s*(')?([^']+)\1\s*\) regex uses a back-reference to make sure there is a ' at both the beginning and the end, so that an unbalanced argument such as 'something is not captured. The value is captured into Group 2.
Output:
< ab & c >
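To make the behaviour easier to check, the pattern can be wrapped in a small function. This is only a sketch: `extract_argument` is a hypothetical helper name, not something from the post, and it simply returns whatever Group 2 captured. Only the quoted form and the unbalanced-quote form are exercised in the usage note. The unquoted form relies on how the library treats a back-reference to a group that did not take part in the match, so it is worth checking on your own toolchain.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the original post): applies the
// back-reference pattern and copies Group 2 into `argument`.
// Returns false when the whole input does not match the pattern.
bool extract_argument(const std::wstring& input, std::wstring& argument)
{
    // Group 1: optional opening quote. Group 2: the argument itself.
    // \1 demands the same quote again whenever Group 1 matched one.
    static const std::wregex pattern(L"\\(\\s*(')?([^']+)\\1\\s*\\)");

    std::wsmatch matches;
    if (!std::regex_match(input, matches, pattern))
        return false;

    argument = matches[2].str();
    return true;
}
```

Called with L"( ' ab & c ' )", the function returns true and stores " ab & c " (blanks inside the quotes kept). Called with L"( 'something )", it returns false, because the opening quote has no closing partner.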