Let's look at the following illustrative example.
set(TEXT "ab,cc,df,gg")
string(REGEX MATCHALL "((.)\\2)" RESULT "${TEXT}")
message("Result: ${RESULT}")
# Expected: Result: cc;gg
# Actual: Result:
Compare the expected result on regex101.
Does anyone know how to retrieve match group 1 correctly in the above example? Is this possible at all with CMake?
I couldn't find much on the web about the limitations of the regular expression processor used by CMake. Who knows more? (There is a brief mention of this in the CMake FAQ.)
Thanks for the support!
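From the CMake documentation for string(REGEX REPLACE): Match the <regular_expression> as many times as possible and substitute the <replacement_expression> for the match in the output. All <input> arguments are concatenated before matching. The <replacement_expression> may refer to parenthesis-delimited subexpressions of the match using \1, \2, ..., \9. Note that these references apply to the replacement expression only, not to the pattern itself.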
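As a small illustration of the \1 ... \9 references in a replacement expression, here is a minimal sketch that swaps the two letters of every pair in the question's input. The variable name SWAPPED is made up for this example, and the backslashes are doubled because the replacement is written inside a quoted CMake string.

```cmake
# Sketch only: group references work in the replacement, not in the pattern.
set(TEXT "ab,cc,df,gg")

# "\\2\\1" reaches the regex engine as \2\1, i.e. second group, then first group.
string(REGEX REPLACE "([a-z])([a-z])" "\\2\\1" SWAPPED "${TEXT}")

message("Swapped: ${SWAPPED}")
# Prints: Swapped: ba,cc,fd,gg
```

Saved as a script file, this can be run with cmake -P.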
CMake's regular expressions are relatively limited. Look at the static char* regatom (int *flagp) method in RegularExpression.cxx. A \\ indicates that the next character is escaped (treated literally), so the \\2 in your pattern appears to be read as a literal 2 rather than as a reference to group 2, which would explain the empty result. It looks like back-references are not possible in CMake regexes.
As a workaround, you can invoke shell commands using execute_process (this assumes echo and sed are available, as on a Unix-like system).
set(TEXT "ab,cc,df,gg")
message("TEXT: ${TEXT}")
execute_process(
COMMAND echo ${TEXT}
COMMAND sed "s/.*\\(\\(.\\)\\2\\).*/\\1/g"
OUTPUT_VARIABLE RESULT OUTPUT_STRIP_TRAILING_WHITESPACE
)
message("RESULT: ${RESULT}")
This produces:
TEXT: ab,cc,df,gg
RESULT: gg
You will have to adjust your regex to produce cc;gg from the given string.
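If calling out to a shell is not an option, the same result can probably be reached in plain CMake by avoiding the back-reference altogether. The following is a minimal sketch, assuming the input is always a comma-separated list of two-character items as in the question. It captures both characters separately and compares them with STREQUAL; the names ITEMS and ITEM are introduced here for illustration only.

```cmake
set(TEXT "ab,cc,df,gg")

# Turn the comma-separated string into a CMake list (semicolon-separated).
string(REPLACE "," ";" ITEMS "${TEXT}")

set(RESULT "")
foreach(ITEM IN LISTS ITEMS)
  # Capture each of the two characters in its own group ...
  if(ITEM MATCHES "^(.)(.)$")
    # ... and do the "same character twice" check outside the regex.
    if("${CMAKE_MATCH_1}" STREQUAL "${CMAKE_MATCH_2}")
      list(APPEND RESULT "${ITEM}")
    endif()
  endif()
endforeach()

message("Result: ${RESULT}")
# Prints: Result: cc;gg
```

This trades the single regex for a loop, but it stays portable because it does not depend on sed or any other external tool.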