
How to get group matches of regular expressions in CMake?

Tags: regex, cmake

Let's look at the following illustrative example.

set(TEXT "ab,cc,df,gg")
string(REGEX MATCHALL "((.)\\2)" RESULT "${TEXT}")
message("Result: ${RESULT}")  

# Expected:  Result: cc;gg
# Actual:    Result:

Compare the expected result on regex101.

Does anyone know how to retrieve match group 1 correctly in the above example? Is this possible at all with CMake?

I couldn't find much on the web about the limitations of the regular expression processor used by CMake. Who knows more? (There is a little written about this in the CMake FAQ.)

Thanks for the support!

asked Apr 24 '14 16:04 by normanius



1 Answer

CMake's regular expressions are relatively limited. Look at the static char* regatom (int *flagp) function in RegularExpression.cxx: a backslash only marks the next character as escaped (treated literally). The \2 in the pattern therefore matches a literal 2 instead of referring back to group 2, which is why the result is empty. It looks like back-references are not possible in a CMake regex.
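
Capture groups themselves are available in CMake; only referring back to them inside the pattern is not. The following is a minimal sketch, not part of the original answer, showing that string(REGEX MATCH) exposes the groups of the last match through the CMAKE_MATCH_<n> variables. The variable name FIRST_PAIR and the pattern are chosen here purely for illustration.

```cmake
# Run in script mode, e.g.: cmake -P groups.cmake  (the file name is arbitrary)
set(TEXT "ab,cc,df,gg")

# Match the first pair of consecutive lowercase letters,
# putting each letter into its own capture group.
string(REGEX MATCH "([a-z])([a-z])" FIRST_PAIR "${TEXT}")

message("Whole match: ${CMAKE_MATCH_0}")  # ab
message("Group 1:     ${CMAKE_MATCH_1}")  # a
message("Group 2:     ${CMAKE_MATCH_2}")  # b
```

Note that CMAKE_MATCH_<n> only reflects the most recent match, so it does not by itself collect group 1 of every match in the string.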

As a workaround, you can invoke shell commands using execute_process (this assumes echo and sed are available on the system).

set(TEXT "ab,cc,df,gg")
message("TEXT: ${TEXT}")

execute_process(
    COMMAND echo ${TEXT}
    COMMAND sed "s/.*\\(\\(.\\)\\2\\).*/\\1/g"
    OUTPUT_VARIABLE RESULT OUTPUT_STRIP_TRAILING_WHITESPACE
    )

message("RESULT: ${RESULT}")  

This produces:

TEXT: ab,cc,df,gg
RESULT: gg

You will have to adjust the regex to produce cc;gg from the given string.
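
If shelling out is not an option, the effect of the back-reference can be emulated in plain CMake by comparing neighbouring characters directly. The sketch below is one possible approach under that assumption, not the answer's method; the helper variable names (LEN, LAST, I, J, A, B) are made up for this example, and it does not handle input that contains semicolons.

```cmake
# Run in script mode, e.g.: cmake -P doubled.cmake  (the file name is arbitrary)
cmake_minimum_required(VERSION 3.10)  # assumed; enables modern if() string comparison

set(TEXT "ab,cc,df,gg")
string(LENGTH "${TEXT}" LEN)
math(EXPR LAST "${LEN} - 1")  # index of the last character

set(RESULT "")
set(I 0)
while(I LESS LAST)
  math(EXPR J "${I} + 1")
  string(SUBSTRING "${TEXT}" ${I} 1 A)  # character at position I
  string(SUBSTRING "${TEXT}" ${J} 1 B)  # character at position I + 1
  if("${A}" STREQUAL "${B}")
    # Doubled character found: record it and skip past the pair,
    # mirroring the non-overlapping matches of ((.)\2).
    list(APPEND RESULT "${A}${B}")
    math(EXPR I "${I} + 2")
  else()
    set(I ${J})
  endif()
endwhile()

message("Result: ${RESULT}")  # Result: cc;gg
```

Walking the string by index keeps the logic independent of the comma separators, so it behaves like the intended pattern rather than relying on the items being exactly two characters long.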

answered Nov 15 '22 00:11 by Benedikt Köppel