Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

REGEXEXTRACT with capturing group

I wonder if there is a way to refer to the group captured in same expression when using REGEXEXTRACT() in Google Sheets?

Let's say we have a sample string: aaa123bbb123ccc456ddd123eee123fff456ggg

and we'd like to extract the part where some 3 digits occure at least 3 times. Normally I would use regex like this: (\d{3})(?:[^\1]*\1){2,}

but how to refer to the first group in =REGEXEXTRACT(A1;"(\d{3})(?:[^\1]*\1){2,}")? This one returns error in Sheets.

like image 771
friedman Avatar asked Jan 30 '18 11:01

friedman


People also ask

What does non capturing group mean?

Non-capturing groups are important constructs within Java Regular Expressions. They create a sub-pattern that functions as a single unit but does not save the matched character sequence.

What is capturing group in regex Javascript?

Groups group multiple patterns as a whole, and capturing groups provide extra submatch information when using a regular expression pattern to match against a string. Backreferences refer to a previously captured group in the same regular expression.

What does capturing mean in regex?

capturing in regexps means indicating that you're interested not only in matching (which is finding strings of characters that match your regular expression), but you're also interested in using specific parts of the matched string later on.


2 Answers

There is no backreference support in RE2 patterns, you need to write a custom JS function to get what you need:

function IS_THREE_DIGIT_REPEATING(input) {
  var rx = /(\d{3})(.*\1){2}/;
  var res = rx.exec(input);
  return res ? res[1] : "No";
}

It will print the contents of the first capturing group in the cell (the 3 digits that are repeating) or No if there is no match.

enter image description here

Pattern details

  • (\d{3}) - Capturing group 1: three digits
  • (.*\1){2} - 2 consecutive occurrences of any 0+ chars other than linebreak chars followed with the same value as captured in Group 1.
like image 55
Wiktor Stribiżew Avatar answered Sep 30 '22 00:09

Wiktor Stribiżew


The way I emulated capture group behavior in google sheets is buy using REGEXEXTRACT Inline with REGEXREPLACE

For example

=REGEXREPLACE(A1, "word$", "special" & REGEXEXTRACT(A1, "word$"))

Explained:

# REGEXREPLACE(my_cell, regex_to_match_word, text & capture_function)
=REGEXREPLACE(
    A1, 
    "word$", 
    "special" & 
        # REGEXEXTRACT("my_cell", "capture_regex")
        REGEXEXTRACT(
            A1, 
            "word$"
        )
)

References

REGEXREPLACE: https://support.google.com/docs/answer/3098245?hl=en

REGEXEXTRACT: https://support.google.com/docs/answer/3098244?hl=en

like image 30
yosefrow Avatar answered Sep 30 '22 02:09

yosefrow