
Possible to use a back reference in a number range?

I want to match a string where a number is equal to or higher than the number in a capturing group.

Example:

  • 1x1 = match
  • 1x2 = match
  • 2x1 = no match

In my mind the regex would look something like (\d)x[\1-9], but this doesn't work. Is it possible to achieve this using regex?

Frinsh asked Oct 18 '22 14:10



1 Answer

As you've discovered, you cannot interpolate a captured value into a character class:

Because character classes are determined when the regex is compiled... The only character class regex node type is "hard-coded list of characters" that was built when the regex was compiled (not after it ran part way and figured out what $1 might end up being).

[Source]

Since character classes do not permit backreferences, a backslash followed by a number is repurposed in a character class:

A backslash followed by two or three octal digits is considered an octal number.

[Source]

This obviously isn't what you intended by [\1-9]. But since there's no way to compile a character class until all characters are known, we'll have to find another way.
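To see the repurposing in action, here is a minimal sketch using Python's re module as a stand-in. That is an assumption on my part: the rule quoted above is Perl's, and other engines differ in detail, but Python also reads \1 inside a character class as an octal escape rather than a backreference. The name `naive` is just for illustration.

```python
import re

# Inside a character class, \1 is not a backreference. Python's re reads it
# as the octal escape for the control character U+0001, so [\1-9] is the
# range from U+0001 up to "9".
naive = re.compile(r"(\d)x[\1-9]")

# "2x1" should be rejected, but "1" falls inside that range, so it matches.
wrongly_accepted = naive.fullmatch("2x1") is not None

# Even punctuation such as "!" (U+0021) sits inside the range.
punctuation_accepted = naive.fullmatch("2x!") is not None
```

So the pattern compiles without complaint, it just accepts far more than intended.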

If we're looking to do this entirely within a regex, enumerating all possible combinations is awkward, because each alternative gets its own capture groups and we'd have to check all of them to figure out which one matched. For example:

"1x2" =~ m/(?:(0)x(\d)|(1)x([1-9])|(2)x([2-9])|(3)x([3-9])|(4)x([4-9])|(5)x([5-9])|(6)x([6-9])|(7)x([7-9])|(8)x([89])|(9)x(9))/

This puts "1" in $3 and "2" in $4, but you'd have to search captures 1 to 20 each time to find out which pair, if any, matched.
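The capture search can be sketched like this in Python (picked purely for illustration, it is not the asker's engine). The names `branches`, `pattern` and `captured_pair` are hypothetical, and the alternation is generated rather than typed out by hand.

```python
import re

# One alternative per first digit d: "d x [d-9]", each with its own two groups,
# giving the same 20 capture groups as the hand-written pattern.
branches = [f"({d})x([{d}-9])" for d in range(10)]
pattern = re.compile("(?:" + "|".join(branches) + ")")

def captured_pair(text):
    """Return the two matched digits as strings, or None when text does not match."""
    m = pattern.fullmatch(text)
    if m is None:
        return None
    # Only one alternative can take part in a match, so exactly two of the
    # 20 groups are set; scan all of them to find that pair.
    first, second = (g for g in m.groups() if g is not None)
    return first, second
```

This works, but the scan over every group is exactly the post-processing step we were hoping to avoid.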


The only way around post-processing the regex results is to use a regex conditional: (?(A)X), where A is the condition and X is the pattern applied when the condition is true.

Sadly conditionals are not supported by RE2, but we'll keep going just to demonstrate it can be done.

What you'd want to use for the X is (*F) (or (?!) in Ruby 2+) to force failure: http://www.rexegg.com/regex-tricks.html#fail

What you'd want to use for the A is ?{$1 > $2}, but only Perl will allow you to use code directly in a regex. Perl would allow you to use:

m/(\d)x(\d)(?(?{$1 > $2})(?!))/

[Live Example]
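For engines that can't embed code, the same comparison has to happen after the match, in whatever host language is available. A rough sketch in Python, assuming you are somewhere that can run code at all (which is not the case inside Google Analytics); `PAIR` and `second_not_lower` are made-up names:

```python
import re

# Capture both digits; the comparison itself happens outside the regex.
PAIR = re.compile(r"(\d)x(\d)")

def second_not_lower(text):
    """Return True when text is 'AxB' with single digits and B >= A."""
    # fullmatch anchors at both ends, unlike the unanchored Perl pattern above.
    m = PAIR.fullmatch(text)
    if m is None:
        return False
    # Same test as the Perl conditional, inverted: Perl forces failure
    # when $1 > $2, so here we accept when group 1 <= group 2.
    return int(m.group(1)) <= int(m.group(2))
```

With this, second_not_lower("1x2") accepts and second_not_lower("2x1") rejects, matching the examples in the question.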

So the answer to your question is: "No, you cannot do this with RE2 which Google Analytics uses, but yes you can do this with a Perl regex."

Jonathan Mee answered Dec 04 '22 05:12
