Python re.sub() weirdness

Tags: python, regex

I'm very new to Python; in fact, this is my first script.

I'm struggling with Python's regular expressions, specifically re.sub().

I have the following code:

import re
variableTest = "192"
# Pass re.M as flags=...; the fourth positional argument of re.sub() is count.
test = re.sub(r'(\$\{\d{1,2}\:)example\.com(\})', r'\1' + variableTest + r'\2', searchString, flags=re.M)

With this I'm trying to match something like "host": "${9:example.com}" within searchString and replace example.com with a server name or IP address.

If variableTest contains an IP, it fails. I get the following error: sre_constants.error: invalid group reference

I've tested it with variableTest equal to "127.0.0.1", "1", "192", and "192.168". Only "127.0.0.1" works; the rest fail. If I prepend a letter to the others, they work too.

variableTest is a string; I verified this with type(variableTest).

I'm totally lost as to why this is.

If I remove r'\1' from the replacement string, it also works. r'\1' will contain text of the form ${N:, where N is the one- or two-digit number matched by \d{1,2}.

Any help will be greatly appreciated!

tone7 asked Mar 05 '13 14:03




1 Answer

The problem is that putting an IP in variableTest will result in a replacement string like this:

r'\18.8.8.8\2'

As you can see, the first group reference is to group 18, not group 1. Hence, re complains about the invalid group reference.
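The ambiguity can be reproduced with a short sketch. The sample text and the helper name try_sub are made up for illustration; the pattern is the one from the question with the dot escaped. Note that "127.0.0.1" does not raise an error, but that does not mean the output is right: \127 is read as a three-digit octal escape rather than as group 1, so the captured prefix is silently dropped.

```python
import re

# Pattern from the question (dot escaped); sample_text is an assumed input.
pattern = r'(\$\{\d{1,2}:)example\.com(\})'
sample_text = '"host": "${9:example.com}"'


def try_sub(value):
    """Build the replacement the way the question does and report the outcome."""
    replacement = r'\1' + value + r'\2'
    try:
        return re.sub(pattern, replacement, sample_text)
    except re.error as exc:  # raised for an invalid group reference
        return 'error: ' + str(exc)


# Values that start with a digit change the meaning of the leading \1.
for value in ('192', '8.8.8.8', '127.0.0.1', 'a192'):
    print(repr(value), '->', try_sub(value))
```

Only the value that starts with a letter leaves \1 intact, which matches the behaviour described in the question.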

In this case, you want to use the \g<n> syntax instead:

r'\g<1>' + variableTest + r'\g<2>'

which produces e.g. r'\g<1>8.8.8.8\g<2>'.
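Put together, a minimal runnable sketch of the fix might look like this. The sample text and the IP value are assumptions chosen for illustration. The second call shows a different technique that is not part of the answer above: passing a function as the replacement, which avoids building a template string at all.

```python
import re

# Pattern from the question (dot escaped); sample_text is an assumed input.
pattern = r'(\$\{\d{1,2}:)example\.com(\})'
sample_text = '"host": "${9:example.com}"'
variableTest = '192.168.0.1'  # illustrative value

# \g<1> ends at the closing '>', so the digits that follow cannot extend it.
fixed = re.sub(pattern, r'\g<1>' + variableTest + r'\g<2>', sample_text)
print(fixed)

# Alternative: a function replacement. Its return value is inserted as-is,
# so backslashes or digits in variableTest are never parsed as a template.
alt = re.sub(pattern,
             lambda m: m.group(1) + variableTest + m.group(2),
             sample_text)
print(alt)
```

Both calls should print "host": "${9:192.168.0.1}". The function form is worth considering when the inserted value comes from user input and might itself contain backslashes.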

nneonneo answered Sep 23 '22 23:09