
How to re.sub() an optional matching group using regex in Python?

Tags:

python

regex

My problem is quite simple.

I have a URL that sometimes ends with specific characters. If they are present, I would like to carry them over to my new URL.

import re

test1 = "url#123"
test2 = "url"

r = re.sub(r"url(#[0-9]+)?", r"new_url\1", test1)
# Expected result: "new_url#123"
# Actual result: "new_url#123"

r = re.sub(r"url(#[0-9]+)?", r"new_url\1", test2)
# Expected result: "new_url"
# Actual result: "error: unmatched group"

Of course, I cannot simply do re.sub("url", "new_url", test), because the string could be, for example, "url/123", and in that case I do not want to change anything.

Delgan asked Jul 01 '14 15:07




1 Answer

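In Python versions before 3.5, the replacement string cannot reference an optional group that did not take part in the match; re.sub() raises "unmatched group". (Since Python 3.5, an unmatched group is replaced with an empty string.)

How about the following approach? Make the contents of the group optional rather than the group itself, so the group always takes part in the match, possibly capturing an empty string: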

>>> import re
>>> test1 = "url#123"
>>> test2 = "url"
>>> re.sub(r"url((?:#[0-9]+)?)", r"new_url\1", test1)
'new_url#123'
>>> re.sub(r"url((?:#[0-9]+)?)", r"new_url\1", test2)
'new_url'
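
Another option, sketched here as an assumption and not as part of the original answer, is to pass a function as the replacement so that a missing group is turned into an empty string explicitly. The helper name `rewrite` is hypothetical.

```python
import re

# Same pattern as in the question: the whole group is optional.
PATTERN = re.compile(r"url(#[0-9]+)?")

def rewrite(text):
    """Replace "url" with "new_url", keeping an optional "#digits" suffix."""
    # m.group(1) is None when the optional group did not match,
    # so fall back to an empty string.
    return PATTERN.sub(lambda m: "new_url" + (m.group(1) or ""), text)

print(rewrite("url#123"))
print(rewrite("url"))
```

Because the function handles the None case itself, this should behave the same whether or not the Python version tolerates unmatched groups in a replacement string.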

BTW, if you use the third-party regex module instead of re, you can reference an optional matching group directly:

>>> import regex
>>> test1 = "url#123"
>>> test2 = "url"
>>> regex.sub(r"url(#[0-9]+)?", r"new_url\1", test1)
'new_url#123'
>>> regex.sub(r"url(#[0-9]+)?", r"new_url\1", test2)
'new_url'
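
Note that the patterns above still match the "url" prefix of a string such as "url/123". If that case must be left untouched, as the question asks, one possible sketch is to anchor the pattern to the whole string. This assumes the input consists of the URL alone; the helper name `rewrite_exact` is hypothetical.

```python
import re

# \A and \Z anchor the match to the entire string, so "url/123" does not match.
EXACT = re.compile(r"\Aurl((?:#[0-9]+)?)\Z")

def rewrite_exact(text):
    """Rewrite only strings that are exactly "url" or "url#digits"."""
    # The inner non-capturing group is optional, so group 1 always
    # participates in the match and \1 is safe in the replacement.
    return EXACT.sub(r"new_url\1", text)

print(rewrite_exact("url#123"))
print(rewrite_exact("url"))
print(rewrite_exact("url/123"))
```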

falsetru answered Oct 26 '22 18:10