My problem is quite simple.
I have a URL that sometimes ends with specific characters. If they are present, I would like to carry them over to my new URL.
import re
test1 = "url#123"
test2 = "url"
r = re.sub(r"url(#[0-9]+)?", r"new_url\1", test1)
# Expected result: "new_url#123"
# Actual result: "new_url#123"
r = re.sub(r"url(#[0-9]+)?", r"new_url\1", test2)
# Expected result: "new_url"
# Actual result: "error: unmatched group"
Of course, I cannot just do re.sub("url", "new_url", test), because the input could be, for example, "url/123", and in that case I do not want to change anything.
To make a group optional, put a "?" after it. The question mark makes the preceding group or pattern optional.
The sub() function belongs to Python's regular expression module, re, so import re before using it. It returns a string in which every non-overlapping match of the pattern is replaced by the replacement string.
Use re.sub() when you want to replace text that matches a regular expression rather than an exact string. Pass the regex pattern as the first argument, the replacement as the second, and the string to process as the third.
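As a quick illustration of that argument order, here is a minimal sketch. The pattern and the sample string are made up for the example and are not part of the question.

```python
import re

# Argument order: pattern, replacement, string to process.
# Every run of digits in the input is replaced by "N".
result = re.sub(r"[0-9]+", "N", "url#123 and url#45")
print(result)  # url#N and url#N
```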
You can make several tokens optional by grouping them together using parentheses, and placing the question mark after the closing parenthesis.
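To see what an optional group captures, the sketch below matches the pattern from the question against both test strings. The variable names are only illustrative. When the optional part is absent, the group does not participate in the match and group(1) is None.

```python
import re

# Same pattern as in the question: "url" followed by an optional "#digits" group.
pattern = re.compile(r"url(#[0-9]+)?")

with_suffix = pattern.fullmatch("url#123")
without_suffix = pattern.fullmatch("url")

print(with_suffix.group(1))     # #123
print(without_suffix.group(1))  # None (the group did not participate)
```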
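On Python versions before 3.5, you cannot reference an optional group that did not match in the replacement string; re.sub raises "unmatched group". Since Python 3.5, an unmatched group is replaced with an empty string.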
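How about the following approach? Make the contents of the group optional instead of the group itself, so the group always matches, possibly as an empty string: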
>>> import re
>>> test1 = "url#123"
>>> test2 = "url"
>>> re.sub(r"url((?:#[0-9]+)?)", r"new_url\1", test1)
'new_url#123'
>>> re.sub(r"url((?:#[0-9]+)?)", r"new_url\1", test2)
'new_url'
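Another option, if you would rather keep the original pattern, is to pass a function as the replacement and handle the missing group yourself. This is only a sketch: the helper name rewrite is hypothetical, and it relies on group(1) being None when the optional group did not match.

```python
import re

def rewrite(url):
    # Hypothetical helper: keeps the optional "#digits" suffix if present.
    # m.group(1) is None when the suffix is missing, so fall back to "".
    return re.sub(
        r"url(#[0-9]+)?",
        lambda m: "new_url" + (m.group(1) or ""),
        url,
    )

print(rewrite("url#123"))  # new_url#123
print(rewrite("url"))      # new_url
```

Because the replacement is computed in Python rather than through a \1 backreference, this should not depend on how the re module treats unmatched groups in replacement templates.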
BTW, if you use the third-party regex module, you can reference an optional group in the replacement even when it did not match:
>>> import regex
>>> test1 = "url#123"
>>> test2 = "url"
>>> regex.sub(r"url(#[0-9]+)?", r"new_url\1", test1)
'new_url#123'
>>> regex.sub(r"url(#[0-9]+)?", r"new_url\1", test2)
'new_url'