I'm trying to remove the last occurrence of a substring from a string using re.sub in Python, but I'm stuck on the regex pattern. Can someone help me find the correct pattern?
String = "cr US TRUMP DE NIRO 20161008cr_x080b.wmv"
or
String = "crcrUS TRUMP DE NIRO 20161008cr.xml"
I want to remove the last occurrence of "cr" and everything after it up to the extension.
The desired output strings are:
"cr US TRUMP DE NIRO 20161008.wmv"
"crcrUS TRUMP DE NIRO 20161008.xml"
I'm using re.sub to replace it:
re.sub('pattern', '', String)
Please advise.
Using a greedy quantifier and a capture group:
re.sub(r'(.*)cr[^.]*', '\\1', input)
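For reference, here is a minimal runnable sketch of that substitution applied to both sample strings from the question. The helper name `strip_last_cr` is made up for illustration. Note that the greedy `(.*)` finds the last "cr" anywhere in the string, so a "cr" inside the extension itself would also be matched.

```python
import re

def strip_last_cr(name):
    # Greedy (.*) backtracks to the LAST "cr" in the string;
    # [^.]* then consumes everything up to (not including) the next dot.
    return re.sub(r'(.*)cr[^.]*', r'\1', name)

print(strip_last_cr("cr US TRUMP DE NIRO 20161008cr_x080b.wmv"))  # cr US TRUMP DE NIRO 20161008.wmv
print(strip_last_cr("crcrUS TRUMP DE NIRO 20161008cr.xml"))       # crcrUS TRUMP DE NIRO 20161008.xml
```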
An alternative solution uses the str.rfind(sub[, start[, end]]) method:
string = "cr US TRUMP DE NIRO 20161008cr_x080b.wmv"
last_position = string.rfind('cr')
string = string[:last_position] + string[string.rfind('.'):]
print(string) #cr US TRUMP DE NIRO 20161008.wmv
Besides, rfind is much faster in this case. Here are the measurement results:
using str.rfind(...): 0.0054836273193359375
using re.sub(...): 0.4017353057861328
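The answer doesn't show how those figures were measured, so the following is only a rough sketch of one way to compare the two approaches with `timeit`. The function names, the sample string choice, and the iteration count are all assumptions, and the guard for a missing "cr" is an addition that is not in the original snippet. Absolute numbers will differ from machine to machine.

```python
import re
import timeit

SAMPLE = "cr US TRUMP DE NIRO 20161008cr_x080b.wmv"

def with_rfind(s):
    # Cut from the last "cr" up to the last dot, keeping the extension.
    last = s.rfind('cr')
    dot = s.rfind('.')
    if last == -1 or dot < last:
        # Guard (an addition): no "cr" before the extension, leave s unchanged.
        return s
    return s[:last] + s[dot:]

def with_re_sub(s):
    # The greedy capture-group pattern from the earlier answer.
    return re.sub(r'(.*)cr[^.]*', r'\1', s)

n = 10_000  # arbitrary iteration count, chosen only to keep the run short
print("str.rfind:", timeit.timeit(lambda: with_rfind(SAMPLE), number=n))
print("re.sub:   ", timeit.timeit(lambda: with_re_sub(SAMPLE), number=n))
```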
You can use this negative lookahead regex:
repl = re.sub(r"cr(?:(?!cr)[^.])*(?=\.[^.]+$)", "", input)
RegEx Breakup:
cr # match cr
(?: # non-capturing group start
(?! # negative lookahead start
cr # match cr
) # negative lookahead end
[^.] # match anything but DOT
) # non-capturing group end
* # match 0 or more characters, none of which starts another cr
(?= # positive lookahead start
\. # match DOT
[^.]+ # followed by 1 or more anything but DOT
$ # end of input
) # positive lookahead end
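The same breakdown can be kept right next to the pattern by compiling it with `re.VERBOSE`, which ignores whitespace and allows `#` comments inside the regex. This is just a sketch; the name `LAST_CR` is chosen here for illustration.

```python
import re

LAST_CR = re.compile(r"""
    cr              # literal "cr"
    (?:             # non-capturing group
        (?!cr)      # the next two characters must not be "cr"
        [^.]        # any single character except a dot
    )*              # repeated zero or more times
    (?=             # positive lookahead:
        \.[^.]+$    # a dot, the extension, then end of string
    )
""", re.VERBOSE)

for name in ("cr US TRUMP DE NIRO 20161008cr_x080b.wmv",
             "crcrUS TRUMP DE NIRO 20161008cr.xml"):
    print(LAST_CR.sub("", name))
# cr US TRUMP DE NIRO 20161008.wmv
# crcrUS TRUMP DE NIRO 20161008.xml
```

Because of the lookahead, only a "cr" that is followed by no further "cr" and no dot before the extension is removed, so earlier occurrences are left alone.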