
re.sub() - Regex for replacing the last occurrence of a substring in a string

I'm trying to replace the last occurrence of a substring in a string using re.sub in Python, but I'm stuck on the regex pattern. Can someone help me find the correct pattern?

String = "cr US TRUMP DE NIRO 20161008cr_x080b.wmv"

or

String = "crcrUS TRUMP DE NIRO 20161008cr.xml"

I want to remove the last occurrence of "cr" together with everything that follows it, up to (but not including) the file extension.

The desired output strings are:

"cr US TRUMP DE NIRO 20161008.wmv"
"crcrUS TRUMP DE NIRO 20161008.xml"

I'm using re.sub to replace it.

re.sub('pattern', '', String)

Please advise.

Sanchit asked Oct 25 '16 11:10


3 Answers

Use a greedy quantifier and a capture group:

re.sub(r'(.*)cr[^.]*', '\\1', input)
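A minimal sketch of how this pattern behaves on the two strings from the question. The helper name strip_last_cr is made up for illustration. Because (.*) is greedy, it consumes as much as it can, so the literal "cr" that follows it can only bind to the last "cr" in the string; [^.]* then consumes everything up to the extension dot.

```python
import re

# Hypothetical helper name, used only for this illustration.
def strip_last_cr(name):
    # (.*) is greedy, so the literal "cr" after it binds to the LAST "cr";
    # [^.]* then consumes everything up to (not including) the next dot.
    # The replacement r'\1' keeps only what the group captured.
    return re.sub(r'(.*)cr[^.]*', r'\1', name, count=1)

print(strip_last_cr("cr US TRUMP DE NIRO 20161008cr_x080b.wmv"))
# cr US TRUMP DE NIRO 20161008.wmv
print(strip_last_cr("crcrUS TRUMP DE NIRO 20161008cr.xml"))
# crcrUS TRUMP DE NIRO 20161008.xml
```

A string that contains no "cr" is returned unchanged, since the pattern cannot match.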
Casimir et Hippolyte answered Sep 30 '22 13:09


An alternative solution uses the str.rfind(sub[, start[, end]]) method:

string = "cr US TRUMP DE NIRO 20161008cr_x080b.wmv"
last_position = string.rfind('cr')
string = string[:last_position] + string[string.rfind('.'):]

print(string)  #cr US TRUMP DE NIRO 20161008.wmv
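One thing to watch with the slicing approach: str.rfind returns -1 when the substring is absent, and string[:-1] would then silently drop the last character. A guarded variant might look like the sketch below; the function name strip_last_marker and its marker parameter are assumptions for illustration, not part of the answer.

```python
# Hypothetical helper; the name and the `marker` parameter are illustrative.
def strip_last_marker(name, marker="cr"):
    last_position = name.rfind(marker)
    dot_position = name.rfind(".")
    # rfind returns -1 when nothing is found. Also require the marker to
    # sit before the extension dot; otherwise leave the name unchanged.
    if last_position == -1 or dot_position < last_position:
        return name
    return name[:last_position] + name[dot_position:]

print(strip_last_marker("cr US TRUMP DE NIRO 20161008cr_x080b.wmv"))
# cr US TRUMP DE NIRO 20161008.wmv
print(strip_last_marker("plain.txt"))
# plain.txt
```

Note that this variant always cuts up to the last dot, so for names with several dots it can give a different result from the regex answers.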

Besides, rfind is much faster in this case. Here are the measurement results:
using str.rfind(...) : 0.0054836273193359375
using re.sub(...)       : 0.4017353057861328
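The answer does not say how these numbers were obtained. One way to run a comparable measurement is with the standard timeit module, as in the sketch below. The function names and the iteration count are assumptions, and the absolute timings will differ by machine and Python version.

```python
import re
import timeit

name = "cr US TRUMP DE NIRO 20161008cr_x080b.wmv"

def with_rfind(s):
    # Slice out everything from the last "cr" up to the last dot.
    return s[:s.rfind('cr')] + s[s.rfind('.'):]

def with_re_sub(s):
    # Greedy capture-group pattern from the first answer.
    return re.sub(r'(.*)cr[^.]*', r'\1', s)

# Assumed iteration count; the original answer does not state one.
runs = 10_000
t_rfind = timeit.timeit(lambda: with_rfind(name), number=runs)
t_re = timeit.timeit(lambda: with_re_sub(name), number=runs)

print(f"using str.rfind(...): {t_rfind:.4f} s for {runs} runs")
print(f"using re.sub(...)   : {t_re:.4f} s for {runs} runs")
```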

RomanPerekhrest answered Sep 30 '22 11:09


You can use this negative lookahead regex:

repl = re.sub(r"cr(?:(?!cr)[^.])*(?=\.[^.]+$)", "", input)

RegEx Breakup:

cr         # match cr
(?:        # non-capturing group start
   (?!     # negative lookahead start
      cr   # match cr
   )       # negative lookahead end
   [^.]    # match anything but DOT
)          # non-capturing group end
*          # repeat 0 or more times; no matched character may start another cr
(?=        # positive lookahead start
   \.      # match DOT
   [^.]+   # followed by 1 or more anything but DOT
   $       # end of input
)          # positive lookahead end
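The breakdown above can be written directly as a commented pattern with re.VERBOSE. This is a sketch only; the constant name LAST_CR is an assumption for illustration.

```python
import re

# Hypothetical constant name; the pattern mirrors the breakdown above.
LAST_CR = re.compile(r"""
    cr                # match cr
    (?:(?!cr)[^.])*   # non-dot characters, none of which starts another cr
    (?=\.[^.]+$)      # must be followed by the final .extension
""", re.VERBOSE)

for name in ("cr US TRUMP DE NIRO 20161008cr_x080b.wmv",
             "crcrUS TRUMP DE NIRO 20161008cr.xml"):
    print(LAST_CR.sub("", name))
# cr US TRUMP DE NIRO 20161008.wmv
# crcrUS TRUMP DE NIRO 20161008.xml
```

Unlike the capture-group approach, nothing needs to be put back in the replacement: the lookahead only checks for the extension without consuming it.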
anubhava answered Sep 30 '22 11:09