
How to replace only part of the match with python re.sub

Tags: python, regex

I need to handle two cases with a single regular expression and do a replacement:

'long.file.name.jpg' -> 'long.file.name_suff.jpg'

'long.file.name_a.jpg' -> 'long.file.name_suff.jpg'

I'm trying to do the following

re.sub(r'(_a)?\.[^.]*$', '_suff.', "long.file.name.jpg")

But this cuts off the extension '.jpg', and I get

long.file.name_suff. instead of long.file.name_suff.jpg. I understand that this is because of the [^.]*$ part, but I can't leave it out, because I have to find either the last occurrence of '_a' or the last '.' to replace.

Is there a way to replace only part of the match?

Arty asked May 04 '10 08:05




2 Answers

Put a capture group around the part that you want to preserve, and then include a reference to that capture group within your replacement text.

re.sub(r'(_a)?\.([^.]*)$', r'_suff.\2', "long.file.name.jpg")
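
To check that this covers both inputs from the question, here is a minimal sketch. The helper name add_suffix and its suffix parameter are made up for illustration and are not part of the original answer; \g<2> is just the explicit spelling of the \2 backreference.

```python
import re

# Group 1 is the optional "_a"; group 2 is the extension we want to keep.
PATTERN = re.compile(r'(_a)?\.([^.]*)$')

def add_suffix(filename, suffix='_suff'):
    # Hypothetical helper: \g<2> re-inserts the captured extension
    # after the new suffix, so only "_a" (if present) is dropped.
    return PATTERN.sub(suffix + r'.\g<2>', filename)

print(add_suffix('long.file.name.jpg'))    # long.file.name_suff.jpg
print(add_suffix('long.file.name_a.jpg'))  # long.file.name_suff.jpg
```

Group 1 is simply left out of the replacement, which is why the optional _a disappears while the extension survives.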
Amber answered Sep 21 '22 22:09


re.sub(r'(?:_a)?\.([^.]*)$', r'_suff.\1', "long.file.name.jpg")

?: starts a non-capturing group, so (?:_a) matches _a without giving it a group number; the question mark after the group makes it optional.

In plain English: match the final .<anything>, together with the _a immediately before it if there is one. Because the _a group is not numbered, the extension is group 1 and \1 puts it back.
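
The effect on group numbering can be seen by comparing the two patterns side by side. This is only an illustrative sketch; the variable names are made up.

```python
import re

capturing = re.compile(r'(_a)?\.([^.]*)$')        # "_a" is group 1, extension is group 2
non_capturing = re.compile(r'(?:_a)?\.([^.]*)$')  # "_a" is unnumbered, extension is group 1

name = 'long.file.name_a.jpg'

groups_capturing = capturing.search(name).groups()
groups_non_capturing = non_capturing.search(name).groups()
result = non_capturing.sub(r'_suff.\1', name)

print(groups_capturing)      # ('_a', 'jpg')
print(groups_non_capturing)  # ('jpg',)
print(result)                # long.file.name_suff.jpg
```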

Another way to do this would be to use a lookbehind. I mention it because lookarounds are super useful, yet I went 15 years of writing regexes without knowing about them.
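
The answer does not show the lookaround version. For this particular task a lookahead (the forward-looking sibling of a lookbehind) seems the easier fit, so the sketch below uses that instead; treat it as one possible reading, not the answerer's own code. The helper name add_suffix_lookahead is hypothetical.

```python
import re

# Match an optional "_a" only when the final ".ext" comes right after it.
# The lookahead (?=...) checks for the extension without consuming it,
# so the extension never needs to be put back in the replacement.
BEFORE_EXT = re.compile(r'(?:_a)?(?=\.[^.]*$)')

def add_suffix_lookahead(filename, suffix='_suff'):
    # count=1 matters: the pattern can also match the empty string just
    # before the last dot, so without it a second, empty match right after
    # "_a" could be replaced as well.
    return BEFORE_EXT.sub(suffix, filename, count=1)

print(add_suffix_lookahead('long.file.name.jpg'))    # long.file.name_suff.jpg
print(add_suffix_lookahead('long.file.name_a.jpg'))  # long.file.name_suff.jpg
```

Here the match covers only the part to be replaced, which is the most literal answer to "replace only part of the match".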

Amarghosh answered Sep 20 '22 22:09