re.sub in Python 3.3

I am trying to change strings of the form file1 to file01. I am really new to Python and can't figure out what should go in the 'repl' argument when using a pattern. Can anyone give me a hand?

text = 'file1 file2 file3'

x = re.sub(r'file[1-9]',r'file\0\w',text) #I'm not sure what should go in repl.

user2243215 asked May 23 '13 06:05


2 Answers

You could try this:

>>> import re
>>> text = 'file1 file2 file3'
>>> x = re.sub(r'file([1-9])',r'file0\1',text)
>>> x
'file01 file02 file03'

The parentheses around [1-9] form a capturing group. Since it is the first group in the pattern, \1 in the replacement string refers to the digit it matched.
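
A small aside on group references, as a sketch rather than part of the original answer: \1 becomes ambiguous when the replacement needs a literal digit right after the group, because \10 is read as a reference to group 10. The re module also accepts \g<1>, which avoids this. The "append a zero" case below is made up purely for illustration.

```python
import re

text = 'file1 file2 file3'

# \1 refers to the first capturing group (the single digit).
padded = re.sub(r'file([1-9])', r'file0\1', text)
print(padded)  # file01 file02 file03

# To put a literal 0 *after* the group, r'file\10' would be read as
# "group 10" and fail; \g<1> makes the group number unambiguous.
times_ten = re.sub(r'file([1-9])', r'file\g<1>0', text)
print(times_ten)  # file10 file20 file30
```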

Also, if you don't want to add the zero for files with 2 digits or more, you could require the digit to be followed by whitespace or the end of the string, and include that in the captured group:

x = re.sub(r'file([1-9](\s|$))',r'file0\1',text)
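
An alternative sketch for the same goal, assuming the intent is "a single digit that is not followed by another digit": a negative lookahead (?!\d) inspects the next character without consuming it, so the group holds only the digit and the pattern also works before punctuation. The sample string is invented for illustration.

```python
import re

text = 'file1 file2, file10 file3'

# (?!\d) asserts that no digit follows, without making it part of the match.
result = re.sub(r'file([1-9])(?!\d)', r'file0\1', text)
print(result)  # file01 file02, file10 file03
```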

Revisiting this answer, here is a more generic solution using str.format() and a lambda expression:

import re
fmt = '{:03d}'                 # Let's say we want 3 digits with leading zeroes
s = 'file1 file2 file3 text40'
result = re.sub(r"([A-Za-z_]+)([0-9]+)",
                lambda x: x.group(1) + fmt.format(int(x.group(2))),
                s)
print(result)
# 'file001 file002 file003 text040'

Some details about the lambda expression:

lambda x: x.group(1) + fmt.format(int(x.group(2)))
#         ^--------^   ^-^        ^-------------^
#          filename   format     file number ([0-9]+) converted to int
#        ([A-Za-z_]+)            so format() can work with our format

I am using the expression [A-Za-z_]+ on the assumption that the filename contains only letters and underscores besides the trailing digits. Pick a more appropriate expression if required.
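
The same idea can be sketched with a named replacement function instead of a lambda. Note that pad_numbers and its width parameter are hypothetical names introduced here, and str.zfill is swapped in for the int()/format() round trip. One difference worth knowing: zfill never shortens the digits, so a number that already has extra leading zeros is left as it is.

```python
import re

def pad_numbers(text, width=3):
    """Zero-pad the digits that follow each run of letters/underscores."""
    def repl(match):
        name, number = match.group(1), match.group(2)
        # str.zfill pads the digit string on the left with zeros.
        return name + number.zfill(width)
    return re.sub(r'([A-Za-z_]+)([0-9]+)', repl, text)

print(pad_numbers('file1 file2 file3 text40'))  # file001 file002 file003 text040
print(pad_numbers('file1 file12', width=2))     # file01 file12
```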

Jerry answered Sep 20 '22 12:09


To match only files with a single digit at the end, use a word boundary \b:

>>> text = ' '.join('file{}'.format(i) for i in range(12))
>>> text
'file0 file1 file2 file3 file4 file5 file6 file7 file8 file9 file10 file11'
>>> import re
>>> re.sub(r'file(\d)\b',r'file0\1',text)
'file00 file01 file02 file03 file04 file05 file06 file07 file08 file09 file10 file11'
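
One caveat about \b, shown as a quick sketch with made-up names: a word boundary only exists between a word character (letter, digit, or underscore) and a non-word character, so a digit followed by an underscore is not at a boundary and is left unpadded.

```python
import re

pattern = r'file(\d)\b'
samples = ['file1', 'file10', 'file3.txt', 'file2_old']

# Apply the same substitution to each sample and collect the results.
results = {name: re.sub(pattern, r'file0\1', name) for name in samples}

for name, padded in results.items():
    print(name, '->', padded)
# file1 -> file01
# file10 -> file10
# file3.txt -> file03.txt
# file2_old -> file2_old   ('2' and '_' are both word characters)
```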

Mark Tolonen answered Sep 16 '22 12:09