Replace strings in a list (using re.sub)

Tags:

python

I am trying to strip part of each file name, including the extension, from a list of file names. I would like to loop through the items (files) and remove the extensions, but I don't know how to loop through the list when the third parameter of re.sub must be a single string, e.g. re.sub(pattern, repl, string, count=0, flags=0).

import re

file_lst = ['cats1.fa', 'cats2.fa', 'dog1.fa', 'dog2.fa']
file_lst_trimmed =[]

for file in file_lst:
    file_lst_trimmed = re.sub(r'1.fa', '', file)

The issue here is that re.sub expects a single string, but I want to apply it to every string in a list.

Thanks for any advice!

Graeme asked Nov 27 '17 18:11



2 Answers

You can use a list comprehension to construct the new list with the cleaned-up file names. \d is the regex token that matches a single digit, \. matches a literal dot, and $ matches only at the end of the string.

file_lst_trimmed = [re.sub(r'\d\.fa$', '', file) for file in file_lst]

The results:

>>> file_lst_trimmed 
['cats', 'cats', 'dog', 'dog']
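
For comparison with the loop in the question, the same result can also be built with an explicit for loop. This is only a minimal sketch: it assumes the same file_lst as in the question and reuses the pattern above. The key change is appending each re.sub result to the list instead of reassigning the list variable on every iteration.

```python
import re

file_lst = ['cats1.fa', 'cats2.fa', 'dog1.fa', 'dog2.fa']
file_lst_trimmed = []

for file in file_lst:
    # re.sub works on one string at a time and returns a new string,
    # so append each result rather than overwriting the list.
    file_lst_trimmed.append(re.sub(r'\d\.fa$', '', file))

print(file_lst_trimmed)  # ['cats', 'cats', 'dog', 'dog']
```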
James answered Oct 20 '22 07:10


You can try this:

import re
file_lst = ['cats1.fa', 'cats2.fa', 'dog1.fa', 'dog2.fa']
final_list = [re.sub(r'\d+\.\w+$', '', i) for i in file_lst]

Output:

['cats', 'cats', 'dog', 'dog']
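
The two patterns in these answers give the same result on the sample list but can differ on other inputs. The sketch below illustrates this with hypothetical file names (cats10.fa and dog2.fasta are not from the question): \d\.fa$ removes exactly one digit followed by ".fa", while \d+\.\w+$ removes any run of digits followed by any word-character extension.

```python
import re

# Hypothetical names chosen to show where the two patterns diverge.
names = ['cats1.fa', 'cats10.fa', 'dog2.fasta']

# One digit, a literal dot, then "fa" at the end of the string.
narrow = [re.sub(r'\d\.fa$', '', n) for n in names]

# One or more digits, a literal dot, then any word characters to the end.
broad = [re.sub(r'\d+\.\w+$', '', n) for n in names]

print(narrow)  # ['cats', 'cats1', 'dog2.fasta']
print(broad)   # ['cats', 'cats', 'dog']
```

Which one is appropriate depends on whether the numbering can exceed one digit and whether other extensions should be stripped too.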
Ajax1234 answered Oct 20 '22 07:10