
How to remove specific substrings from a set of strings in Python?



Strings are immutable. str.replace creates a new string. This is stated in the documentation:

str.replace(old, new[, count])

Return a copy of the string with all occurrences of substring old replaced by new. [...]

This means you have to build a new set or re-populate the existing one (building a new one is easiest with a set comprehension):

new_set = {x.replace('.good', '').replace('.bad', '') for x in set1}

>>> x = 'Pear.good'
>>> y = x.replace('.good','')
>>> y
'Pear'
>>> x
'Pear.good'

.replace doesn't change the string, it returns a copy of the string with the replacement. You can't change the string directly because strings are immutable.

You need to take the return values from x.replace and put them in a new set.
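If other code holds a reference to the same set object, re-populating it in place may be preferable to rebinding the name. A minimal sketch of that option, assuming the sample data from the question; the names alias and cleaned are made up for illustration:

```python
set1 = {'Apple.good', 'Orange.good', 'Pear.bad', 'Pear.good', 'Banana.bad', 'Potato.bad'}
alias = set1  # hypothetical second reference to the same set object

# Build the cleaned values first; a set cannot be modified while iterating over it.
cleaned = {x.replace('.good', '').replace('.bad', '') for x in set1}

# Swap the contents in place so every reference sees the new values.
set1.clear()
set1.update(cleaned)

print(alias is set1)
print(sorted(set1))
```

Rebinding with set1 = {...} is simpler when nothing else refers to the old set.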


All you need is a bit of black magic!

>>> a = ["cherry.bad","pear.good", "apple.good"]
>>> a = list(map(lambda x: x.replace('.good','').replace('.bad',''),a))
>>> a
['cherry', 'pear', 'apple']
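Note that chained replace() calls remove the substring wherever it occurs, not only at the end. If only a trailing tag should go, one option is an anchored regular expression. This is a sketch using the standard re module; the pattern, the name strip_tag, and the extra "not.bad.idea" entry are assumptions added for illustration:

```python
import re

# Matches ".good" or ".bad" only at the very end of the string.
TAG = re.compile(r'\.(?:good|bad)\Z')

def strip_tag(s):
    """Return s without a trailing '.good' or '.bad' tag."""
    return TAG.sub('', s)

a = ["cherry.bad", "pear.good", "apple.good", "not.bad.idea"]
print([strip_tag(x) for x in a])
# The replace() approach would also alter the last entry:
print("not.bad.idea".replace('.bad', ''))
```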

In Python 3.9+ you could remove the suffix using str.removesuffix('mysuffix'). From the docs:

If the string ends with the suffix string and that suffix is not empty, return string[:-len(suffix)]. Otherwise, return a copy of the original string

So you can either create a new empty set and add each element without the suffix to it:

set1 = {'Apple.good', 'Orange.good', 'Pear.bad', 'Pear.good', 'Banana.bad', 'Potato.bad'}

set2 = set()
for s in set1:
    set2.add(s.removesuffix(".good").removesuffix(".bad"))

Or create the new set using a set comprehension:

set2 = {s.removesuffix(".good").removesuffix(".bad") for s in set1}
   
print(set2)

Output (sets are unordered, so the order may differ):

{'Orange', 'Pear', 'Apple', 'Banana', 'Potato'}
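On Python versions before 3.9, where str.removesuffix is not available, the same behaviour can be approximated with endswith() and slicing. A minimal sketch; the helper name remove_suffix is hypothetical and simply follows the documented rule quoted above:

```python
def remove_suffix(s, suffix):
    """Strip suffix from the end of s if present and non-empty, else return s unchanged."""
    if suffix and s.endswith(suffix):
        return s[:-len(suffix)]
    return s

set1 = {'Apple.good', 'Orange.good', 'Pear.bad', 'Pear.good', 'Banana.bad', 'Potato.bad'}
set2 = {remove_suffix(remove_suffix(s, ".good"), ".bad") for s in set1}

print(sorted(set2))  # ['Apple', 'Banana', 'Orange', 'Pear', 'Potato']
```

The check for an empty suffix matters because s[:-0] would return an empty string rather than the original.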