Remove all occurrences of words in a string from a python list

Tags: python, regex

I'm trying to match and remove every word in a list from a string using a compiled regex, but I'm struggling to avoid matches that fall inside other words.

Current:

 import re

 REMOVE_LIST = ["a", "an", "as", "at", ...]

 # Join the words into one alternation; nothing here limits matches to whole words
 remove = '|'.join(REMOVE_LIST)
 regex = re.compile(r'('+remove+')', flags=re.IGNORECASE)
 out = regex.sub("", text)

In: "The quick brown fox jumped over an ant"

Out: "quick brown fox jumped over t"

Expected: "quick brown fox jumped over"

I've tried changing the pattern passed to re.compile to the following, but to no avail:

 regex = re.compile(r'\b('+remove+')\b', flags=re.IGNORECASE)

Any suggestions, or am I missing something glaringly obvious?

asked Mar 15 '13 15:03

Ogre


1 Answer

Here is a suggestion that avoids regex, which you may want to consider:

>>> sentence = 'word1 word2 word3 word1 word2 word4'
>>> remove_list = ['word1', 'word2']
>>> word_list = sentence.split()
>>> ' '.join([i for i in word_list if i not in remove_list])
'word3 word4'
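
A possible variation on the snippet above (not part of the original answer): the list comparison is case-sensitive, so "The" would survive even when "the" is in the list, whereas the question's regex uses re.IGNORECASE. A minimal sketch, assuming a hypothetical helper name remove_words, a shortened word list, and a lower-cased set for the lookup:

```python
def remove_words(sentence, remove_list):
    """Drop whole words found in remove_list, ignoring case (hypothetical helper)."""
    # A set gives fast membership tests; lower-casing both sides ignores case.
    remove_set = {word.lower() for word in remove_list}
    kept = [word for word in sentence.split() if word.lower() not in remove_set]
    return ' '.join(kept)


print(remove_words("The quick brown fox jumped over an ant",
                   ["the", "a", "an", "as", "at"]))
```

Because the sentence is split on whitespace, a word with punctuation attached (such as "an,") would not match an entry in the list and would be kept.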
answered Oct 26 '22 13:10

jurgenreza
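
A note on the regex attempt in the question: only the first \b there sits inside a raw string literal. The trailing ')\b' is an ordinary string, so Python reads its \b as a backspace character rather than a word boundary, which would explain why that pattern matches nothing. A minimal sketch of a corrected pattern, assuming a shortened stand-in for the elided word list and a hypothetical function name strip_words; re.escape is added only as a precaution in case a listed word contains regex metacharacters:

```python
import re

# Shortened stand-in for the question's list, whose full contents are elided.
REMOVE_LIST = ["a", "an", "as", "at", "the"]

# Both pattern pieces are raw strings, so each \b is a regex word boundary.
pattern = r'\b(?:' + '|'.join(map(re.escape, REMOVE_LIST)) + r')\b'
regex = re.compile(pattern, flags=re.IGNORECASE)


def strip_words(text):
    """Remove listed words, then collapse leftover whitespace (hypothetical helper)."""
    without_words = regex.sub("", text)
    # Splitting and re-joining removes the double spaces left by the substitution.
    return ' '.join(without_words.split())


print(strip_words("The quick brown fox jumped over an ant"))
```

With word boundaries on both sides, "an" is removed as a standalone word while "ant" is left intact.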