Remove all occurrences of words in a string from a Python list

Tags:

python

regex

I'm trying to match and remove all words in a list from a string using a compiled regex, but I'm struggling to avoid matching occurrences inside other words.

Current:

 REMOVE_LIST = ["a", "an", "as", "at", ...]

 remove = '|'.join(REMOVE_LIST)
 regex = re.compile(r'('+remove+')', flags=re.IGNORECASE)
 out = regex.sub("", text)

In: "The quick brown fox jumped over an ant"

Out: "quick brown fox jumped over t"

Expected: "quick brown fox jumped over"

I've tried changing the pattern string passed to re.compile to the following, but to no avail:

 regex = re.compile(r'\b('+remove+')\b', flags=re.IGNORECASE)

Any suggestions, or am I missing something glaringly obvious?

478

asked Mar 15 '13 15:03

Ogre

1 Answer

Here is a suggestion that avoids regex, which you may want to consider:

>>> sentence = 'word1 word2 word3 word1 word2 word4'
>>> remove_list = ['word1', 'word2']
>>> word_list = sentence.split()
>>> ' '.join([i for i in word_list if i not in remove_list])
'word3 word4'
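
If you would rather keep the regex, the likely reason the \b attempt in the question fails is that only the first string fragment is raw: in the plain literal ')\b', Python turns \b into a backspace character rather than a regex word boundary. Below is a minimal sketch of one way to write it, not a definitive fix. The list contents (including "the") and the helper name remove_words are assumptions for illustration, since the question's full REMOVE_LIST is elided.

```python
import re

# Hypothetical stand-in for the question's REMOVE_LIST (the real list is elided).
REMOVE_LIST = ["a", "an", "as", "at", "the"]

# Both \b markers live in raw strings, so they reach the regex engine as
# word boundaries. re.escape protects against entries that contain
# regex metacharacters.
pattern = r'\b(?:' + '|'.join(map(re.escape, REMOVE_LIST)) + r')\b'
regex = re.compile(pattern, flags=re.IGNORECASE)


def remove_words(text):
    """Remove whole-word matches, then collapse the leftover whitespace."""
    stripped = regex.sub("", text)
    return ' '.join(stripped.split())


print(remove_words("The quick brown fox jumped over an ant"))
# prints: quick brown fox jumped over ant
```

Note that "ant" survives here because it is not in the assumed list. Also note that the split-based approach above compares case-sensitively, so "The" would only be dropped if you compare lowercased words (for example, `i.lower() not in remove_list`).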

189

answered Oct 26 '22 13:10

jurgenreza
