Python - Use a Regex to Filter Data

Tags:

python

regex

Is there a simple way to remove all characters from a given string that match a given regular expression? I know in Ruby I can use gsub:

>> key = "cd baz ; ls -l"
=> "cd baz ; ls -l"
>> newkey = key.gsub(/[^\w\d]/, "")
=> "cdbazlsl"

What would the equivalent function be in Python?

996

asked Aug 16 '09 17:08

Chris Bunch

2 Answers

import re
re.sub(pattern, '', s)

See the docs for re.sub.
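
As a rough sketch of how this maps onto the Ruby example from the question (the helper name `strip_matches` is made up for illustration and is not part of the `re` module):

```python
import re

def strip_matches(pattern, s):
    """Return s with every substring that matches pattern removed."""
    # re.sub replaces each non-overlapping match with the empty string,
    # which is what Ruby's gsub(pattern, "") does.
    return re.sub(pattern, '', s)

key = "cd baz ; ls -l"
newkey = strip_matches(r'[^\w\d]', key)
print(newkey)  # cdbazlsl
```

Since `\w` already covers digits, the character class `[^\w\d]` should behave the same as `\W` here.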

163

answered Oct 03 '22 23:10

SilentGhost

The answers so far have focused on doing the same thing as your Ruby code, which is exactly the reverse of what you're asking in the English part of your question: the code removes characters that DO match, while your text asks for

a simple way to remove all characters from a given string that fail to match

For example, suppose your RE's pattern was r'\d{2,}', "two or more digits" -- so the non-matching parts would be all non-digits plus all single, isolated digits. Removing the NON-matching parts, as your text requires, is also easy:

>>> import re
>>> there = re.compile(r'\d{2,}')
>>> ''.join(there.findall('123foo7bah45xx9za678'))
'12345678'
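
To make the contrast concrete, here is a small sketch that runs both operations over the same input. The function names `remove_matching` and `keep_matching` are only illustrative. It uses `re.finditer` with `group(0)` rather than `findall`, on the assumption that you want the whole match even if the pattern contains capturing groups (`findall` would return the groups instead in that case).

```python
import re

def remove_matching(pattern, s):
    """Drop every part of s that matches pattern (what the Ruby gsub does)."""
    return re.sub(pattern, '', s)

def keep_matching(pattern, s):
    """Keep only the parts of s that match pattern, joined together."""
    return ''.join(m.group(0) for m in re.finditer(pattern, s))

s = '123foo7bah45xx9za678'
print(keep_matching(r'\d{2,}', s))    # 12345678
print(remove_matching(r'\d{2,}', s))  # foo7bahxx9za
```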

Edit: OK, OP's clarified the question now (he did indeed mean what his code, not his text, said, and now the text is right too;-) but I'm leaving the answer in for completeness (the other answers suggesting re.sub are correct for the question as it now stands). I realize you probably mean what you "say" in your Ruby code, and not what you say in your English text, but, just in case, I thought I'd better complete the set of answers!-)

answered Oct 04 '22 00:10

Alex Martelli
