Replace whole string if it contains substring in pandas

Tags: python, pandas

I want to replace every string that contains a specific substring. For example, given this DataFrame:

import pandas as pd

df = pd.DataFrame({'name': ['Bob', 'Jane', 'Alice'],
                   'sport': ['tennis', 'football', 'basketball']})

I could replace football with the string 'ball sport' like this:

df.replace({'sport': {'football': 'ball sport'}})

What I want, though, is to replace every value that contains 'ball' (in this case 'football' and 'basketball') with 'ball sport'. Something like this:

df.replace({'sport': {'[strings that contain ball]': 'ball sport'}})

766

asked Sep 29 '16 11:09

nicofilliol

2 Answers

You can use str.contains to mask the rows that contain 'ball' and then overwrite with the new value:

In [71]: df.loc[df['sport'].str.contains('ball'), 'sport'] = 'ball sport'
         df

Out[71]:
    name       sport
0    Bob      tennis
1   Jane  ball sport
2  Alice  ball sport

To make it case-insensitive, pass case=False:

df.loc[df['sport'].str.contains('ball', case=False), 'sport'] = 'ball sport'
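One caveat with the mask approach: if the column has missing values, the result of str.contains can itself contain NaN (depending on the pandas version and column dtype), and .loc will refuse to index with it. A minimal sketch, assuming a hypothetical frame with one missing sport, passes na=False so that missing entries count as non-matches, and regex=False so that 'ball' is treated as a literal substring rather than a regular expression:

```python
import pandas as pd

# Hypothetical data: the question's frame plus one row with a missing sport.
df = pd.DataFrame({'name': ['Bob', 'Jane', 'Alice', 'Eve'],
                   'sport': ['tennis', 'football', 'Basketball', None]})

# na=False    -> missing values are treated as "does not contain".
# regex=False -> 'ball' is matched literally, not as a regex pattern.
# case=False  -> 'Basketball' matches as well.
mask = df['sport'].str.contains('ball', case=False, na=False, regex=False)

# Overwrite only the matching rows; the missing value is left untouched.
df.loc[mask, 'sport'] = 'ball sport'
```

Using regex=False matters mostly when the substring contains characters such as '.' or '+', which would otherwise be interpreted as regex syntax.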

174

answered Oct 06 '22 13:10

EdChum

You can use apply with a lambda. The x parameter of the lambda function will be each value in the 'sport' column:

df.sport = df.sport.apply(lambda x: 'ball sport' if 'ball' in x else x)
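Note that the check 'ball' in x raises a TypeError if x is not a string, which happens for missing values (NaN is a float). A sketch of a more tolerant variant, assuming the column may contain non-string entries; the helper name replace_if_contains and its parameters are made up for illustration:

```python
import pandas as pd

def replace_if_contains(value, substring='ball', replacement='ball sport'):
    """Return `replacement` if `value` is a string containing `substring`;
    otherwise return `value` unchanged (including NaN/None)."""
    if isinstance(value, str) and substring in value:
        return replacement
    return value

# Hypothetical data: the question's frame plus one row with a missing sport.
df = pd.DataFrame({'name': ['Bob', 'Jane', 'Alice', 'Eve'],
                   'sport': ['tennis', 'football', 'basketball', None]})

# apply calls the helper once per value in the 'sport' column.
df['sport'] = df['sport'].apply(replace_if_contains)
```

The isinstance guard is what keeps missing or numeric entries from triggering the error; everything else behaves like the lambda above.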

answered Oct 06 '22 13:10

DeepSpace
