Conditional Replace Pandas

People also ask

How do I replace column values based on conditions in pandas?

You can replace values in all or selected columns of a pandas DataFrame based on a condition by using the DataFrame.loc[] property. loc[] accesses a group of rows and columns by label(s) or a boolean array, and it can both read and modify the values of a DataFrame.

How do I change a specific value in pandas?

Pandas DataFrame replace() Method: The replace() method replaces a specified value with another specified value. It searches the entire DataFrame and replaces every occurrence of the specified value.
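As a small illustration of that behaviour, here is a minimal sketch. The DataFrame, its column names, and its values are made up for the example and do not come from the question.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({"a": [1, 2, 3], "b": [2, 5, 2]})

# replace() returns a new DataFrame; every 2 in any column becomes 0.
replaced = df.replace(2, 0)
print(replaced)
```

The original df is left untouched here, because replace() returns a new object unless told otherwise.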

How do I replace multiple columns in pandas?

To replace multiple values in a DataFrame, you can use the DataFrame.replace() method, which replaces values within a DataFrame object.
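One way to replace several values at once is to pass a dict to replace(). The sketch below assumes a made-up DataFrame with hypothetical "grade" and "score" columns.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame({"grade": ["A", "B", "C", "B"], "score": [90, 80, 70, 80]})

# A flat dict maps each old value to its replacement across the whole DataFrame.
out = df.replace({"A": "excellent", "B": "good"})

# A nested dict restricts the replacement to one column: only 'score' is searched.
out2 = df.replace({"score": {80: 85}})

print(out)
print(out2)
```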

The .ix indexer works in pandas versions prior to 0.20.0, but it has been deprecated since 0.20.0 (and was removed in pandas 1.0), so you should avoid it. Instead, use the .loc or .iloc indexers. You can solve this problem with:

mask = df.my_channel > 20000
column_name = 'my_channel'
df.loc[mask, column_name] = 0

Or, in one line,

df.loc[df.my_channel > 20000, 'my_channel'] = 0

Here mask selects the rows in which df.my_channel > 20000 is True, and df.loc[mask, column_name] = 0 sets the value 0 in the column named column_name for the rows where mask holds.
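To try this end to end, here is a minimal runnable sketch. Only the column name my_channel comes from the question; the sample values and the extra "other" column are assumptions added for illustration.

```python
import pandas as pd

# Hypothetical sample values; only the column name comes from the question.
df = pd.DataFrame({"my_channel": [100, 25000, 20000, 30000], "other": [1, 2, 3, 4]})

mask = df["my_channel"] > 20000   # boolean Series: [False, True, False, True]
df.loc[mask, "my_channel"] = 0    # write only where the mask is True

print(df)
```

Note that 20000 itself is kept, since the comparison is strict, and the "other" column is not touched.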

Update: In this case you should use loc, because passing the boolean Series to iloc raises a NotImplementedError telling you that iLocation based boolean indexing on an integer type is not available.
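If positional indexing is really needed, one possible workaround (not part of the original answer) is to hand iloc a plain NumPy boolean array instead of a Series. A minimal sketch, with made-up sample values:

```python
import pandas as pd

# Hypothetical sample values, for illustration only.
df = pd.DataFrame({"my_channel": [100, 25000, 30000]})

mask = df["my_channel"] > 20000
col_pos = df.columns.get_loc("my_channel")  # integer position of the column

# iloc rejects a boolean Series, but accepts a NumPy boolean array.
df.iloc[mask.to_numpy(), col_pos] = 0

print(df)
```

For label-based data, loc remains the more direct choice.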

Try

df.loc[df.my_channel > 20000, 'my_channel'] = 0

Note: Since v0.20.0, ix has been deprecated in favour of loc / iloc.

The np.where function works as follows:

df['X'] = np.where(df['Y']>=50, 'yes', 'no')

In your case you would want:

import numpy as np
df['my_channel'] = np.where(df.my_channel > 20000, 0, df.my_channel)
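Both uses of np.where can be tried together in the sketch below. The sample values and the "flag" column name are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values, for illustration only.
df = pd.DataFrame({"my_channel": [100, 25000, 30000]})

# np.where(condition, value_if_true, value_if_false) is evaluated element-wise.
df["flag"] = np.where(df["my_channel"] > 20000, "yes", "no")

# Passing the original column as the third argument keeps non-matching values.
df["my_channel"] = np.where(df["my_channel"] > 20000, 0, df["my_channel"])

print(df)
```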

Your original dataframe does not update because chained indexing may cause you to modify a copy rather than a view of your dataframe. The docs give this advice:

When setting values in a pandas object, care must be taken to avoid what is called chained indexing.
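The difference can be seen in a small sketch. The sample values are made up, and the warning filter is only there to keep the demonstration quiet.

```python
import warnings

import pandas as pd

# Hypothetical sample values, for illustration only.
df = pd.DataFrame({"my_channel": [100, 25000, 30000]})

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the chained-assignment warning for this demo
    # Chained indexing: df[...] builds a new object, and the assignment lands on it.
    df[df["my_channel"] > 20000]["my_channel"] = 0

unchanged = df["my_channel"].tolist()  # df itself was not modified

# A single .loc call selects and assigns in one step on df itself.
df.loc[df["my_channel"] > 20000, "my_channel"] = 0
changed = df["my_channel"].tolist()

print(unchanged, changed)
```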

You have a few alternatives:

`loc` + Boolean indexing

loc may be used for setting values and supports Boolean masks:

df.loc[df['my_channel'] > 20000, 'my_channel'] = 0

`mask` + Boolean indexing

You can assign to your series:

df['my_channel'] = df['my_channel'].mask(df['my_channel'] > 20000, 0)

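Or you can update your series in place (be aware that under Copy-on-Write, the default behaviour from pandas 3.0, an in-place call on a column selected this way does not update df, so the assignment form above is the safer choice):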

df['my_channel'].mask(df['my_channel'] > 20000, 0, inplace=True)
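The assignment form of mask can be tried on made-up data as below. The sketch also shows Series.where, which is not mentioned in the answer but is the complementary method: it keeps values where its condition is True and replaces the rest.

```python
import pandas as pd

# Hypothetical sample values, for illustration only.
df = pd.DataFrame({"my_channel": [100, 25000, 30000]})

cond = df["my_channel"] > 20000
masked = df["my_channel"].mask(cond, 0)   # replace where cond is True
kept = df["my_channel"].where(~cond, 0)   # keep where ~cond is True, replace elsewhere

print(masked.tolist(), kept.tolist())
```

Both calls return new Series, so the result has to be assigned back to df['my_channel'] to take effect.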

`np.where` + Boolean indexing

You can use NumPy by assigning your original series when your condition is not satisfied; however, the first two solutions are cleaner since they explicitly change only specified values.

df['my_channel'] = np.where(df['my_channel'] > 20000, 0, df['my_channel'])

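I would use a lambda function on a Series of a DataFrame, like this:

f = lambda x: 0 if x > 100 else 1
df['my_column'] = df['my_column'].map(f)

I do not claim that this is efficient, but it works fine. Note that f maps every value to either 0 or 1; to leave values that fail the condition unchanged, return x in the else branch instead.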
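For comparison, the sketch below runs the same lambda through map and computes the same result with np.where, which avoids a Python-level call per element. The sample values are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample values, for illustration only.
df = pd.DataFrame({"my_column": [50, 150, 100]})

f = lambda x: 0 if x > 100 else 1
mapped = df["my_column"].map(f)                     # calls f once per element
vectorised = np.where(df["my_column"] > 100, 0, 1)  # same rule, computed in NumPy

print(mapped.tolist(), vectorised.tolist())
```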

