I have a big data set with many values that are far above the average. For example,
A B
1 'H' 10
2 'E' 10000
3 'L' 12
4 'L' 8
5 'O' 11
I want to set cell B2 to 0, so I tried this:
df['B'] = df['B'].replace([df['B'] > 15], 0)
but had no luck. How can I make my data frame look like this?
A B
1 'H' 10
2 'E' 0
3 'L' 12
4 'L' 8
5 'O' 11
Thank you!
To replace multiple values in a DataFrame, use the DataFrame.replace() method, which replaces values within a DataFrame object.
DataFrame.replace() replaces one exact value with another across all columns. It takes to_replace, value, inplace, limit, regex and method as parameters and returns a new DataFrame. With inplace=True, it modifies the existing DataFrame object and returns None.
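As a minimal sketch of what replace() does with the question's data (the frame below is rebuilt from the example, and the names replaced and multi are only illustrative), note that it matches exact values rather than conditions:

```python
import pandas as pd

# Rebuild the example frame from the question (index 1..5 as shown there).
df = pd.DataFrame(
    {"A": list("HELLO"), "B": [10, 10000, 12, 8, 11]},
    index=[1, 2, 3, 4, 5],
)

# replace() looks for the exact value 10000 and swaps it for 0.
replaced = df["B"].replace(10000, 0)

# By default a new Series is returned; the original column is untouched.
print(replaced.tolist())  # [10, 0, 12, 8, 11]
print(df["B"].tolist())   # [10, 10000, 12, 8, 11]

# A dict maps several exact values in one call.
multi = df["B"].replace({10000: 0, 8: -1})
print(multi.tolist())     # [10, 0, 12, -1, 11]
```

This works only when the outliers are known in advance. For a condition such as "greater than 15", a boolean-based approach is a better fit.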
You can replace values in all or selected columns based on a condition by using the DataFrame.loc[] indexer. loc[] accesses a group of rows and columns by label(s) or a boolean array, and it can be used both to read and to assign values.
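Applied to the question's data, conditional assignment with loc[] might look like the sketch below. The frame is rebuilt from the example, and the threshold 15 is the one used in the question.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame(
    {"A": list("HELLO"), "B": [10, 10000, 12, 8, 11]},
    index=[1, 2, 3, 4, 5],
)

# The boolean Series picks the rows, "B" picks the column to overwrite.
df.loc[df["B"] > 15, "B"] = 0

print(df["B"].tolist())  # [10, 0, 12, 8, 11]
```

Unlike replace(), this modifies df directly, so no reassignment is needed.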
Pandas DataFrame replace() Method: The replace() method replaces the specified value with another specified value. It searches the entire DataFrame and replaces every occurrence of that value.
You are really close. Instead of replace, use mask:
df['B'] = df['B'].mask(df['B'] > 15, 0)
print (df)
A B
1 'H' 10
2 'E' 0
3 'L' 12
4 'L' 8
5 'O' 11
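For comparison, Series.where is the mirror image of mask: it keeps values where the condition is True and replaces the rest. A small sketch, using a standalone Series s built from column B of the example (the names masked and kept are illustrative):

```python
import pandas as pd

# Column B from the question as a standalone Series.
s = pd.Series([10, 10000, 12, 8, 11], index=[1, 2, 3, 4, 5], name="B")

# mask: replace values WHERE the condition is True.
masked = s.mask(s > 15, 0)

# where: keep values where the condition is True, replace the others.
kept = s.where(s <= 15, 0)

print(masked.tolist())      # [10, 0, 12, 8, 11]
print(masked.equals(kept))  # True
```

Which one reads better depends on whether the condition describes the values to drop (mask) or the values to keep (where).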
Alternative, using numpy.where (requires import numpy as np):
df['B'] = np.where(df['B'] > 15, 0, df['B'])
print(df)
A B
1 'H' 10
2 'E' 0
3 'L' 12
4 'L' 8
5 'O' 11
If you want to replace a range of values (between includes both endpoints by default):
df['B'] = np.where(df['B'].between(8,11), 0, df['B'])
print(df)
A B
1 'H' 0
2 'E' 10000
3 'L' 12
4 'L' 0
5 'O' 0
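The range check can also exclude its endpoints. The sketch below assumes pandas 1.3 or later, where between accepts the string values "both", "neither", "left" and "right" for its inclusive parameter, and it uses Series.mask instead of np.where purely to stay within pandas:

```python
import pandas as pd

# Column B from the question as a standalone Series.
s = pd.Series([10, 10000, 12, 8, 11], index=[1, 2, 3, 4, 5], name="B")

# Default: both endpoints count, so 8 and 11 match as well as 10.
both = s.between(8, 11)

# Endpoints excluded: only values strictly between 8 and 11 match.
neither = s.between(8, 11, inclusive="neither")

print(s.mask(both, 0).tolist())     # [0, 10000, 12, 0, 0]
print(s.mask(neither, 0).tolist())  # [0, 10000, 12, 8, 11]
```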