
Replace a specific range of values in a pandas dataframe

I have a big data set in which many values are far above the average. For example,

    A         B
1  'H'       10
2  'E'    10000
3  'L'       12
4  'L'        8
5  'O'       11

I want to set the value of B in row 2 to 0, so I tried this:

df['B'] = df['B'].replace([df['B'] > 15], 0)

But it did not work. How can I make my data frame look like this?

    A         B
1  'H'       10
2  'E'        0
3  'L'       12
4  'L'        8
5  'O'       11

Thank you!

jayko03 asked Sep 12 '17 05:09


1 Answer

You are really close - instead of replace, use mask:

df['B'] = df['B'].mask(df['B'] > 15, 0)
print(df)
     A   B
1  'H'  10
2  'E'   0
3  'L'  12
4  'L'   8
5  'O'  11
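
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question (the 1 to 5 index and the unquoted letters are assumptions about how the data was created) and also shows boolean indexing with .loc, which is not part of the answer above but gives the same result by assigning in place:

```python
import pandas as pd

# Sample data modelled on the question; the index 1..5 is an assumption.
df = pd.DataFrame(
    {"A": ["H", "E", "L", "L", "O"], "B": [10, 10000, 12, 8, 11]},
    index=range(1, 6),
)

# mask() replaces values where the condition is True and keeps the rest.
masked = df["B"].mask(df["B"] > 15, 0)

# Same result with boolean indexing: assign 0 only to the matching rows.
df.loc[df["B"] > 15, "B"] = 0

print(masked.tolist())
print(df["B"].tolist())
```

mask returns a new Series and leaves the original untouched until you assign it back, while the .loc form modifies the frame directly.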

Alternative:

import numpy as np

df['B'] = np.where(df['B'] > 15, 0, df['B'])
print(df)
     A   B
1  'H'  10
2  'E'   0
3  'L'  12
4  'L'   8
5  'O'  11

If you want to replace a range of values (between includes both endpoints by default):

df['B'] = np.where(df['B'].between(8, 11), 0, df['B'])
print(df)
     A      B
1  'H'      0
2  'E'  10000
3  'L'     12
4  'L'      0
5  'O'      0
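
A runnable sketch of the range case, again assuming the sample frame from the question. It uses mask instead of np.where, and compares between with the explicit comparison it is shorthand for, to make the inclusive bounds visible:

```python
import pandas as pd

# Sample data modelled on the question; the index 1..5 is an assumption.
df = pd.DataFrame(
    {"A": list("HELLO"), "B": [10, 10000, 12, 8, 11]},
    index=range(1, 6),
)

# between(8, 11) is True for 8 <= value <= 11 (both ends included by default).
in_range = df["B"].between(8, 11)
explicit = (df["B"] >= 8) & (df["B"] <= 11)

# Replace every value inside the range with 0, keep everything else.
df["B"] = df["B"].mask(in_range, 0)

print(df["B"].tolist())
```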
jezrael answered Nov 15 '22 09:11