
Pandas Mask on multiple Conditions

In my DataFrame I want to replace every value below 1 or above 5 with NaN.

This code works:

persDf = persDf.mask(persDf < 1000)

and every value becomes NaN, as expected. But this one does not work:

persDf = persDf.mask((persDf < 1) and (persDf > 5))

I have no idea why. I have checked the documentation and several solutions to apparently similar problems but could not find an answer. Does anyone have an idea that could help me with this?

ruedi asked May 11 '19 19:05



People also ask

How do I use multiple conditions in pandas?

Using loc to filter with multiple conditions: the loc indexer in pandas selects groups of rows or columns by label or by a boolean array. Write each condition you want in the filtered result, wrap it in parentheses, and combine the conditions with the & operator. The result is a pd.DataFrame containing only the matching rows.
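A minimal sketch of that kind of filter, assuming a made-up frame with columns a and b (the data is hypothetical and not from the question):

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({"a": [0, 1, 3, 7], "b": [5, 6, 2, 4]})

# Parenthesize each comparison: & binds more tightly than >= and <=.
in_range = df.loc[(df["a"] >= 1) & (df["a"] <= 5)]
print(in_range)
```

Here & is used because a row has to satisfy both conditions to be kept.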

How replace column values in pandas based on multiple conditions?

You can replace values in all or selected columns of a pandas DataFrame based on a condition by using DataFrame.loc[]. loc[] accesses a group of rows and columns by label(s) or a boolean array, and it can be used both to read and to assign values.
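As a rough illustration of assignment through loc[], the sketch below overwrites the out-of-range values of one column with NaN. The frame and its column names are assumptions; float columns are used so that NaN fits without a dtype change.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({"a": [0.0, 1.0, 3.0, 7.0], "b": [5.0, 6.0, 2.0, 4.0]})

# Boolean Series: True for rows where column "a" is outside [1, 5].
out_of_range = (df["a"] < 1) | (df["a"] > 5)

# Overwrite only those rows of column "a"; column "b" is untouched.
df.loc[out_of_range, "a"] = np.nan
print(df)
```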

How do I mask data in pandas Python?

The pandas DataFrame mask() method replaces the values where the condition evaluates to True. It is the opposite of the where() method, which replaces the values where the condition evaluates to False.
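A short sketch of that relationship, using made-up data: masking with a condition should give the same frame as calling where() with the inverted condition.

```python
import pandas as pd

# Hypothetical sample data for illustration only.
df = pd.DataFrame({"a": [0, 1, 3], "b": [5, 6, 2]})

cond = (df < 1) | (df > 5)

# mask() replaces values where cond is True ...
masked = df.mask(cond)
# ... where() keeps values where its condition is True, so invert cond.
kept = df.where(~cond)

print(masked.equals(kept))
```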


1 Answer

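Use the | operator, because a value can't be < 1 AND > 5 at the same time. Note also that Python's and keyword does not work element-wise on DataFrames: it asks each DataFrame for a single truth value, which pandas rejects with a ValueError. Use the element-wise operators | (or) and & (and) instead: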

persDf = persDf.mask((persDf < 1) | (persDf > 5))
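A minimal sketch of the difference, assuming a small made-up frame in place of the real persDf (which is not shown in the question):

```python
import pandas as pd

# Hypothetical sample data; the real persDf is not shown in the question.
persDf = pd.DataFrame({"a": [0, 1, 3], "b": [5, 6, 2]})

# `and` needs one True/False per operand, so pandas raises ValueError.
try:
    persDf.mask((persDf < 1) and (persDf > 5))
    and_failed = False
except ValueError:
    and_failed = True

# `|` is the element-wise OR: True wherever a value is outside [1, 5].
outside = (persDf < 1) | (persDf > 5)
masked = persDf.mask(outside)
print(masked)
```

In this sample only the 0 in column a and the 6 in column b fall outside the range, so those two cells become NaN and the rest are kept.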

Another method would be to use np.where and call that inside pd.DataFrame:

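import numpy as np
import pandas as pd

# np.where returns a plain array, so pass index and columns to keep the labels
persDf = pd.DataFrame(data=np.where((persDf < 1) | (persDf > 5), np.nan, persDf),
                      index=persDf.index, columns=persDf.columns)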
Erfan answered Sep 29 '22 05:09
