
How to set values to np.nan with multiple conditions for series?

Tags: python, pandas

Let's say I have a dataframe:

    A   B   C   D   E   F
0   x   R   i   R   nan h
1   z   g   j   x   a   nan
2   z   h   nan y   nan nan
3   x   g   nan nan nan nan
4   x   x   h   x   s   f

I want to replace all the cells where:

  1. the column's value in row 0 is 'R' (df.loc[0] == 'R')
  2. the cell is not 'x' (!= 'x')
  3. the row index is 2 or greater (2:)

with np.nan.

Essentially I want to do:

df.loc[2:,df.loc[0]=='R']!='x' = np.nan

I get the error:

SyntaxError: can't assign to comparison

I just don't know how the syntax is supposed to be.

I've tried

df[df.loc[2:,df.loc[0]=='R']!='x']

but this doesn't list the values I want.

Mitch asked Mar 25 '21 16:03



2 Answers

Solution

mask = df.ne('x') & df.iloc[0].eq('R')
mask.iloc[:2] = False

df.mask(mask)

   A    B    C    D    E    F
0  x    R    i    R  NaN    h
1  z    g    j    x    a  NaN
2  z  NaN  NaN  NaN  NaN  NaN
3  x  NaN  NaN  NaN  NaN  NaN
4  x    x    h    x    s    f
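
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the "nan" cells in the question are real missing values (np.nan) rather than the string "nan"; the name `result` is only for illustration.

```python
import numpy as np
import pandas as pd

# Rebuild the frame from the question (assumption: "nan" means np.nan).
df = pd.DataFrame(
    {
        "A": ["x", "z", "z", "x", "x"],
        "B": ["R", "g", "h", "g", "x"],
        "C": ["i", "j", np.nan, np.nan, "h"],
        "D": ["R", "x", "y", np.nan, "x"],
        "E": [np.nan, "a", np.nan, np.nan, "s"],
        "F": ["h", np.nan, np.nan, np.nan, "f"],
    }
)

# True where the cell is not 'x' AND the column's row-0 value is 'R'.
mask = df.ne("x") & df.iloc[0].eq("R")

# Rows 0 and 1 must never be touched.
mask.iloc[:2] = False

# DataFrame.mask returns a copy with NaN wherever the mask is True.
result = df.mask(mask)
print(result)
```

Note that df.mask returns a new frame; assign it back (df = df.mask(mask)) if the change should stick.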

Explanation

Build the mask up

  1. df.ne('x') gives

            A      B     C      D     E     F
     0  False   True  True   True  True  True
     1   True   True  True  False  True  True
     2   True   True  True   True  True  True
     3  False   True  True   True  True  True
     4  False  False  True  False  True  True
    
  2. We want to combine that with df.iloc[0].eq('R'), which is a Series. If we just & the two together, pandas aligns the Series index with the columns of the mask from step 1, so every row of a column is ANDed with that column's row-0 flag.

     A    False
     B     True
     C    False
     D     True
     E    False
     F    False
     Name: 0, dtype: bool
    
     # &
    
            A      B     C      D     E     F
     0  False   True  True   True  True  True
     1   True   True  True  False  True  True
     2   True   True  True   True  True  True
     3  False   True  True   True  True  True
     4  False  False  True  False  True  True
    
     # GIVES YOU
    
            A      B      C      D      E      F
     0  False   True  False   True  False  False
     1  False   True  False  False  False  False
     2  False   True  False   True  False  False
     3  False   True  False   True  False  False
     4  False  False  False  False  False  False
    
  3. Finally, we want to exclude the first two rows from these shenanigans so...

     mask.iloc[:2] = False
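
The alignment in step 2 can be sanity-checked against plain NumPy broadcasting. This is only a sketch: the frame is rebuilt from the question (assuming "nan" means np.nan), and the names `aligned`, `broadcast`, `same` and `selected` are made up for the check.

```python
import numpy as np
import pandas as pd

# Frame from the question (assumption: "nan" means np.nan).
df = pd.DataFrame(
    {
        "A": ["x", "z", "z", "x", "x"],
        "B": ["R", "g", "h", "g", "x"],
        "C": ["i", "j", np.nan, np.nan, "h"],
        "D": ["R", "x", "y", np.nan, "x"],
        "E": [np.nan, "a", np.nan, np.nan, "s"],
        "F": ["h", np.nan, np.nan, np.nan, "f"],
    }
)

# pandas version: the Series index (A..F) is matched to the column labels.
aligned = df.ne("x") & df.iloc[0].eq("R")

# NumPy version: a (5, 6) array AND a (6,) array broadcasts across rows.
broadcast = df.ne("x").to_numpy() & df.iloc[0].eq("R").to_numpy()

# Both routes should give the same boolean grid.
same = bool((aligned.to_numpy() == broadcast).all())

# Columns that can be affected at all: those whose row-0 value is 'R'.
selected = list(aligned.columns[aligned.any()])
print(same, selected)
```

The pandas route matches by label rather than by position, which is why it stays readable when the condition comes from a row of the same frame.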
    
piRSquared answered Oct 23 '22 13:10


Try with:

mask = df.iloc[0] == 'R'

df.loc[2:, mask] = df.loc[2:, mask].where(df.loc[2:, mask] == 'x')

Output:

   A    B    C    D    E    F
0  x    R    i    R  NaN    h
1  z    g    j    x    a  NaN
2  z  NaN  NaN  NaN  NaN  NaN
3  x  NaN  NaN  NaN  NaN  NaN
4  x    x    h    x    s    f
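
A self-contained sketch of the where-based approach, picking the columns whose row-0 value equals 'R' as the question asks. It assumes "nan" in the question means np.nan, and it selects the columns by label (`cols`, a name used here for illustration) instead of by boolean Series.

```python
import numpy as np
import pandas as pd

# Frame from the question (assumption: "nan" means np.nan).
df = pd.DataFrame(
    {
        "A": ["x", "z", "z", "x", "x"],
        "B": ["R", "g", "h", "g", "x"],
        "C": ["i", "j", np.nan, np.nan, "h"],
        "D": ["R", "x", "y", np.nan, "x"],
        "E": [np.nan, "a", np.nan, np.nan, "s"],
        "F": ["h", np.nan, np.nan, np.nan, "f"],
    }
)

# Labels of the columns whose row-0 value is 'R'.
cols = df.columns[df.iloc[0].eq("R")]

# Rows 2 and below of those columns only.
sub = df.loc[2:, cols]

# where() keeps cells that satisfy the condition and puts NaN elsewhere,
# so only the 'x' cells in the selected block survive.
df.loc[2:, cols] = sub.where(sub.eq("x"))
print(df)
```

Unlike df.mask on the whole frame, this modifies df in place and only touches the selected block.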

Quang Hoang answered Oct 23 '22 13:10