Using Pandas df.where on multiple columns produces unexpected NaN values

Tags: python, pandas

Given the DataFrame

import pandas as pd

df = pd.DataFrame({
    'transformed': ['left', 'right', 'left', 'right'],
    'left_f': [1, 2, 3, 4],
    'right_f': [10, 20, 30, 40],
    'left_t': [-1, -2, -3, -4],
    'right_t': [-10, -20, -30, -40],
})

I want to create two new columns, picking from either left_* or right_* depending on the content of transformed:

df['transformed_f'] = df['right_f'].where(
    df['transformed'] == 'right',
    df['left_f']
)

df['transformed_t'] = df['right_t'].where(
    df['transformed'] == 'right',
    df['left_t']
)

This gives the expected result:

df
#    transformed  left_f  right_f  left_t  right_t  transformed_f  transformed_t
# 0  left              1       10      -1      -10              1             -1
# 1  right             2       20      -2      -20             20            -20
# 2  left              3       30      -3      -30              3             -3
# 3  right             4       40      -4      -40             40            -40

However, when I try to do it in one operation, I get an unexpected result containing NaN values:

df[['transformed_f', 'transformed_t']] = df[['right_f', 'right_t']].where(
    df['transformed'] == 'right',
    df[['left_f', 'left_t']]
)

df
#    transformed  left_f  right_f  left_t  right_t  transformed_f  transformed_t
# 0  left              1       10      -1      -10            NaN            NaN
# 1  right             2       20      -2      -20           20.0          -20.0
# 2  left              3       30      -3      -30            NaN            NaN
# 3  right             4       40      -4      -40           40.0          -40.0

Is there a way to use df.where() on multiple columns at once?

Nils Werner asked Jun 27 '19 12:06



1 Answer

You are close. Just add .values or .to_numpy() to the slice you pass as other, so that it becomes a NumPy ndarray:

Per docs:

other : scalar, NDFrame, or callable Entries where cond is False are replaced with corresponding value from other. If other is callable, it is computed on the NDFrame and should return scalar or NDFrame. The callable must not change input NDFrame (though pandas doesn’t check it).

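So when you pass the DataFrame slice directly, pandas aligns it with the calling frame on its labels. The column names of other (left_f, left_t) don't match those of the caller (right_f, right_t), so the entries that should be replaced come out as NaN. When you pass .values, there are no labels to align on, and the values are filled in by position.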
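The alignment behaviour described above can be seen in isolation with a single column pair. This is a minimal sketch reusing the question's data; the names cond, aligned and positional are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'transformed': ['left', 'right', 'left', 'right'],
    'left_f': [1, 2, 3, 4],
    'right_f': [10, 20, 30, 40],
})

cond = df['transformed'] == 'right'

# `other` is a DataFrame whose only column is labelled 'left_f'.
# where() aligns it to the caller's 'right_f' column, finds no match,
# and so the replaced cells end up as NaN.
aligned = df[['right_f']].where(cond, df[['left_f']])

# A bare NumPy array carries no labels, so values are taken by position.
positional = df[['right_f']].where(cond, df[['left_f']].to_numpy())

print(aligned)
print(positional)
```

The first frame should contain NaN in the rows where transformed is 'left', while the second picks up the left_f values there.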

df[['transformed_f', 'transformed_t']] = df[['right_f', 'right_t']].where(
    df['transformed'] == 'right',
    df[['left_f', 'left_t']].values
)
print(df)

  transformed  left_f  right_f  left_t  right_t  transformed_f  transformed_t
0        left       1       10      -1      -10              1             -1
1       right       2       20      -2      -20             20            -20
2        left       3       30      -3      -30              3             -3
3       right       4       40      -4      -40             40            -40
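If you would rather keep the labels than drop them, one possible alternative (not part of the answer above, just a sketch under the same data) is to rename the left_* columns so that they line up with right_* during alignment. The names cond, other, picked and result are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    'transformed': ['left', 'right', 'left', 'right'],
    'left_f': [1, 2, 3, 4],
    'right_f': [10, 20, 30, 40],
    'left_t': [-1, -2, -3, -4],
    'right_t': [-10, -20, -30, -40],
})

cond = df['transformed'] == 'right'

# Relabel the left_* columns so they match right_* when where() aligns them.
other = df[['left_f', 'left_t']].rename(
    columns={'left_f': 'right_f', 'left_t': 'right_t'}
)
picked = df[['right_f', 'right_t']].where(cond, other)

# Give the result its final column names and attach it to the original frame.
picked.columns = ['transformed_f', 'transformed_t']
result = df.join(picked)
print(result)
```

This keeps index alignment in play, which may be preferable when the two slices are not guaranteed to be in the same row order; with .values the rows are matched purely by position.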

anky answered Oct 26 '22 23:10