Given the DataFrame:
import pandas as pd
df = pd.DataFrame({
'transformed': ['left', 'right', 'left', 'right'],
'left_f': [1, 2, 3, 4],
'right_f': [10, 20, 30, 40],
'left_t': [-1, -2, -3, -4],
'right_t': [-10, -20, -30, -40],
})
I want to create two new columns, picking from either left_* or right_* depending on the content of transformed:
df['transformed_f'] = df['right_f'].where(
df['transformed'] == 'right',
df['left_f']
)
df['transformed_t'] = df['right_t'].where(
df['transformed'] == 'right',
df['left_t']
)
And I get the expected result:
df
# transformed left_f right_f left_t right_t transformed_f transformed_t
# 0 left 1 10 -1 -10 1 -1
# 1 right 2 20 -2 -20 20 -20
# 2 left 3 30 -3 -30 3 -3
# 3 right 4 40 -4 -40 40 -40
However, when I try to do it in one operation, I get an unexpected result containing NaN values:
df[['transformed_f', 'transformed_t']] = df[['right_f', 'right_t']].where(
df['transformed'] == 'right',
df[['left_f', 'left_t']]
)
df
# transformed left_f right_f left_t right_t transformed_f transformed_t
# 0 left 1 10 -1 -10 NaN NaN
# 1 right 2 20 -2 -20 20.0 -20.0
# 2 left 3 30 -3 -30 NaN NaN
# 3 right 4 40 -4 -40 40.0 -40.0
Is there a way to use df.where() on multiple columns at once?
You are close: just add .values or .to_numpy() to the slice you pass as other, so that it becomes a NumPy ndarray.
Per docs:
other : scalar, NDFrame, or callable Entries where cond is False are replaced with corresponding value from other. If other is callable, it is computed on the NDFrame and should return scalar or NDFrame. The callable must not change input NDFrame (though pandas doesn’t check it).
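So when you pass the DataFrame slice directly, pandas aligns other with the caller on both the index and the column names. The column names (left_f, left_t) don't match (right_f, right_t), so no replacement values are found and the rows where cond is False end up as NaN. When you pass .values, there are no labels to align on, so the values are used by position: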
df[['transformed_f', 'transformed_t']] = df[['right_f', 'right_t']].where(
df['transformed'] == 'right',
df[['left_f', 'left_t']].values
)
print(df)
transformed left_f right_f left_t right_t transformed_f transformed_t
0 left 1 10 -1 -10 1 -1
1 right 2 20 -2 -20 20 -20
2 left 3 30 -3 -30 3 -3
3 right 4 40 -4 -40 40 -40
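If you would rather keep label alignment than bypass it, another option is to rename the left_* columns so they line up with right_* before passing them as other. This is only a sketch built on the frame from the question; the rename mapping and the intermediate names right, left and picked are my own and not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame({
    'transformed': ['left', 'right', 'left', 'right'],
    'left_f': [1, 2, 3, 4],
    'right_f': [10, 20, 30, 40],
    'left_t': [-1, -2, -3, -4],
    'right_t': [-10, -20, -30, -40],
})

right = df[['right_f', 'right_t']]

# Relabel left_* as right_* so that where() can align `other`
# with the caller column by column (assumed mapping, for illustration).
left = df[['left_f', 'left_t']].rename(
    columns={'left_f': 'right_f', 'left_t': 'right_t'}
)

# Keep the right_* value where the condition holds, else take the left_* value.
picked = right.where(df['transformed'] == 'right', left)

# Assign by position; the result columns are still labelled right_f / right_t.
df[['transformed_f', 'transformed_t']] = picked.to_numpy()
```

The .values approach is shorter; this variant just makes the column pairing explicit, which can help if the two slices are not guaranteed to be in the same column order.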