I have two data frames, df1 and df2, with exactly the same size and column names but different values. df2 has many NaN values and df1 only a few. I want every NaN in df2 to become 0 if df1 has a value (other than NaN) at the same position.
E.g.:
df1
a b c
0 1 5 NaN
1 2 4 8
2 5 8 5
3 8 8 1
4 7 3 2
5 NaN 5 1
df2
a b c
0 5 5 NaN
1 NaN 4 8
2 3 8 NaN
3 NaN NaN 8
4 9 NaN 6
5 NaN 5 7
The result should look like this.
df2
a b c
0 5 5 NaN
1 0 4 8
2 3 8 0
3 0 0 8
4 9 0 6
5 NaN 5 7
I am still new to Python and have not found a solution so far. I tried the following without success:
for row in range(len(df1)):
    if df1.iloc[row,1:] >= 0:
        df2[row,1:] == 0
    elif df1.iloc[row,1:] == '':
        df2.iloc[row,1:] == ''
You can first set df2 to 0 where df1 is not null, then take np.fmax, which ignores NaN when calculating the element-wise maximum of two arrays:
np.fmax(df2,df2.mask(df1.notna(),0))
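One caveat of this approach: np.fmax takes a maximum against 0, so a negative value in df2 is also turned into 0. A minimal sketch with made-up single-column frames (the data here is illustrative and not from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical data: df1 has a value everywhere, df2 holds a negative number.
df1 = pd.DataFrame({"a": [1.0, 2.0, 3.0]})
df2 = pd.DataFrame({"a": [-4.0, np.nan, 6.0]})

# Where df1 is not null the second argument is 0, so fmax compares against 0.
out = np.fmax(df2, df2.mask(df1.notna(), 0))
print(out)  # the NaN is filled as intended, but -4.0 is lost as well
```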
EDIT: thanks to @Ben.T for pointing out that the above only works when df2 has no negative values. Use the following instead:
df2.fillna(0).where(df1.notna())
a b c
0 5.0 5.0 NaN
1 0.0 4.0 8.0
2 3.0 8.0 0.0
3 0.0 0.0 8.0
4 9.0 0.0 6.0
5 NaN 5.0 7.0
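For reference, here is a self-contained sketch that rebuilds the two frames from the question and applies an explicit mask. It is one possible variant rather than the answer's own code: the name fill_here is made up for illustration. Unlike the where-based one-liner, which also sets df2 to NaN wherever df1 is NaN, this only touches cells where df2 is missing and df1 has a value. On the question's data both give the same result.

```python
import numpy as np
import pandas as pd

nan = np.nan
df1 = pd.DataFrame({
    "a": [1, 2, 5, 8, 7, nan],
    "b": [5, 4, 8, 8, 3, 5],
    "c": [nan, 8, 5, 1, 2, 1],
})
df2 = pd.DataFrame({
    "a": [5, nan, 3, nan, 9, nan],
    "b": [5, 4, 8, nan, nan, 5],
    "c": [nan, 8, nan, 8, 6, 7],
})

# True only where df2 is missing AND df1 holds a real value.
fill_here = df2.isna() & df1.notna()

# mask() replaces the cells where the condition is True and keeps the rest.
result = df2.mask(fill_here, 0)
print(result)
```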
Another way is to select the positions where df1 is NaN with the pd.DataFrame.isnull method and substitute the df2 values there, as below:
>>> import numpy as np
>>> import pandas as pd
>>> df1 = pd.DataFrame({'a': [0, 1, 2], 'b': [1, np.nan, 3], 'c': [np.nan, 2, 4]})
>>> df2 = pd.DataFrame({'a': [0, 1, 2], 'b': [1, np.nan, 3], 'c': [3, 2, 4]})
>>> df1
a b c
0 0 1.0 NaN
1 1 NaN 2.0
2 2 3.0 4.0
>>> df2
a b c
0 0 1.0 3
1 1 NaN 2
2 2 3.0 4
>>> df1[df1.isnull()] = df2
>>> df1
a b c
0 0 1.0 3.0
1 1 NaN 2.0
2 2 3.0 4.0
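Note that this fills the gaps in df1 from df2, which is the reverse direction of what the question asks. If that is what you want and you would rather not modify df1 in place, DataFrame.fillna also accepts another DataFrame as the fill value. A small sketch using the same frames (the name filled is only for illustration):

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'a': [0, 1, 2], 'b': [1, np.nan, 3], 'c': [np.nan, 2, 4]})
df2 = pd.DataFrame({'a': [0, 1, 2], 'b': [1, np.nan, 3], 'c': [3, 2, 4]})

# Each NaN in df1 is filled from the cell with the same index and column in df2.
# Cells that are NaN in both frames stay NaN; df1 itself is left unchanged.
filled = df1.fillna(df2)
print(filled)
```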
You could fill the NaN values in df2 with True or False depending on df1.isna(), then replace True and False:
df2.fillna(df1.isna()).replace(False,0).replace(True,np.nan)
a b c
0 5.0 5.0 NaN
1 0.0 4.0 8.0
2 3.0 8.0 0.0
3 0.0 0.0 8.0
4 9.0 0.0 6.0
5 NaN 5.0 7.0