
Replacing values in a DataFrame according to the values in another DataFrame with the same shape (Python)

I have two DataFrames, df1 and df2, with exactly the same size and column names but different values. df2 has many NaN values and df1 only a few. I want every NaN in df2 to become 0 if df1 has a value (anything other than NaN) at the same position.
E.g.:

df1
    a    b   c
0   1    5   NaN
1   2    4   8
2   5    8   5
3   8    8   1
4   7    3   2  
5   NaN  5   1

df2
    a    b   c
0   5    5   NaN
1   NaN  4   8
2   3    8   NaN
3   NaN  NaN 8
4   9    NaN 6  
5   NaN  5   7

The result should look like this.

df2
    a    b   c
0   5    5   NaN
1   0    4   8
2   3    8   0
3   0    0   8
4   9    0   6  
5   NaN  5   7

I am still new to Python and have not found a solution so far. I tried the following, without success:

for row in range(len(df1)):
    if df1.iloc[row,1:] >= 0:
        df2[row,1:] == 0 
    elif df1.iloc[row,1:] == '':
        df2.iloc[row,1:] == '' 
Etiende asked Apr 28 '21 15:04



3 Answers

You can first set df2 to 0 wherever df1 is not null, then take np.fmax, which ignores NaN when computing the element-wise maximum of two arrays:

np.fmax(df2,df2.mask(df1.notna(),0))

EDIT: thanks to @Ben.T for pointing out that the above only works with non-negative values, because np.fmax replaces any negative entry with 0. Use the following instead:

df2.fillna(0).where(df1.notna())

     a    b    c
0  5.0  5.0  NaN
1  0.0  4.0  8.0
2  3.0  8.0  0.0
3  0.0  0.0  8.0
4  9.0  0.0  6.0
5  NaN  5.0  7.0
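
Note that where(df1.notna()) also turns a cell into NaN whenever df1 is NaN there, even if df2 holds a value; the sample data has no such cell, so the output above matches. A minimal sketch, assuming existing df2 values should always be kept, limits the replacement to cells where df2 is NaN and df1 is not. The frames reproduce the question's data, and the names to_zero and result are only illustrative:

```python
import numpy as np
import pandas as pd

nan = np.nan
# Sample data copied from the question.
df1 = pd.DataFrame({"a": [1, 2, 5, 8, 7, nan],
                    "b": [5, 4, 8, 8, 3, 5],
                    "c": [nan, 8, 5, 1, 2, 1]})
df2 = pd.DataFrame({"a": [5, nan, 3, nan, 9, nan],
                    "b": [5, 4, 8, nan, nan, 5],
                    "c": [nan, 8, nan, 8, 6, 7]})

# True only where df2 is missing AND df1 has a value.
to_zero = df2.isna() & df1.notna()

# mask() replaces the flagged cells with 0 and leaves every other cell as is.
result = df2.mask(to_zero, 0)
print(result)
```

Because the condition checks df2.isna() explicitly, values already present in df2 are never overwritten, whatever df1 contains at that position.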
anky answered Oct 20 '22 14:10



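Another way is to select the positions where df1 is NaN with the pd.DataFrame.isnull method and fill them with the corresponding df2 values, as below (note that this example fills the gaps in df1 from df2 rather than writing 0 into df2):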

>>> import numpy as np
>>> import pandas as pd
>>> df1 = pd.DataFrame({'a': [0, 1, 2], 'b': [1, np.nan, 3], 'c': [np.nan, 2, 4]})
>>> df2 = pd.DataFrame({'a': [0, 1, 2], 'b': [1, np.nan, 3], 'c': [3, 2, 4]})
>>> df1
   a    b    c
0  0  1.0  NaN
1  1  NaN  2.0
2  2  3.0  4.0
>>> df2
   a    b  c
0  0  1.0  3
1  1  NaN  2
2  2  3.0  4
>>> df1[df1.isnull()] = df2
>>> df1
   a    b    c
0  0  1.0  3.0
1  1  NaN  2.0
2  2  3.0  4.0
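
The same boolean-indexing assignment can be pointed at the original question. This is a sketch, assuming df2 may be modified in place and reusing the question's sample data; it is an adaptation, not part of the answer as posted:

```python
import numpy as np
import pandas as pd

nan = np.nan
# Sample data copied from the question.
df1 = pd.DataFrame({"a": [1, 2, 5, 8, 7, nan],
                    "b": [5, 4, 8, 8, 3, 5],
                    "c": [nan, 8, 5, 1, 2, 1]})
df2 = pd.DataFrame({"a": [5, nan, 3, nan, 9, nan],
                    "b": [5, 4, 8, nan, nan, 5],
                    "c": [nan, 8, nan, 8, 6, 7]})

# Assign 0 in place wherever df2 is missing but df1 holds a value.
df2[df2.isna() & df1.notna()] = 0
print(df2)
```

Cells where both frames are NaN, such as row 5 of column a, stay NaN because the combined condition is False there.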
Felipe Whitaker answered Oct 20 '22 15:10



You could fill the NaN values in df2 with True or False according to df1.isna(), then replace False with 0 and True with NaN:

df2.fillna(df1.isna()).replace(False,0).replace(True,np.nan)

     a    b    c
0  5.0  5.0  NaN
1  0.0  4.0  8.0
2  3.0  8.0  0.0
3  0.0  0.0  8.0
4  9.0  0.0  6.0
5  NaN  5.0  7.0
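
The intermediate True/False placeholders can be avoided by computing the flags once and choosing between 0 and the original value directly. The sketch below uses np.where, which is a different technique from the chained replace above; it assumes the question's sample data, and the names flags and result are illustrative:

```python
import numpy as np
import pandas as pd

nan = np.nan
# Sample data copied from the question.
df1 = pd.DataFrame({"a": [1, 2, 5, 8, 7, nan],
                    "b": [5, 4, 8, 8, 3, 5],
                    "c": [nan, 8, 5, 1, 2, 1]})
df2 = pd.DataFrame({"a": [5, nan, 3, nan, 9, nan],
                    "b": [5, 4, 8, nan, nan, 5],
                    "c": [nan, 8, nan, 8, 6, 7]})

# Boolean array: True where df2 is NaN and df1 is not.
flags = (df2.isna() & df1.notna()).to_numpy()

# np.where returns a plain array, so the index and columns are restored.
result = pd.DataFrame(np.where(flags, 0.0, df2.to_numpy()),
                      index=df2.index, columns=df2.columns)
print(result)
```

Working on the underlying arrays keeps the result in a float dtype throughout, with no boolean values passing through the frame.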
sophocles answered Oct 20 '22 15:10
