Replacing values in a DataFrame according to a value from another DataFrame with the same shape (Python)

Tags: python, pandas, dataframe

I have two DataFrames, df1 and df2, with exactly the same size and column names but different values. df2 has many NaN values and df1 only a few. I want every NaN in df2 to become 0 wherever df1 has a value (anything other than NaN) at the same position.
E.g.:

df1
    a    b   c
0   1    5   NaN
1   2    4   8
2   5    8   5
3   8    8   1
4   7    3   2  
5   NaN  5   1

df2
    a    b   c
0   5    5   NaN
1   NaN  4   8
2   3    8   NaN
3   NaN  NaN 8
4   9    NaN 6  
5   NaN  5   7

The result should look like this.

df2
    a    b   c
0   5    5   NaN
1   0    4   8
2   3    8   0
3   0    0   8
4   9    0   6  
5   NaN  5   7

I am still new to Python and have not found a solution so far. I tried the following, without success:

for row in range(len(df1)):
    if df1.iloc[row,1:] >= 0:
        df2[row,1:] == 0 
    elif df1.iloc[row,1:] == '':
        df2.iloc[row,1:] == ''

asked Apr 28 '21 15:04

Etiende

3 Answers

You can first set df2 to 0 where df1 is not null, then take np.fmax, which ignores NaN when computing the element-wise maximum of two arrays:

np.fmax(df2,df2.mask(df1.notna(),0))

EDIT: thanks to @Ben.T for pointing out that the above only works with positive values. Use the following instead:

df2.fillna(0).where(df1.notna())

     a    b    c
0  5.0  5.0  NaN
1  0.0  4.0  8.0
2  3.0  8.0  0.0
3  0.0  0.0  8.0
4  9.0  0.0  6.0
5  NaN  5.0  7.0
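
For anyone who wants to run this, here is a minimal sketch that rebuilds the two frames from the question and applies the expression above. The names `result`, `alt`, `left`, `right`, `lossy` and `kept` are mine, not from the answer. One caveat, offered tentatively: `fillna(0).where(df1.notna())` also blanks out any df2 value that sits where df1 is NaN. That never happens in the example data, but if it can happen in yours, masking only the cells that are NaN in df2 and non-NaN in df1 may be closer to what the question asks.

```python
import numpy as np
import pandas as pd

nan = np.nan

# Frames copied from the question
df1 = pd.DataFrame({
    "a": [1, 2, 5, 8, 7, nan],
    "b": [5, 4, 8, 8, 3, 5],
    "c": [nan, 8, 5, 1, 2, 1],
})
df2 = pd.DataFrame({
    "a": [5, nan, 3, nan, 9, nan],
    "b": [5, 4, 8, nan, nan, 5],
    "c": [nan, 8, nan, 8, 6, 7],
})

# Expression from the answer: fill every NaN with 0,
# then put NaN back wherever df1 is NaN
result = df2.fillna(0).where(df1.notna())

# Alternative (my assumption of the intent): only touch cells
# that are NaN in df2 and hold a value in df1
alt = df2.mask(df2.isna() & df1.notna(), 0)

print(result)
print(result.equals(alt))  # identical on the question's data

# Where the two differ: df2 has a value but df1 is NaN
left = pd.DataFrame({"x": [nan, 1.0]})    # plays the role of df1
right = pd.DataFrame({"x": [7.0, nan]})   # plays the role of df2

lossy = right.fillna(0).where(left.notna())      # the 7.0 becomes NaN
kept = right.mask(right.isna() & left.notna(), 0)  # the 7.0 survives
print(lossy)
print(kept)
```

Neither expression modifies df2 in place; assign the result back (for example `df2 = alt`) if that is what you need.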

answered Oct 20 '22 14:10

anky

Another way is to select the positions where df1 is NaN with the pd.DataFrame.isnull method and fill them with the corresponding df2 values, as below:

>>> import numpy as np
>>> import pandas as pd
>>> df1 = pd.DataFrame({'a': [0, 1, 2], 'b': [1, np.nan, 3], 'c': [np.nan, 2, 4]})
>>> df2 = pd.DataFrame({'a': [0, 1, 2], 'b': [1, np.nan, 3], 'c': [3, 2, 4]})
>>> df1
   a    b    c
0  0  1.0  NaN
1  1  NaN  2.0
2  2  3.0  4.0
>>> df2
   a    b  c
0  0  1.0  3
1  1  NaN  2
2  2  3.0  4
>>> df1[df1.isnull()] = df2
>>> df1
   a    b    c
0  0  1.0  3.0
1  1  NaN  2.0
2  2  3.0  4.0
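
As a side note, the same fill can probably be written without boolean-mask assignment by passing the second frame to `fillna`, which aligns on index and column labels. This is a sketch on the same toy frames, not part of the original answer, and the name `filled` is mine. Note that, like the transcript above, it fills df1 from df2 rather than writing zeros into df2.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"a": [0, 1, 2], "b": [1, np.nan, 3], "c": [np.nan, 2, 4]})
df2 = pd.DataFrame({"a": [0, 1, 2], "b": [1, np.nan, 3], "c": [3, 2, 4]})

# Only the NaN cells of df1 are filled, each from the cell with the
# same row and column label in df2; df1 itself is left unchanged
filled = df1.fillna(df2)
print(filled)
```

The cell at row 1, column b stays NaN because df2 has no value there either.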

answered Oct 20 '22 15:10

Felipe Whitaker

You could fill the NaN values in df2 with True or False depending on df1.isna(), then replace False with 0 and True with NaN:

df2.fillna(df1.isna()).replace(False,0).replace(True,np.nan)

      a    b    c
0  5.0  5.0  NaN
1  0.0  4.0  8.0
2  3.0  8.0  0.0
3  0.0  0.0  8.0
4  9.0  0.0  6.0
5  NaN  5.0  7.0

answered Oct 20 '22 15:10

sophocles
