Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python Pandas update a dataframe value from another dataframe

I have two dataframes in python. I want to update rows in first dataframe using matching values from another dataframe. Second dataframe serves as an override.

Here is an example with same data and code:

DataFrame 1 :

enter image description here

DataFrame 2:

enter image description here

I want to update update dataframe 1 based on matching code and name. In this example Dataframe 1 should be updated as below:

enter image description here

Note : Row with Code =2 and Name= Company2 is updated with value 1000 (coming from Dataframe 2)

import pandas as pd

data1 = {
         'Code': [1, 2, 3],
         'Name': ['Company1', 'Company2', 'Company3'],
         'Value': [200, 300, 400],

    }
df1 = pd.DataFrame(data1, columns= ['Code','Name','Value'])

data2 = {
         'Code': [2],
         'Name': ['Company2'],
         'Value': [1000],
    }

df2 = pd.DataFrame(data2, columns= ['Code','Name','Value'])

Any pointers or hints?

like image 971
ProgSky Avatar asked Apr 19 '18 19:04

ProgSky


People also ask

How do you assign values from one DataFrame to another in python?

assign() method assign new columns to a DataFrame, returning a new object (a copy) with the new columns added to the original ones. Existing columns that are re-assigned will be overwritten. Length of newly assigned column must match the number of rows in the dataframe.


2 Answers

Using DataFrame.update, which aligns on indices (https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.update.html):

>>> df1.set_index('Code', inplace=True)
>>> df1.update(df2.set_index('Code'))
>>> df1.reset_index()  # to recover the initial structure

   Code      Name   Value
0     1  Company1   200.0
1     2  Company2  1000.0
2     3  Company3   400.0
like image 187
Nic Avatar answered Oct 22 '22 15:10

Nic


You can using concat + drop_duplicates which updates the common rows and adds the new rows in df2

pd.concat([df1,df2]).drop_duplicates(['Code','Name'],keep='last').sort_values('Code')
Out[1280]: 
   Code      Name  Value
0     1  Company1    200
0     2  Company2   1000
2     3  Company3    400

Update due to below comments

df1.set_index(['Code', 'Name'], inplace=True)

df1.update(df2.set_index(['Code', 'Name']))

df1.reset_index(drop=True, inplace=True)
like image 42
BENY Avatar answered Oct 22 '22 16:10

BENY