Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

pandas merging dataframes in a loop [duplicate]

I created a loop to read sqlite database into a pandas dataframe, and I am trying to merge them together based on "Code"

...
df = pandas.Dataframe()  # Creating an empty dataframe for merging at the end

items = ["tb1", "tb2", "tb3"]

for each_item in items:
    my_value = pandas.read_sql_query('select "Code", "Name", "Value" from {tb_name} where "Value" is not null'
                                     .format(tbl_name='"%s"' % each_item), con=engine)

    print(my_value)

    # This below code is my attempt to merge the dataframes that was obtained through the for loop
    merge_value = pandas.merge(my_value, df, on='Code', how='outer')

my_value results:

# tb1 results
     Code          Name      Value
0     C01         Name1   0.010000
1     C02         Name2   0.001200
2     C03         Name3   0.000300
3     C04         Name4   0.001700

# tb2 results
     Code          Name      Value
0     C03         Name3   0.010000
1     C04         Name4   0.001200
2     C05         Name5   0.000300
3     C06         Name6   0.001700

# tb3 results
     Code          Name      Value
0     C01         Name1   0.010000
1     C02         Name2   0.001200
2     C05         Name5   0.000300
3     C06         Name6   0.001700

I am trying to merge them into one table like below:

# desired results
     Code          Name    Value_x   Value_y    Value_Z
0     C01         Name1   0.010000      NULL   0.010000      
1     C02         Name2   0.001200      NULL   0.001200      
2     C03         Name3   0.000300  0.010000       NULL
3     C04         Name4   0.001700  0.001200       NULL      
4     C05         Name5       NULL  0.000300   0.000300  
5     C06         Name6       NULL  0.001700   0.001700  

How do I merge it? I tried the below, but it produces key error: Code

merge_value = pandas.merge(my_value, df, on='Code', how='outer')
like image 435
jake wong Avatar asked Jan 29 '23 16:01

jake wong


1 Answers

You can use DataFrame.merge instead of pandas.merge

https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.merge.html


Create an empty DataFrame with the columns to prevent the "key error: Code"

df = pd.DataFrame(columns=['Code']) 

then in the loop, you

df.merge(my_value, on='Code', how='outer') 

after my_value is created

like image 198
Jeffrey Ram Avatar answered Feb 01 '23 06:02

Jeffrey Ram