
Merging Dataframes using Pandas

I am trying to merge two dataframes together to create one concise dataframe. The first dataframe holds all possible names of various network devices. The second dataframe contains the names of network devices that actually exist, as well as their corresponding hardware.

I need to merge these two dataframes together so that the device names in the first dataframe are "checked" against what exists in the second dataframe, and then spit out the corresponding piece of hardware in order to perform further analysis later.

Here is a simplified illustration of what I have going on:


print(df1)

  Router_Name Firewall_Name
0     router1     firewall1
1     router2     firewall2
2     router3     firewall3
3     router4     firewall4

print(df2)

  Device_Name Hardware_Platform
0     router2          cisco111
1     router3          cisco222
2   firewall1          cisco333
3   firewall2          cisco444

This would be my desired result after performing a merge:

print(df3)

  Router_Name Hardware_Platform Firewall_Name Hardware_Platform
0     router1               N/A     firewall1          cisco333
1     router2          cisco111     firewall2          cisco444
2     router3          cisco222     firewall3               N/A
3     router4               N/A     firewall4               N/A



I have tried many commands including:

result = pd.concat([df1, df2], axis=1).reindex(df2.index)
print(result)

But this simply combines df1 and df2 as they are, without matching any device names. Can this method even do what I need?

brooklynveezy asked Nov 21 '25 06:11


1 Answer

I got this working with two left joins, one on the router names and one on the firewall names, each producing its own hardware column from hardware_platform. Renaming the columns between merges is a bit cumbersome, but it keeps each step easy to follow. I used your example dataframes as input files.

Step 1: Create dataframes

import pandas as pd

df1 = pd.read_excel('file1.xlsx')
df2 = pd.read_excel('file2.xlsx')

  router_name firewall_name
0     router1     firewall1
1     router2     firewall2
2     router3     firewall3
3     router4     firewall4

  device_name hardware_platform
0     router2          cisco111
1     router3          cisco222
2   firewall1          cisco333
3   firewall2          cisco444
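
If you do not have the Excel files at hand, the same two dataframes can be built inline. This is only a sketch for reproducing the example; it assumes the lowercase column names shown in the tables above.

```python
import pandas as pd

# Same data as the tables above, built inline instead of read from Excel.
df1 = pd.DataFrame({
    "router_name": ["router1", "router2", "router3", "router4"],
    "firewall_name": ["firewall1", "firewall2", "firewall3", "firewall4"],
})
df2 = pd.DataFrame({
    "device_name": ["router2", "router3", "firewall1", "firewall2"],
    "hardware_platform": ["cisco111", "cisco222", "cisco333", "cisco444"],
})

print(df1)
print(df2)
```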

Step 2: First merge (routers)

df2 = df2.rename(columns={"device_name": "router_name"})
m1 = pd.merge(df1, df2, on='router_name', how='left')
m1 = m1.rename(columns={"hardware_platform": "router_hardware"})

  router_name firewall_name router_hardware
0     router1     firewall1             NaN
1     router2     firewall2        cisco111
2     router3     firewall3        cisco222
3     router4     firewall4             NaN

Step 3: Second merge (firewalls)

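df2 = df2.rename(columns={"router_name": "firewall_name"})
m2 = pd.merge(m1, df2, on='firewall_name', how='left')
# Rename the merged-in column so it matches the output shown below
m2 = m2.rename(columns={"hardware_platform": "firewall_hardware"})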

  router_name firewall_name router_hardware firewall_hardware
0     router1     firewall1             NaN          cisco333
1     router2     firewall2        cisco111          cisco444
2     router3     firewall3        cisco222               NaN
3     router4     firewall4             NaN               NaN
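
As a possible alternative to the two merges, the renaming can be avoided by turning the second dataframe into a lookup and mapping each name column through it. This is a minimal sketch, not the approach used in the steps above; it builds the example data inline and assumes every device name appears only once in the second dataframe. The names lookup and result are chosen here for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    "router_name": ["router1", "router2", "router3", "router4"],
    "firewall_name": ["firewall1", "firewall2", "firewall3", "firewall4"],
})
df2 = pd.DataFrame({
    "device_name": ["router2", "router3", "firewall1", "firewall2"],
    "hardware_platform": ["cisco111", "cisco222", "cisco333", "cisco444"],
})

# Lookup Series: index is the device name, values are the hardware platform.
# Assumption: device names in df2 are unique.
lookup = df2.set_index("device_name")["hardware_platform"]

# Map each name column through the lookup; names missing from df2 become NaN.
result = df1.copy()
result["router_hardware"] = result["router_name"].map(lookup)
result["firewall_hardware"] = result["firewall_name"].map(lookup)

print(result)
```

Like the left joins, this keeps every row of the first dataframe and leaves NaN where a device does not exist in the second one.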

Nebulastic answered Nov 22 '25 19:11


