I am trying to merge two dataframes together to create one concise dataframe. The first dataframe holds all possible names of various network devices. The second dataframe contains the names of network devices that actually exist, as well as their corresponding hardware.
I need to merge these two dataframes together so that the device names in the first dataframe are "checked" against what exists in the second dataframe, and then spit out the corresponding piece of hardware in order to perform further analysis later.
Here is a simplified illustration of what I have going on:
print(df1)
Router_Name Firewall_Name
0 router1 firewall1
1 router2 firewall2
2 router3 firewall3
3 router4 firewall4
print(df2)
Device_Name Hardware_Platform
0 router2 cisco111
1 router3 cisco222
2 firewall1 cisco333
3 firewall2 cisco444
This would be my desired result after performing a merge:
print (df3)
Router_Name Hardware_Platform Firewall_Name Hardware_Platform
0 router1 N/A firewall1 cisco333
1 router2 cisco111 firewall2 cisco444
2 router3 cisco222 firewall3 N/A
3 router4 N/A firewall4 N/A
I have tried many commands including:
result = pd.concat([df1, df2], axis=1).reindex(df2.index)
print(result)
But this simply results in stacking df1 and df2 on top of each other. Is using this method even possible?
I got it working with the following example (kinda cumbersome with renaming the columns), but the example is clear enough. I used your example dataframes as input files. Furthermore, I used two left joins and created two columns from the hardware_platform column.
Step 1: Create dataframes
import pandas as pd
df1 = pd.read_excel('file1.xlsx')
df2 = pd.read_excel('file2.xlsx')
router_name firewall_name
0 router1 firewall1
1 router2 firewall2
2 router3 firewall3
3 router4 firewall4
device_name hardware_platform
0 router2 cisco111
1 router3 cisco222
2 firewall1 cisco333
3 firewall2 cisco444
Step 2: First merge (routers)
df2 = df2.rename(columns={"device_name": "router_name"})
m1 = pd.merge(df1, df2, on='router_name', how='left')
m1 = m1.rename(columns={"hardware_platform": "router_hardware"})
router_name firewall_name router_hardware
0 router1 firewall1 NaN
1 router2 firewall2 cisco111
2 router3 firewall3 cisco222
3 router4 firewall4 NaN
Step 3: Second merge (firewalls)
df2 = df2.rename(columns={"router_name": "firewall_name"})
m2 = pd.merge(m1, df2, on='firewall_name', how='left')
router_name firewall_name router_hardware firewall_hardware
0 router1 firewall1 NaN cisco333
1 router2 firewall2 cisco111 cisco444
2 router3 firewall3 cisco222 NaN
3 router4 firewall4 NaN NaN
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With