Can somebody explain data frame joins with pandas to me based on this example?
The first dataframe, let's call it A, looks like this:

The second dataframe, B, looks like this:

I now want to create a plot that compares the values in column running in A with those in B, but only for rows where the string in column graph is the same. (In this example, the first rows of A and B have the same graph, so I want to compare their running values.)
I believe this is what pandas.DataFrame.join is for, but I cannot work out the code needed to join the data frames A and B correctly.
I think I would use merge here:
>>> import pandas as pd
>>> a = pd.DataFrame({"graph": ["as-22july06", "belgium", "cage15"], "running": [2, 879, 4292], "mod": [0.28, 0.94, 0.66], "eps": [220, 176, 1096]})
>>> b = pd.DataFrame({"graph": ["as-22july06", "astro-ph", "cage15"], "running": [395.186, 714.542, 999], "mod": [0.67, 0.74, 0.999]})
>>> a
    eps        graph   mod  running
0   220  as-22july06  0.28        2
1   176      belgium  0.94      879
2  1096       cage15  0.66     4292
>>> b
         graph    mod  running
0  as-22july06  0.670  395.186
1     astro-ph  0.740  714.542
2       cage15  0.999  999.000
>>> a.merge(b, on="graph")
    eps        graph  mod_x  running_x  mod_y  running_y
0   220  as-22july06   0.28          2  0.670    395.186
1  1096       cage15   0.66       4292  0.999    999.000
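By default merge does an inner join, so only graphs present in both frames survive, and overlapping column names get the _x / _y suffixes. As a rough sketch of how this could be taken toward the comparison you describe, the snippet below renames the suffixes and keeps only the two running columns. The suffix names "_a" and "_b" and the variable names merged, comparison and joined are my own choices for illustration, not anything from your code. It also shows what I believe is the equivalent call with DataFrame.join, which matches on the index rather than on a column.

```python
import pandas as pd

a = pd.DataFrame({
    "graph": ["as-22july06", "belgium", "cage15"],
    "running": [2, 879, 4292],
    "mod": [0.28, 0.94, 0.66],
    "eps": [220, 176, 1096],
})
b = pd.DataFrame({
    "graph": ["as-22july06", "astro-ph", "cage15"],
    "running": [395.186, 714.542, 999],
    "mod": [0.67, 0.74, 0.999],
})

# Inner join on "graph"; the suffixes (assumed names) replace the default _x / _y.
merged = a.merge(b, on="graph", how="inner", suffixes=("_a", "_b"))

# Keep only the columns needed for the comparison, indexed by graph name.
comparison = merged.set_index("graph")[["running_a", "running_b"]]
print(comparison)

# Roughly the same result with DataFrame.join, which aligns on the index,
# so both frames are indexed by "graph" first.
joined = a.set_index("graph").join(
    b.set_index("graph"), how="inner", lsuffix="_a", rsuffix="_b"
)
```

If matplotlib is installed, comparison.plot.bar() should then draw the two running values side by side for each shared graph.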