I'm having some trouble getting these two dfs to join the way I would like. The first df has a hierarchical index that I created using df1 = df3.groupby(["STATE_PROV_CODE", "COUNTY"]).size()
to get the count for each county.
STATE_PROV_CODE COUNTY COUNT
AL Autauga County 1
Baldwin County 1
Barbour County 1
Bibb County 1
Blount County 1
STATE_PROV_CODE COUNTY ANSI Cl FIPS
0 AL Autauga County H1 01001
1 AL Baldwin County H1 01003
2 AL Barbour County H1 01005
3 AL Bibb County H1 01007
4 AL Blount County H1 01009
In SQL I would like to do the following:
SELECT STATE_PROV_CODE, COUNTY, FIPS, COUNT,
FROM df1, df2
ON STATE_PROV_CODE, COUNTY
WHERE df1.STATE_PROV_CODE = df2.STATE_PROV_CODE
AND df1.COUNTY = df2.COUNTY
I would like the result to be as follows:
STATE_PROV_CODE COUNTY COUNT FIPS
AL Autauga County 1 01001
Baldwin County 1 01003
Barbour County 1 01005
Bibb County 1 01007
Blount County 1 01009
I believe the way you have set up the groupby results and the second dataframe, this merge call will work:
df = pd.merge(df1, df2, left_index=True, right_on=['STATE_PROV_CODE', 'COUNTY'])
it will unravel the MultiIndex; however, if you want it back, all you have to do is
df = df.set_index(['STATE_PROV_CODE', 'COUNTY'])
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With