Suppose we have two Pandas DataFrames as follows:
import pandas as pd

df1 = pd.DataFrame({'id': ['a', 'b', 'c']})
df1
  id
0  a
1  b
2  c
df2 = pd.DataFrame({'ids': [['b', 'c'], ['a', 'b'], ['a', 'z']],
                    'info': ['asdf', 'zxcv', 'sdfg']})
df2
      ids  info
0  [b, c]  asdf
1  [a, b]  zxcv
2  [a, z]  sdfg
How do I join/merge the rows of df1 with df2 where df1.id is in df2.ids?
In other words, how do I achieve the following:
df3
  id     ids  info
0  a  [a, b]  zxcv
1  a  [a, z]  sdfg
2  b  [b, c]  asdf
3  b  [a, b]  zxcv
4  c  [b, c]  asdf
And also a version of the above aggregated on id, like so:
df3
  id               ids          info
0  a  [[a, b], [a, z]]  [zxcv, sdfg]
1  b  [[a, b], [b, c]]  [zxcv, asdf]
2  c          [[b, c]]        [asdf]
I tried the following:
df1.merge(df2, how='left', left_on='id', right_on='ids')
TypeError: unhashable type: 'list'

df1.id.isin(df2.ids)
TypeError: unhashable type: 'list'
The merge fails because the values in df2.ids are lists, which are unhashable and so can't be used as join keys; the lists need to be flattened to one id per row first. Using stack, merge and groupby.agg:
df = (df2.set_index('info').ids.apply(pd.Series)  # one column per list element
         .stack().reset_index(0, name='id')       # long form: one id per row
         .merge(df2)                              # re-attach the ids lists via info
         .merge(df1, how='right')                 # keep only ids present in df1
         .sort_values('id').reset_index(drop=True))
print(df)
   info id     ids
0  zxcv  a  [a, b]
1  sdfg  a  [a, z]
2  asdf  b  [b, c]
3  zxcv  b  [a, b]
4  asdf  c  [b, c]
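As an aside, on pandas 0.25+ the apply(pd.Series)/stack flattening can be replaced with DataFrame.explode. A minimal sketch of the same idea, assuming the df1/df2 defined in the question (the names flat and df3 are just illustrative):

flat = df2.assign(id=df2['ids']).explode('id')  # one row per single id; 'ids' keeps the list
df3 = df1.merge(flat, on='id')                  # inner join drops ids absent from df1 ('z')
print(df3)

This yields the same five rows, with the id, ids, info column order from the question because df1 is the left frame.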
For aggregation use:
df = df.groupby('id', as_index=False).agg(list)
print(df)
  id          info               ids
0  a  [zxcv, sdfg]  [[a, b], [a, z]]
1  b  [asdf, zxcv]  [[b, c], [a, b]]
2  c        [asdf]          [[b, c]]
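To get the question's id, ids, info column order directly, named aggregation (also pandas 0.25+) lets you name and order the output columns explicitly; a sketch applied to the df3 from the explode example above:

df3_agg = (df3.groupby('id')
              .agg(ids=('ids', list), info=('info', list))  # output columns, in this order
              .reset_index())
print(df3_agg)

The content matches the aggregated output above; only the column order differs.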