I have a pandas DataFrame like this:
UIID ISBN
a 12
b 13
I want to pair every UIID with every ISBN and add a count column to the DataFrame:
UIID ISBN Count
a 12 1
a 13 0
b 12 0
b 13 1
How can this be done in pandas? I know the crosstab function computes the same counts, but I want the data in this long format.
Use crosstab with melt:
# Assign to a new name so the original df stays available
out = pd.crosstab(df['UIID'], df['ISBN']).reset_index().melt('UIID', value_name='count')
print(out)
UIID ISBN count
0 a 12 1
1 b 12 0
2 a 13 0
3 b 13 1
Alternative solution with GroupBy.size and reindex by MultiIndex.from_product:
s = df.groupby(['UIID','ISBN']).size()
mux = pd.MultiIndex.from_product(s.index.levels, names=s.index.names)
out = s.reindex(mux, fill_value=0).reset_index(name='count')
print(out)
UIID ISBN count
0 a 12 1
1 a 13 0
2 b 12 0
3 b 13 1
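The two approaches differ only in row order. As a minimal, self-contained sketch (the sample frame is rebuilt from the question; the names long_ct, long_gb, pairs_ct, pairs_gb and same are just illustrative, and var_name is passed explicitly here for clarity), you can check that they produce the same counts:

```python
import pandas as pd

# Sample data from the question (column names as in the original frame).
df = pd.DataFrame({"UIID": ["a", "b"], "ISBN": [12, 13]})

# Approach 1: crosstab builds the wide count table, melt reshapes it to long form.
# var_name is given explicitly; crosstab already names the column axis "ISBN".
long_ct = (
    pd.crosstab(df["UIID"], df["ISBN"])
    .reset_index()
    .melt("UIID", var_name="ISBN", value_name="count")
)

# Approach 2: count the pairs that exist, then reindex over every possible pair.
s = df.groupby(["UIID", "ISBN"]).size()
mux = pd.MultiIndex.from_product(s.index.levels, names=s.index.names)
long_gb = s.reindex(mux, fill_value=0).reset_index(name="count")

# Sort the crosstab result so both frames list the pairs in the same order.
long_ct = long_ct.sort_values(["UIID", "ISBN"]).reset_index(drop=True)

# Compare row by row as plain tuples, which ignores column dtype differences.
pairs_ct = list(long_ct.itertuples(index=False, name=None))
pairs_gb = list(long_gb.itertuples(index=False, name=None))
same = pairs_ct == pairs_gb

print(long_gb)
print("same rows after sorting:", same)
```

Note that both approaches only build combinations from values that actually occur in the data, so a UIID or ISBN that never appears in df will not show up with a zero count.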