I have a dataframe shown in below.
df =pd.DataFrame({'ID': [1, 2, 3, 4, 5], 'contract1' :["A", "B", "C", "D", "B"],
'contract2' :["C", "A", np.nan, "A", np.nan],
'contract3' :[np.nan, "C", np.nan, np.nan, np.nan] })
df
ID contract1 contract2 contract3
1 A C nan
2 B A C
3 C nan nan
4 D A nan
5 B nan nan
I would like the flag result like this;
ID A B C D
1 1 0 1 0
2 1 1 1 0
3 0 0 1 0
4 1 0 0 1
5 0 1 0 0
This flag table show whether each ID have a each contract. Maybe pivot is available,but I couldn't handle this kind of complex dataframe... Can I ask how to transform ?
A Faster implementation would be to use melt in conjunction with str.get_dummies as shown:
pd.melt(df, id_vars=['ID']).set_index('ID')['value'] \
.str.get_dummies() \
.groupby(level=0) \
.agg(np.sum)
A B C D
ID
1 1 0 1 0
2 1 1 1 0
3 0 0 1 0
4 1 0 0 1
5 0 1 0 0
You can melt your original data frame to long format and then use crosstab() on the ID and value column:
import pandas as pd
df1 = df.set_index('ID').stack().rename("Type").reset_index()
pd.crosstab(df1.ID, df1.Type)
# Type A B C D
# ID
# 1 1 0 1 0
# 2 1 1 1 0
# 3 0 0 1 0
# 4 1 0 0 1
# 5 0 1 0 0
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With