Supposing that I have a DataFrame along the lines of:
term score
0 this 0
1 that 1
2 the other 3
3 something 2
4 anything 1
5 the other 2
6 that 2
7 this 0
8 something 1
How would I go about counting up the instances in the score column by unique values in the term column, producing a result like:
term score 0 score 1 score 2 score 3
0 this 2 0 0 0
1 that 0 1 1 0
2 the other 0 0 1 1
3 something 0 1 1 0
4 anything 0 1 0 0
Related questions I've read here include "Python Pandas counting and summing specific conditions" and "COUNTIF in pandas python over multiple columns with multiple conditions", but neither seems to be quite what I'm looking to do. pivot_table, as mentioned at this question, seems like it could be relevant, but I'm impeded by lack of experience and the brevity of the pandas documentation. Thanks for any suggestions.
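You can also use get_dummies, set_index, and a sum grouped on the index level. The level parameter of sum was removed in pandas 2.0, so groupby(level=0) is used here; sort=False keeps the terms in order of first appearance:
(pd.get_dummies(df.set_index('term'), columns=['score'], prefix_sep=' ')
.groupby(level=0, sort=False)
.sum()
.reset_index())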
Output:
term score 0 score 1 score 2 score 3
0 this 2 0 0 0
1 that 0 1 1 0
2 the other 0 0 1 1
3 something 0 1 1 0
4 anything 0 1 0 0
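To see why summing works here, it may help to look at the intermediate indicator frame. The sketch below rebuilds the sample data from the question and prints the one-hot rows before aggregating them. The names dummies and counts are only illustrative, and dtype=int is an assumption added so the indicators print as 0/1 rather than booleans on newer pandas versions.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "term": ["this", "that", "the other", "something", "anything",
             "the other", "that", "this", "something"],
    "score": [0, 1, 3, 2, 1, 2, 2, 0, 1],
})

# One indicator column per distinct score; every row has exactly one 1.
dummies = pd.get_dummies(df.set_index("term"), columns=["score"],
                         prefix_sep=" ", dtype=int)
print(dummies.head(3))

# Summing the indicators per term turns them into counts.
# sort=False keeps terms in order of first appearance.
counts = dummies.groupby(level=0, sort=False).sum().reset_index()
print(counts)
```

Each original row contributes a single 1 to the column matching its score, so the per-term sum of a column is the number of times that score occurred for that term.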
Use groupby with size and reshape with unstack, then add_prefix:
df = df.groupby(['term','score']).size().unstack(fill_value=0).add_prefix('score ')
Or use crosstab:
df = pd.crosstab(df['term'],df['score']).add_prefix('score ')
Or pivot_table:
df = (df.pivot_table(index='term',columns='score', aggfunc='size', fill_value=0)
.add_prefix('score '))
print(df)
score score 0 score 1 score 2 score 3
term
anything 0 1 0 0
something 0 1 1 0
that 0 1 1 0
the other 0 0 1 1
this 2 0 0 0
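The output above has term as a sorted index and carries a score label on the columns, whereas the question asks for term as a regular column in its original order. A minimal sketch of one way to get there from the crosstab variant, assuming the sample data from the question (the names order and result are only illustrative):

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "term": ["this", "that", "the other", "something", "anything",
             "the other", "that", "this", "something"],
    "score": [0, 1, 3, 2, 1, 2, 2, 0, 1],
})

counts = pd.crosstab(df["term"], df["score"]).add_prefix("score ")

# Terms in order of first appearance rather than alphabetical order.
order = df["term"].unique()

result = (counts.reindex(order)
                # keep the index name, drop the 'score' label on the columns
                .rename_axis(index="term", columns=None)
                .reset_index())
print(result)
```

The same reindex, rename_axis and reset_index chain should apply equally to the groupby and pivot_table results, since all three produce a frame indexed by term.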