pandas

Question

Supposing that I have a DataFrame along the lines of:

    term      score
0   this          0
1   that          1
2   the other     3
3   something     2
4   anything      1
5   the other     2
6   that          2
7   this          0
8   something     1

How would I go about counting the occurrences of each value in the score column for each unique value in the term column? The result should look like:

    term      score 0     score 1     score 2     score 3
0   this            2           0           0           0
1   that            0           1           1           0
2   the other       0           0           1           1
3   something       0           1           1           0
4   anything        0           1           0           0

Related questions I've read here include "Python Pandas counting and summing specific conditions" and "COUNTIF in pandas python over multiple columns with multiple conditions", but neither seems to be quite what I'm looking to do. pivot_table, as mentioned at this question, seems like it could be relevant, but I'm impeded by lack of experience and the brevity of the pandas documentation. Thanks for any suggestions.

Scott Boston · Accepted Answer

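You can also combine get_dummies and set_index with a sum over the index level. The original form of this answer used .sum(level=0), which was removed in pandas 2.0; grouping by the index level is the equivalent, and sort=False keeps the terms in order of first appearance:

(pd.get_dummies(df.set_index('term'), columns=['score'], prefix_sep=' ')
   .groupby(level=0, sort=False)
   .sum()
   .reset_index())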

Output:

        term  score 0  score 1  score 2  score 3
0       this        2        0        0        0
1       that        0        1        1        0
2  the other        0        0        1        1
3  something        0        1        1        0
4   anything        0        1        0        0
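
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and applies the get_dummies approach. It assumes a recent pandas (2.x), where sum no longer accepts a level argument, so the per-term sum is written as groupby(level=0). The variable names dummies and counts are mine, not part of the answer.

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "term": ["this", "that", "the other", "something", "anything",
             "the other", "that", "this", "something"],
    "score": [0, 1, 3, 2, 1, 2, 2, 0, 1],
})

# One indicator column per distinct score: "score 0", "score 1", ...
dummies = pd.get_dummies(df.set_index("term"), columns=["score"], prefix_sep=" ")

# Add up the indicators per term; sort=False keeps first-appearance order.
counts = dummies.groupby(level=0, sort=False).sum().reset_index()
print(counts)
```

Because the indicator columns are summed per term, each cell ends up holding the number of rows with that term and that score.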

jezrael · Answer

Use groupby with size, reshape with unstack, and finally rename the columns with add_prefix:

df = df.groupby(['term','score']).size().unstack(fill_value=0).add_prefix('score ')

Or use crosstab:

df = pd.crosstab(df['term'],df['score']).add_prefix('score ')

Or pivot_table:

df = (df.pivot_table(index='term',columns='score', aggfunc='size', fill_value=0)
        .add_prefix('score '))

print(df)
score      score 0  score 1  score 2  score 3
term                                         
anything         0        1        0        0
something        0        1        1        0
that             0        1        1        0
the other        0        0        1        1
this             2        0        0        0
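
Note that this output is indexed by term in sorted order and still carries the name score on the columns axis, whereas the question shows term as a regular column in order of first appearance. A possible sketch for getting that exact layout, using the crosstab variant as the starting point (the same post-processing should apply to the other two). The sample frame is rebuilt from the question, and the name counts is my own:

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({
    "term": ["this", "that", "the other", "something", "anything",
             "the other", "that", "this", "something"],
    "score": [0, 1, 3, 2, 1, 2, 2, 0, 1],
})

counts = (
    pd.crosstab(df["term"], df["score"])
      .add_prefix("score ")                     # 0 -> "score 0", 1 -> "score 1", ...
      .reindex(df["term"].unique())             # rows in order of first appearance
      .rename_axis(index="term", columns=None)  # name the index, drop the "score" axis label
      .reset_index()                            # move term from the index into a column
)
print(counts)
```

The reindex step is only needed if the original row order matters; drop it to keep the alphabetical order shown above.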

pandas - Counting occurrences of a value in a DataFrame per each unique value in another column

Tags:

python

dataframe

pivot-table

Scott Martin
