I have a dataframe df as follows:
| id | movie | value |
|----|-------|-------|
| 1 | a | 0 |
| 2 | a | 0 |
| 3 | a | 20 |
| 4 | a | 0 |
| 5 | a | 10 |
| 6 | a | 0 |
| 7 | a | 20 |
| 8 | b | 0 |
| 9 | b | 0 |
| 10 | b | 30 |
| 11 | b | 30 |
| 12 | b | 30 |
| 13 | b | 10 |
| 14 | c | 40 |
| 15 | c | 40 |
I want to create a pivot table of counts, with one row per value and one column per movie, as follows:
| value | count(a) | count(b) | count(c) |
|-------|----------|----------|-------------|
| 0 | 4 | 2 | 0 |
| 10 | 1 | 1 | 0 |
| 20 | 2 | 0 | 0 |
| 30 | 0 | 3 | 0 |
| 40 | 0 | 0 | 2 |
I can do this very easily in Excel using Row and Column Labels. How can I do this using Python?
In Excel, we can count values in a PivotTable through the Value Field Settings. Note that a plain count includes repeats: a count of 16 for clients, for instance, can correspond to only 4 distinct clients.
Counting distinct values in a pandas pivot table: if we want to count the unique values of a column rather than every row, we need a different aggregation function. Passing aggfunc=pd.Series.nunique counts only the distinct values within each group of the pivoted DataFrame.
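As a minimal sketch of the distinct-count idea, the snippet below rebuilds the question's dataframe by hand (the construction itself is an assumption about how df was created) and counts how many different values each movie has.

```python
import pandas as pd

# Rebuild the example dataframe from the question (construction is assumed).
df = pd.DataFrame({
    "id": range(1, 16),
    "movie": list("aaaaaaabbbbbbcc"),
    "value": [0, 0, 20, 0, 10, 0, 20, 0, 0, 30, 30, 30, 10, 40, 40],
})

# One row per movie; each cell holds the number of distinct values seen.
distinct = df.pivot_table(index="movie", values="value", aggfunc=pd.Series.nunique)
print(distinct)
```

Here movie a has seven rows but only three distinct values (0, 10 and 20), which is the difference between a plain count and a distinct count.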
A pandas pivot table has three main elements: index specifies the row-level grouping, columns specifies the column-level grouping, and values names the (usually numeric) data you want to summarise.
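To make the three elements concrete, here is a sketch that applies them to the question's data. The dataframe is rebuilt by hand, and using the id column as the thing being counted is an assumption; any column without missing entries would give the same counts.

```python
import pandas as pd

# Rebuild the example dataframe from the question (construction is assumed).
df = pd.DataFrame({
    "id": range(1, 16),
    "movie": list("aaaaaaabbbbbbcc"),
    "value": [0, 0, 20, 0, 10, 0, 20, 0, 0, 30, 30, 30, 10, 40, 40],
})

counts = df.pivot_table(
    index="value",     # row-level grouping
    columns="movie",   # column-level grouping
    values="id",       # column whose entries are counted
    aggfunc="count",
    fill_value=0,      # combinations that never occur show 0 instead of NaN
)
print(counts)
```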
You can use pd.crosstab:

    pd.crosstab(df['value'], df['movie'])
    Out[24]:
    movie  a  b  c
    value
    0      4  2  0
    10     1  1  0
    20     2  0  0
    30     0  3  0
    40     0  0  2
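For anyone who wants to run the answer end to end, the following sketch rebuilds the dataframe (an assumed construction) and also shows a groupby-based alternative that should give the same counts. The variable names table and alt are only illustrative.

```python
import pandas as pd

# Rebuild the example dataframe from the question (construction is assumed).
df = pd.DataFrame({
    "id": range(1, 16),
    "movie": list("aaaaaaabbbbbbcc"),
    "value": [0, 0, 20, 0, 10, 0, 20, 0, 0, 30, 30, 30, 10, 40, 40],
})

# Cross-tabulation: rows are values, columns are movies, cells are counts.
table = pd.crosstab(df["value"], df["movie"])

# Alternative: count rows per (value, movie) pair, then spread movies into columns.
alt = df.groupby(["value", "movie"]).size().unstack(fill_value=0)

print(table)
print(alt)
```

pd.crosstab is the shorter option when only counts are needed; the groupby form can be handy when the counting step is part of a longer chain of operations.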