I have a dataframe that is like this:
| A | B | C | D |
|---|---|----|---|
| 1 | 3 | 10 | 4 |
| 2 | 3 | 1 | 5 |
| 1 | 7 | 9 | 3 |
Here A, B, C and D are categories, and the values are in the range [1, 10] (some values might not appear in a given column).
I would like to have a dataframe that for every category shows the count of those values. Something like this:
| | A | B | C | D |
|----|---|----|---|---|
| 1 | 2 | 0 | 1 | 0 |
| 2 | 1 | 0 | 0 | 0 |
| 3 | 0 | 2 | 0 | 1 |
| 4 | 0 | 0 | 0 | 1 |
| 5 | 0 | 0 | 0 | 1 |
| 6 | 0 | 0 | 0 | 0 |
| 7 | 0 | 1 | 0 | 0 |
| 8 | 0 | 0 | 0 | 0 |
| 9 | 0 | 0 | 1 | 0 |
| 10 | 0 | 0 | 1 | 0 |
I tried using groupby and pivot_table, but I can't work out which parameters to give.
pandas.Series.value_counts can be applied to each column.
seaborn.heatmap will plot a DataFrame.
import seaborn as sns
import pandas as pd
# dataframe setup
data = {'A': [1, 2, 1], 'B': [3, 3, 7], 'C': [10, 1, 9], 'D': [4, 5, 3]}
df = pd.DataFrame(data)
# create a dataframe of the counts for each column
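# (pd.Series.value_counts is used because the top-level pd.value_counts is deprecated in recent pandas)
counts = df.apply(pd.Series.value_counts)
# display(counts)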
A B C D
1 2.0 NaN 1.0 NaN
2 1.0 NaN NaN NaN
3 NaN 2.0 NaN 1.0
4 NaN NaN NaN 1.0
5 NaN NaN NaN 1.0
7 NaN 1.0 NaN NaN
9 NaN NaN 1.0 NaN
10 NaN NaN 1.0 NaN
# plot
sns.heatmap(counts)
Setting cmap can improve the visualization.
# counts
counts = df.apply(pd.Series.value_counts).fillna(0)
# plot
sns.heatmap(counts, cmap="GnBu", annot=True)
sns.heatmap(counts, annot=True)
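The counts above only contain rows for values that occur in at least one column, so 6 and 8 are missing compared with the table in the question. A minimal sketch of one way to get all ten rows, assuming the valid range is 1 to 10 as the question states (the name full_counts is just illustrative):

```python
import pandas as pd

# Same example data as above.
df = pd.DataFrame({'A': [1, 2, 1], 'B': [3, 3, 7], 'C': [10, 1, 9], 'D': [4, 5, 3]})

# Count the occurrences of each value per column.
counts = df.apply(pd.Series.value_counts)

# Reindex to the full range 1..10 so values that never occur (6 and 8 here)
# still get a row, then replace NaN with 0 and cast the counts to integers.
full_counts = counts.reindex(range(1, 11)).fillna(0).astype(int)

print(full_counts)
```

The resulting frame can be passed to sns.heatmap in the same way as counts.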
This is my first time posting an answer; I hope it is helpful.
import seaborn as sns
import pandas as pd
import numpy as np
data = {'A': [1, 2, 1], 'B': [3, 3, 7], 'C': [10, 1, 9], 'D': [4, 5, 3]}
df = pd.DataFrame(data)
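# build an empty frame indexed by every possible value, 1 through 10
df1 = pd.DataFrame(index=np.arange(1, 11), columns=df.columns)
for col in df.columns:
    # the assignment aligns on the index; values that never occur stay NaN
    df1[col] = df[col].value_counts()
# fillna returns a new frame, so assign the result back
df1 = df1.fillna(0).astype(int)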
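Since the question mentions pivot-style functions, the same table can also be built without an explicit loop. The following is only a sketch of one alternative, not part of the answer above: it reshapes the frame to long form with DataFrame.melt and cross-tabulates with pd.crosstab. The names long_df, 'category' and 'value' are chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 1], 'B': [3, 3, 7], 'C': [10, 1, 9], 'D': [4, 5, 3]})

# Long form: one row per (category, value) pair.
long_df = df.melt(var_name='category', value_name='value')

# Cross-tabulate values against categories; absent combinations are counted as 0.
table = pd.crosstab(long_df['value'], long_df['category'])

# Add rows for values in 1..10 that never occur in any column.
table = table.reindex(range(1, 11), fill_value=0)

print(table)
```

Because crosstab already produces integer counts, no fillna or type cast is needed here.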