Python: Plotting percentage in seaborn bar plot

Question

For a dataframe

import pandas as pd
df=pd.DataFrame({'group':list("AADABCBCCCD"),'Values':[1,0,1,0,1,0,0,1,0,1,0]})

I am trying to draw a bar plot showing the percentage of times each of A, B, C, and D takes the value zero (or one).

I have a roundabout way that works, but I think there has to be a more straightforward way:

import seaborn as sns

# Count each (group, Values) pair, with one column per value
tempdf=df.groupby(['group','Values']).Values.count().unstack().fillna(0)
tempdf['total']=df['group'].value_counts()
tempdf['percent']=tempdf[0]/tempdf['total']*100

tempdf.reset_index(inplace=True)
print(tempdf)

sns.barplot(x='group',y='percent',data=tempdf)

If I were just plotting the mean value, I could simply call sns.barplot on df rather than on tempdf. I am not sure how to do this elegantly when I want to plot percentages.

Thanks,

mgoldwasser · Accepted Answer

You can use Pandas in conjunction with seaborn to make this easier:

import pandas as pd
import seaborn as sns

df = sns.load_dataset("tips")
x, y, hue = "day", "proportion", "sex"
hue_order = ["Male", "Female"]

(df[x]
 .groupby(df[hue])
 .value_counts(normalize=True)
 .rename(y)
 .reset_index()
 .pipe((sns.barplot, "data"), x=x, y=y, hue=hue, hue_order=hue_order))
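
The same idea can be applied to the dataframe from the question. The following is only a sketch: the helper names percent_by_group and plot_percent are made up for illustration, and it assumes that every group contains the value you want to plot (value_counts omits combinations that never occur, so such a group would be missing from the plot).

```python
import pandas as pd

# Data from the question.
df = pd.DataFrame({
    "group": list("AADABCBCCCD"),
    "Values": [1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0],
})


def percent_by_group(frame):
    """Return one row per (group, Values) pair with its share of the group in percent."""
    return (
        frame.groupby("group")["Values"]
        .value_counts(normalize=True)  # proportions within each group
        .mul(100)                      # convert proportions to percent
        .rename("percent")             # avoid a clash with the "Values" index level
        .reset_index()
    )


def plot_percent(frame, value=0):
    """Bar plot of the percentage of rows in each group where Values == value."""
    import seaborn as sns  # imported lazily so percent_by_group works without seaborn

    table = percent_by_group(frame)
    return sns.barplot(x="group", y="percent", data=table[table["Values"] == value])


percent_table = percent_by_group(df)
print(percent_table)
```

Calling plot_percent(df) should then draw one bar per group for the share of zeros, and plot_percent(df, value=1) the share of ones.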

Ted Petrou · Answer

You can use the library Dexplot, which has the ability to return relative frequencies for categorical variables. It has a similar API to Seaborn. Pass the column you would like to get the relative frequency for to the count function. If you would like to subdivide this by another column, do so with the split parameter. The following returns raw counts.

import dexplot as dxp
dxp.count('group', data=df, split='Values')

To get the relative frequencies, set the normalize parameter to the column you want to normalize over. Use True to normalize over the overall total count.

dxp.count('group', data=df, split='Values', normalize='group')

Normalizing over the 'Values' column produces the following graph, where all the '0' bars together sum to 1 (and likewise for the '1' bars).

dxp.count('group', data=df, split='Values', normalize='Values')
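
If it helps to see the numbers behind these three kinds of normalization, here is a rough pandas-only sketch using pd.crosstab on the dataframe from the question. It does not call Dexplot; it assumes that normalizing over 'group' corresponds to per-row proportions in the table below and normalizing over 'Values' to per-column proportions, as described above.

```python
import pandas as pd

# Data from the question.
df = pd.DataFrame({
    "group": list("AADABCBCCCD"),
    "Values": [1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0],
})

# Raw counts: one row per group, one column per value (0 or 1).
counts = pd.crosstab(df["group"], df["Values"])

# Normalize within each group: every row sums to 1.
within_group = pd.crosstab(df["group"], df["Values"], normalize="index")

# Normalize within each value: every column sums to 1.
within_value = pd.crosstab(df["group"], df["Values"], normalize="columns")

# Normalize over the grand total: all cells together sum to 1.
overall = pd.crosstab(df["group"], df["Values"], normalize="all")

print(within_group.round(3))
```

Any of these tables can be plotted directly, for example with within_group.plot(kind="bar"), if you want to check the figures without another plotting library.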

Tags: python, pandas, seaborn, bar-chart

Asked by PagMax
