Let's assume I have a dataframe and I'm looking at two of its columns (two series). Using one of the columns, "no_employees", can someone help me figure out how to create 6 different pie charts or bar charts (one for each grouping of no_employees) that illustrate the value counts for the Yes/No values in the treatment column? I'll use matplotlib or seaborn, whichever is easiest.
I'm using the following line of code to generate the output below.
dataframe_title.groupby(['no_employees']).treatment.value_counts()
But now I'm stuck. Do I use seaborn? .plot? This seems like it should be easy, and I know there are some cases where I can set subplots=True, but I'm really confused. Thank you so much.
no_employees    treatment
1-5             Yes           88
                No            71
100-500         Yes           95
                No            80
26-100          Yes          149
                No           139
500-1000        No            33
                Yes           27
6-25            No           162
                Yes          127
More than 1000  Yes          146
                No           135
Plot a pandas Series as a pie chart
To plot a pie chart, first create a series of counts of each unique value (use the pandas value_counts() function), then plot the resulting series of counts by passing kind='pie' to the pandas Series plot() function.
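As a minimal sketch of the pie chart option asked about in the question, the block below draws one pie per no_employees group on a 2 x 3 grid. The counts are copied from the value_counts output shown in the question; the helper name plot_treatment_pies and the grid layout are assumptions made for illustration, not part of the original post.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Counts copied from the groupby/value_counts output in the question
counts = pd.DataFrame(
    {"Yes": [88, 127, 149, 95, 27, 146], "No": [71, 162, 139, 80, 33, 135]},
    index=["1-5", "6-25", "26-100", "100-500", "500-1000", "More than 1000"],
)


def plot_treatment_pies(counts):
    """Draw one pie chart per row of `counts` (hypothetical helper)."""
    fig, axes = plt.subplots(2, 3, figsize=(10, 6))
    # Pair each subplot with one employee-size group
    for ax, (group, row) in zip(axes.flat, counts.iterrows()):
        ax.pie(row.to_numpy(), labels=row.index, autopct="%.1f%%")
        ax.set_title(group)
    fig.suptitle("treatment by number of employees")
    return fig, axes


fig, axes = plot_treatment_pies(counts)
```

Call plt.show() afterwards to display the figure in a script. Pies make it hard to compare groups against each other, so a grouped bar chart may still be the clearer choice.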
The examples below compare the 'treatment' counts ('Yes' or 'No') per no_employees category.
Tested in pandas 1.3.0, seaborn 0.11.1, and matplotlib 3.4.2.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np # for sample data only
np.random.seed(365)
cats = ['1-5', '6-25', '26-100', '100-500', '500-1000', '>1000']
data = {'no_employees': np.random.choice(cats, size=(1000,)),
        'treatment': np.random.choice(['Yes', 'No'], size=(1000,))}
df = pd.DataFrame(data)
# set a categorical order for the x-axis to be ordered
df.no_employees = pd.Categorical(df.no_employees, categories=cats, ordered=True)
  no_employees treatment
0       26-100        No
1          1-5       Yes
2        >1000        No
3      100-500       Yes
4     500-1000       Yes
Plot with pandas.DataFrame.plot()
This requires grouping, getting the .value_counts, and unstacking with pandas.Series.unstack.
# to get the dataframe in the correct shape, unstack the groupby result
dfu = df.groupby(['no_employees']).treatment.value_counts().unstack()
treatment     No  Yes
no_employees
1-5           78   72
6-25          83   86
26-100        83   76
100-500       91   84
500-1000      78   83
>1000         95   91
# plot
ax = dfu.plot(kind='bar', figsize=(7, 5), xlabel='Number of Employees in Company', ylabel='Count', rot=0)
ax.legend(title='treatment', bbox_to_anchor=(1, 1), loc='upper left')
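The question also mentions subplots=True. A minimal sketch of that route, assuming the same kind of random sample data as above: pandas draws one subplot per column, so the unstacked counts are transposed first to make each no_employees group a column. The variable names and the 2 x 3 layout are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Sample data only, generated the same way as in the answer
np.random.seed(365)
cats = ['1-5', '6-25', '26-100', '100-500', '500-1000', '>1000']
df = pd.DataFrame({'no_employees': np.random.choice(cats, size=1000),
                   'treatment': np.random.choice(['Yes', 'No'], size=1000)})

# Count Yes/No per group, then put the groups in a sensible order
dfu = (df.groupby('no_employees')['treatment']
         .value_counts()
         .unstack(fill_value=0)
         .reindex(cats))

# Transpose so each group becomes a column, i.e. its own subplot
axes = dfu.T.plot(kind='bar', subplots=True, layout=(2, 3),
                  figsize=(10, 6), legend=False, rot=0, sharey=True)
```

Here axes is a 2 x 3 array of matplotlib Axes, one per group, each holding a 'No' bar and a 'Yes' bar.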
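Plot with seaborn
seaborn.barplot()
This requires grouping, getting the .value_counts, and resetting the index with pandas.Series.reset_index.
This may also be done with the figure-level interface, using sns.catplot() with kind='bar'.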
# groupby, get value_counts, and reset the index
dft = df.groupby(['no_employees']).treatment.value_counts().reset_index(name='Count')
   no_employees treatment  Count
0           1-5        No     78
1           1-5       Yes     72
2          6-25       Yes     86
3          6-25        No     83
4        26-100        No     83
5        26-100       Yes     76
6       100-500        No     91
7       100-500       Yes     84
8      500-1000       Yes     83
9      500-1000        No     78
10        >1000        No     95
11        >1000       Yes     91
# plot
p = sns.barplot(x='no_employees', y='Count', data=dft, hue='treatment')
p.legend(title='treatment', bbox_to_anchor=(1, 1), loc='upper left')
p.set(xlabel='Number of Employees in Company')
seaborn.countplot()
This uses the original dataframe, df, without any transformations.
This may also be done with the figure-level interface, using sns.catplot() with kind='count'.
p = sns.countplot(data=df, x='no_employees', hue='treatment')
p.legend(title='treatment', bbox_to_anchor=(1, 1), loc='upper left')
p.set(xlabel='Number of Employees in Company')
[Figure: output of barplot and countplot]
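If six separate charts are wanted, as in the question, the figure-level sns.catplot() can facet on no_employees. The following is a sketch under the same sample-data assumption as above; the col_wrap, height, and title template values are illustrative choices.

```python
import numpy as np
import pandas as pd
import seaborn as sns

# Sample data only, generated the same way as in the answer
np.random.seed(365)
cats = ['1-5', '6-25', '26-100', '100-500', '500-1000', '>1000']
df = pd.DataFrame({'no_employees': np.random.choice(cats, size=1000),
                   'treatment': np.random.choice(['Yes', 'No'], size=1000)})
# An ordered categorical keeps the facets in size order
df['no_employees'] = pd.Categorical(df['no_employees'], categories=cats, ordered=True)

# One count plot per no_employees group, three facets per row
g = sns.catplot(data=df, x='treatment', col='no_employees', col_wrap=3,
                kind='count', order=['Yes', 'No'], height=3)
g.set_axis_labels('treatment', 'Count')
g.set_titles('{col_name} employees')
```

The returned FacetGrid, g, exposes the individual subplots through g.axes if further per-chart styling is needed.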