I'd like to make a bar plot in Python with multiple x-categories from counts of data that are either "Yes" or "No". I've started on some code, but I believe the track I'm on is a slow way of getting to the solution I want. I'd be fine with a solution that uses seaborn, Matplotlib, or pandas, but not Bokeh, because I'd like to make publication-quality figures that scale.
Ultimately what I want is:
Here's the dataset I'm working with:
import pandas as pd
data = [{'ship': 'Yes','canoe': 'Yes', 'cruise': 'Yes', 'kayak': 'No','color': 'Red'},{'ship': 'Yes', 'cruise': 'Yes', 'kayak': 'Yes','canoe': 'No','color': 'Green'},{'ship': 'Yes', 'cruise': 'Yes', 'kayak': 'No','canoe': 'No','color': 'Green'},{'ship': 'Yes', 'cruise': 'Yes', 'kayak': 'No','canoe': 'No','color': 'Red'},{'ship': 'Yes', 'cruise': 'Yes', 'kayak': 'Yes','canoe': 'No','color': 'Red'},{'ship': 'No', 'cruise': 'Yes', 'kayak': 'No','canoe': 'Yes','color': 'Green'},{'ship': 'No', 'cruise': 'No', 'kayak': 'No','canoe': 'No','color': 'Green'},{'ship': 'No', 'cruise': 'No', 'kayak': 'No','canoe': 'No','color': 'Red'}]
df = pd.DataFrame(data)
This is what I've started with:
print(df['color'].value_counts())
red = 4 # there must be a better way to code this rather than manually. Perhaps using len()?
green = 4
# get count per type
ca = df['canoe'].value_counts()
cr = df['cruise'].value_counts()
ka = df['kayak'].value_counts()
sh = df['ship'].value_counts()
print(ca, cr, ka, sh)
# group by color
cac = df.groupby(['canoe','color'])
crc = df.groupby(['cruise','color'])
kac = df.groupby(['kayak','color'])
shc = df.groupby(['ship','color'])
# make plots
cac2 = cac['color'].value_counts().unstack()
cac2.plot(kind='bar', title = 'Canoe by color')
But really what I want is all of the x-categories to be on one plot, only showing the result for "Yes" responses, and taken as the proportion of "Yes" rather than just counts. Help?
You can use the following syntax to create a bar plot from a GroupBy function in pandas:
# calculate sum of values by group
df_groups = df.groupby(['group_var'])['values_var'].sum()
# create bar plot by group
df_groups.plot(kind='bar')
The following example shows how to use this syntax in practice.
import matplotlib.pyplot as plt
# calculate sum of points for each team
df_groups = df.groupby('team')['points'].sum()
# create bar plot by group
df_groups.plot(kind='bar')
The x-axis shows the name of each team and the y-axis shows the sum of the points scored by each team. Note: The complete documentation for the GroupBy function is in the pandas documentation.
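The snippet above assumes a DataFrame with team and points columns that is never shown. A minimal self-contained sketch follows; the team names and point values are made up purely for illustration, and df_teams is a name introduced here.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical data: team names and point values are invented for illustration.
df_teams = pd.DataFrame({
    "team": ["A", "A", "B", "B", "C"],
    "points": [10, 12, 7, 9, 15],
})

# Sum of points per team; the result is a Series indexed by team.
df_groups = df_teams.groupby("team")["points"].sum()

# One bar per team, with bar height equal to the summed points.
ax = df_groups.plot(kind="bar")
ax.set_ylabel("points (sum)")
```

Call plt.show() afterwards to display the figure in an interactive session.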
I'm not exactly sure I understand the question correctly, but it looks like it would make more sense to look at the proportion of "Yes" answers per boat type and color.
import matplotlib.pyplot as plt
import pandas as pd
data = [{'ship': 'Yes','canoe': 'Yes', 'cruise': 'Yes', 'kayak': 'No','color': 'Red'},{'ship': 'Yes', 'cruise': 'Yes', 'kayak': 'Yes','canoe': 'No','color': 'Green'},{'ship': 'Yes', 'cruise': 'Yes', 'kayak': 'No','canoe': 'No','color': 'Green'},{'ship': 'Yes', 'cruise': 'Yes', 'kayak': 'No','canoe': 'No','color': 'Red'},{'ship': 'Yes', 'cruise': 'Yes', 'kayak': 'Yes','canoe': 'No','color': 'Red'},{'ship': 'No', 'cruise': 'Yes', 'kayak': 'No','canoe': 'Yes','color': 'Green'},{'ship': 'No', 'cruise': 'No', 'kayak': 'No','canoe': 'No','color': 'Green'},{'ship': 'No', 'cruise': 'No', 'kayak': 'No','canoe': 'No','color': 'Red'}]
df = pd.DataFrame(data)
ax = df.replace(["Yes","No"],[1,0]).groupby("color").mean().transpose().plot.bar(color=["g","r"])
ax.set_title('Proportion of "Yes" answers per boat type and color')
plt.show()
This means, for example, that 25% of the rows with color Green answered "Yes" for canoe.
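A variation on the same idea, offered as a sketch rather than a definitive version: comparing each answer against "Yes" gives booleans, and the group mean of booleans is the proportion of "Yes", so the result does not depend on replace turning the text columns into numbers. It also uses value_counts to get the number of rows per color instead of hard-coding red = 4 and green = 4. The names boat_cols, is_yes, props and color_counts are introduced here for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Same eight rows as the dataset above, written column-wise for brevity.
df = pd.DataFrame({
    "ship":   ["Yes", "Yes", "Yes", "Yes", "Yes", "No", "No", "No"],
    "canoe":  ["Yes", "No", "No", "No", "No", "Yes", "No", "No"],
    "cruise": ["Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "No", "No"],
    "kayak":  ["No", "Yes", "No", "No", "Yes", "No", "No", "No"],
    "color":  ["Red", "Green", "Green", "Red", "Red", "Green", "Green", "Red"],
})

# Number of rows per color, computed rather than typed in by hand.
color_counts = df["color"].value_counts()

boat_cols = ["canoe", "cruise", "kayak", "ship"]

# True where the answer is "Yes"; the mean of booleans is the share of True.
is_yes = df[boat_cols].eq("Yes")

# Rows: boat type, columns: color (sorted, so Green comes before Red).
props = is_yes.groupby(df["color"]).mean().T

# Grouped bars: one group per boat type, one bar per color.
ax = props.plot.bar(color=["g", "r"], rot=0)
ax.set_ylim(0, 1)
ax.set_ylabel('Proportion of "Yes"')
ax.set_title('Proportion of "Yes" answers per boat type and color')
```

For a figure that scales, ax.figure.savefig("boats.pdf") writes a vector file (the filename is arbitrary); plt.show() displays it interactively instead.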