I have data from a control and treatment group. Is matplotlib able to create a bar chart where the bar height is the mean of each group overlaid with the individual data points from that group? I'd like to visualize the spread of the actual data points, similar to what is displayed here.
I've thought about using a combination of boxplots and scatter, but my attempts have not succeeded.
Here is a solution doing exactly what you mention: overlay a bar graph with a scatter plot.
You can of course tweak the plot further: title, axis labels, colors, bar width, marker shape of the scatter plot, and so on.
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(123)
w = 0.8    # bar width
x = [1, 2] # x-coordinates of your bars
colors = [(0, 0, 1, 1), (1, 0, 0, 1)]    # corresponding colors
y = [np.random.random(30) * 2 + 5,       # data series
     np.random.random(10) * 3 + 8]

fig, ax = plt.subplots()
ax.bar(x,
       height=[np.mean(yi) for yi in y],
       yerr=[np.std(yi) for yi in y],    # error bars
       capsize=12,                       # error bar cap width in points
       width=w,                          # bar width
       tick_label=["control", "test"],
       color=(0, 0, 0, 0),               # face color transparent
       edgecolor=colors,
       # ecolor=colors,                  # error bar colors; setting this raises an error for whatever reason.
       )

for i in range(len(x)):
    # distribute scatter randomly across whole width of bar
    ax.scatter(x[i] + np.random.random(y[i].size) * w - w / 2, y[i], color=colors[i])

plt.show()
This yields the following graph:
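If you have more than two groups, the same idea can be wrapped in a small helper. The following is only a minimal sketch and not part of the answer above: the names `jitter` and `plot_bar_with_points` are made up for illustration, the input is assumed to be a dict mapping a label to a list of numbers, and matplotlib is imported inside the plotting function so the jitter helper can be used on its own.

```python
import random
import statistics


def jitter(center, n, width, rng):
    """Return n x-coordinates spread uniformly over a bar of the given width."""
    return [center + (rng.random() - 0.5) * width for _ in range(n)]


def plot_bar_with_points(groups, width=0.8, seed=123):
    """Draw one bar per group (height = mean) with the raw points on top.

    `groups` maps a label to a list of numbers (an assumed input format).
    """
    import matplotlib.pyplot as plt  # imported lazily; only needed for drawing

    rng = random.Random(seed)
    labels = list(groups)
    xs = list(range(1, len(labels) + 1))
    means = [statistics.mean(groups[k]) for k in labels]
    # population standard deviation, which is also what np.std computes by default
    stds = [statistics.pstdev(groups[k]) for k in labels]

    fig, ax = plt.subplots()
    ax.bar(xs, height=means, yerr=stds, capsize=12, width=width,
           tick_label=labels, color="none", edgecolor="black")
    for x, label in zip(xs, labels):
        values = groups[label]
        # scatter the observations across the width of their bar
        ax.scatter(jitter(x, len(values), width, rng), values)
    return fig, ax
```

Calling `plot_bar_with_points({"control": [...], "test": [...]})` and then `plt.show()` should give one outlined bar per group with its points on top. Passing a seeded `random.Random` keeps the horizontal scatter reproducible between runs.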
Here is a solution using Seaborn, which needs less code but gives up some flexibility compared to using Matplotlib directly:
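import matplotlib.pyplot as plt
import seaborn as sns

sns.set(style="whitegrid")
tips = sns.load_dataset("tips")
# bar height = mean per day; ci="sd" draws standard-deviation error bars
# (seaborn 0.12 and later deprecate ci in favor of errorbar="sd")
sns.barplot(x="day", y="total_bill", data=tips, capsize=.1, ci="sd")
# overlay the individual observations
sns.swarmplot(x="day", y="total_bill", data=tips, color="0", alpha=.35)
plt.show()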
And here is the result:
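The tips dataset is grouped by day rather than by control and treatment. To use the same two calls on data like that in the question, the values need to be in long form, with one label per observation. A minimal sketch, assuming seaborn 0.12 or later for `errorbar="sd"`; `to_long_form` and `plot_groups` are hypothetical helper names, and the input is assumed to be a dict mapping a label to a list of numbers:

```python
def to_long_form(groups):
    """Flatten {label: [values]} into parallel label and value lists."""
    labels, values = [], []
    for label, series in groups.items():
        labels.extend([label] * len(series))
        values.extend(series)
    return labels, values


def plot_groups(groups):
    """Bar of the mean per group with a swarm of the raw points on top."""
    import seaborn as sns  # imported lazily; assumes seaborn >= 0.12

    labels, values = to_long_form(groups)
    # bar height is the mean of each group, error bars show the standard deviation
    ax = sns.barplot(x=labels, y=values, capsize=.1, errorbar="sd")
    # draw the individual observations on the same axes
    sns.swarmplot(x=labels, y=values, color="0", alpha=.35, ax=ax)
    return ax
```

With this, `plot_groups({"control": [...], "test": [...]})` followed by `plt.show()` should draw one bar per group. The swarm plot places points so they do not overlap, which is why no manual jitter is needed here.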