I have the following dataframe:
mean std
insert quality
0.0 good 0.009905 0.003662
0.1 good 0.450190 0.281895
poor 0.376818 0.306806
0.2 good 0.801856 0.243288
poor 0.643859 0.322378
0.3 good 0.833235 0.172025
poor 0.698972 0.263266
0.4 good 0.842288 0.141925
poor 0.706708 0.241269
0.5 good 0.853634 0.118604
poor 0.685716 0.208073
0.6 good 0.845496 0.118609
poor 0.675907 0.207755
0.7 good 0.826335 0.133820
poor 0.656934 0.222823
0.8 good 0.829707 0.130154
poor 0.627111 0.213046
0.9 good 0.816636 0.137371
poor 0.589331 0.232756
1.0 good 0.801211 0.147864
poor 0.554589 0.245867
What should I do if wanted to plot 2 curves (points + errors) using as the X axis the index column "Insert"
and differentiating the two curves by "Quality"
[good, poor]? They should be of different colors too.
I'm kinda stuck, I produced every kind of plot apart the one I need.
When to use standard error? It depends. If the message you want to carry is about the spread and variability of the data, then standard deviation is the metric to use. If you are interested in the precision of the means or in comparing and testing differences between means then standard error is your metric.
Error bars may show confidence intervals, standard errors, standard deviations, or other quantities.
You could loop through the groups in df.groupby('quality')
and call group.plot
on each group.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({
'insert': [0.0, 0.1, 0.1, 0.2, 0.2, 0.3, 0.3, 0.4, 0.4, 0.5, 0.5, 0.6, 0.6,
0.7, 0.7, 0.8, 0.8, 0.9, 0.9, 1.0, 1.0],
'mean': [0.009905, 0.45019, 0.376818, 0.801856, 0.643859, 0.833235,
0.698972, 0.842288, 0.706708, 0.853634, 0.685716, 0.845496, 0.675907,
0.826335, 0.656934, 0.829707, 0.627111, 0.816636, 0.589331, 0.801211,
0.554589],
'quality': ['good', 'good', 'poor', 'good', 'poor', 'good', 'poor', 'good',
'poor', 'good', 'poor', 'good', 'poor', 'good', 'poor', 'good', 'poor',
'good', 'poor', 'good', 'poor'],
'std': [0.003662, 0.281895, 0.306806, 0.243288, 0.322378, 0.172025,
0.263266, 0.141925, 0.241269, 0.118604, 0.208073, 0.118609, 0.207755,
0.13382, 0.222823, 0.130154, 0.213046, 0.137371, 0.232756, 0.147864,
0.245867]})
fig, ax = plt.subplots() # 1
for key, group in df.groupby('quality'):
group.plot('insert', 'mean', yerr='std', label=key, ax=ax) # 2
plt.show()
To make both plots appear on the same axes:
axes
object, ax.ax
parameter to the axes
object in each call to group.plot
It might look better as a bar plot:
# fill in missing data with 0, so the bar plots are aligned
df = df.pivot(index='insert', columns='quality').fillna(0).stack().reset_index()
colors = ['green', 'red']
positions = [0, 1]
for group, color, pos in zip(df.groupby('quality'), colors, positions):
key, group = group
print(group)
group.plot('insert', 'mean', yerr='std', kind='bar', width=0.4, label=key,
position=pos, color=color, alpha=0.5, ax=ax)
ax.set_xlim(-1, 11)
plt.show()
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With