I calculated the confidence interval for a set of data with a normal distribution, and I want to plot it as whiskers on a bar chart of the data means. I tried using the yerr parameter of plt.bar, but it draws a standard-deviation-style error, not the confidence interval. I want the same whisker visualization on the bar plot. The confidence intervals I have are:
[(29600.87 , 39367.28 ), ( 37101.74 , 42849.60 ), ( 33661.12 , 41470.25 ), ( 46019.20 , 49577.80)]
Here's my code. I tried feeding the yerr parameter with the confidence intervals, but it didn't work out so well.
means=[np.mean(df.iloc[x]) for x in range(len(df.index))]
CI=[st.t.interval(0.95, len(df.iloc[x])-1, loc=np.mean(df.iloc[x]), scale=st.sem(df.iloc[x])) for x in range(len(df.index))]
plt.figure()
plt.bar(x_axis, means, color='r', yerr=np.reshape(CI, (2, 4)))
plt.xticks(np.arange(1992,1996,1))
Here's the plot I'm getting:
The following should do what you want (assuming that your errors are symmetric; if not then you should go with @ImportanceOfBeingErnest's answer); the plot would look like this:
The code that produces it with some inline comments:
import matplotlib.pyplot as plt
# rough estimates of your means; replace by your actual values
means = [34500, 40000, 37500, 47800]
# the confidence intervals you provided
ci = [(29600.87, 39367.28), (37101.74, 42849.60), (33661.12, 41470.25), (46019.20, 49577.80)]
# half-width of each interval (upper bound minus mean); assumes symmetric errors
y_r = [ci[i][1] - means[i] for i in range(len(ci))]
plt.bar(range(len(means)), means, yerr=y_r, alpha=0.2, align='center')
plt.xticks(range(len(means)), [str(year) for year in range(1992, 1996)])
plt.show()
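If the intervals really are symmetric about the mean (which should hold for intervals built with st.t.interval and loc set to the sample mean, as in the question), the means do not need to be estimated by eye: each mean is the midpoint of its interval and the error is half the interval width. A minimal sketch of that idea, where symmetric_summary is a hypothetical helper name and not part of matplotlib or SciPy:

```python
# Confidence intervals quoted in the question, as (lower, upper) pairs.
ci = [(29600.87, 39367.28), (37101.74, 42849.60),
      (33661.12, 41470.25), (46019.20, 49577.80)]


def symmetric_summary(intervals):
    """Return (midpoints, half_widths) for a list of (lower, upper) pairs.

    Assumes every interval is symmetric about its mean, so the midpoint
    equals the mean and the half-width equals the error.
    """
    midpoints = [(lo + hi) / 2 for lo, hi in intervals]
    half_widths = [(hi - lo) / 2 for lo, hi in intervals]
    return midpoints, half_widths


means, y_r = symmetric_summary(ci)
print(means)
print(y_r)
```

The resulting means and y_r lists can be passed to plt.bar in place of the rough estimates above, e.g. plt.bar(range(len(means)), means, yerr=y_r).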
The yerr argument to bar can be used to draw the errors as error bars. The errors are defined as the deviation from some value, i.e. quantities are often given in the form y ± err. This means that the confidence interval would be (y-err, y+err).
This can be inverted: given a confidence interval (a, b) and a value y, the errors would be y-a and b-y.
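As a rough illustration of that inversion, the sketch below turns values and their intervals into separate lower and upper errors using plain Python lists. The name ci_to_yerr is made up for this example, and the check for negative errors is an added assumption (a value lying outside its interval is treated as a mistake in the input):

```python
def ci_to_yerr(values, intervals):
    """Convert (a, b) intervals around values y into [lower, upper] errors.

    lower[i] = y - a and upper[i] = b - y, i.e. the distances from each
    value down to the lower bound and up to the upper bound.
    """
    lower = [y - a for y, (a, b) in zip(values, intervals)]
    upper = [b - y for y, (a, b) in zip(values, intervals)]
    # Assumption: each value must lie inside its own interval.
    if any(err < 0 for err in lower + upper):
        raise ValueError("each value must lie inside its interval")
    return [lower, upper]


yerr = ci_to_yerr([30, 100, 60, 80],
                  [(24, 35), (90, 110), (52, 67), (71, 88)])
print(yerr)  # prints [[6, 10, 8, 9], [5, 10, 7, 8]]
```

The returned list has two rows of N entries each, lower errors first and upper errors second.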
In a matplotlib bar plot, the error format can be scalar | N, Nx1 or 2xN array-like. Since we cannot know beforehand whether the y value lies symmetrically in the interval, and since this can differ between realizations (bars), we need to choose the 2 x N format here.
The code below shows how to do that.
import numpy as np
import matplotlib.pyplot as plt
# given some mean values and their confidence intervals,
means = np.array([30, 100, 60, 80])
conf = np.array([[24, 35],[90, 110], [52, 67], [71, 88]])
# calculate the error
yerr = np.c_[means - conf[:, 0], conf[:, 1] - means].T
print(yerr)  # prints [[ 6 10  8  9]
             #         [ 5 10  7  8]]
# and plot it on a bar chart
plt.bar(range(len(means)), means, yerr=yerr)
plt.xticks(range(len(means)))
plt.show()