I have a bunch of data scattered x, y. If I want to bin these according to x and put error bars equal to the standard deviation on them, how would I go about doing that?
The only I know of in python is to loop over the data in x and group them according to bins (max(X)-min(X)/nbins) then loop over those blocks to find the std. I'm sure there are faster ways of doing this with numpy.
I want it to look similar to "vert symmetric" in: http://matplotlib.org/examples/pylab_examples/errorbar_demo.html
An error bar is a line through a point on a graph, parallel to one of the axes, which represents the uncertainty or variation of the corresponding coordinate of the point. In IB Biology, the error bars most often represent the standard deviation of a data set.
A binned scatterplot condenses the information from a scatterplot by partitioning the x-axis into bins, and calculating the mean of y within each bin. Theoretically, this can also be done for quantiles in examining bivariate relationships without controls.
Error bars are graphical representations of the variability of data and used on graphs to indicate the error or uncertainty in a reported measurement. They give a general idea of how precise a measurement is, or conversely, how far from the reported value the true (error free) value might be.
You can bin your data with np.histogram
. I'm reusing code from this other answer to calculate the mean and standard deviation of the binned y
:
import numpy as np
import matplotlib.pyplot as plt
x = np.random.rand(100)
y = np.sin(2*np.pi*x) + 2 * x * (np.random.rand(100)-0.5)
nbins = 10
n, _ = np.histogram(x, bins=nbins)
sy, _ = np.histogram(x, bins=nbins, weights=y)
sy2, _ = np.histogram(x, bins=nbins, weights=y*y)
mean = sy / n
std = np.sqrt(sy2/n - mean*mean)
plt.plot(x, y, 'bo')
plt.errorbar((_[1:] + _[:-1])/2, mean, yerr=std, fmt='r-')
plt.show()
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With