
Turn scatter data into binned data with error bars equal to standard deviation

I have a bunch of scattered (x, y) data. If I want to bin the points according to x and put error bars equal to the standard deviation of y on each bin, how would I go about doing that?

The only way I know of in Python is to loop over the data in x and group the points into bins of width (max(X)-min(X))/nbins, then loop over those groups to find the std. I'm sure there are faster ways of doing this with numpy.

I want it to look similar to "vert symmetric" in: http://matplotlib.org/examples/pylab_examples/errorbar_demo.html

Griff asked Mar 21 '13 20:03



1 Answer

You can bin your data with np.histogram. I'm reusing code from this other answer to calculate the mean and standard deviation of the binned y:

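import numpy as np
import matplotlib.pyplot as plt

# Synthetic scatter data: a sine curve with noise that grows with x
x = np.random.rand(100)
y = np.sin(2*np.pi*x) + 2 * x * (np.random.rand(100)-0.5)
nbins = 10

# Per-bin count, sum of y, and sum of y**2 (all three calls use the same bin edges)
n, bin_edges = np.histogram(x, bins=nbins)
sy, _ = np.histogram(x, bins=nbins, weights=y)
sy2, _ = np.histogram(x, bins=nbins, weights=y*y)
mean = sy / n
std = np.sqrt(sy2/n - mean*mean)  # population std: sqrt(E[y^2] - E[y]^2)

# Raw points in blue; binned means with std error bars in red, drawn at the bin centers
plt.plot(x, y, 'bo')
plt.errorbar((bin_edges[1:] + bin_edges[:-1])/2, mean, yerr=std, fmt='r-')
plt.show()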
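
The histogram-based code above divides by the bin count, so a bin with no points gives a 0/0 division, and rounding can push the variance slightly below zero. As a hedged sketch only (not part of the original answer), the same sums can be collected with np.digitize and np.bincount while guarding both cases. The function name binned_mean_std and its return layout are assumptions made for illustration; empty bins are reported as NaN, and the result is the population standard deviation (ddof=0), as in the answer.

```python
import numpy as np

def binned_mean_std(x, y, nbins=10):
    """Return bin centers, per-bin mean of y, and per-bin population std of y.

    Hypothetical helper for illustration; empty bins yield NaN.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Equal-width bin edges spanning the data, like np.histogram's default
    edges = np.linspace(x.min(), x.max(), nbins + 1)

    # digitize gives 1..nbins for interior points and nbins+1 for x == x.max(),
    # so shift to 0-based indices and clip the maximum into the last bin
    idx = np.clip(np.digitize(x, edges) - 1, 0, nbins - 1)

    # Per-bin count, sum of y, and sum of y**2
    n = np.bincount(idx, minlength=nbins)
    sy = np.bincount(idx, weights=y, minlength=nbins)
    sy2 = np.bincount(idx, weights=y * y, minlength=nbins)

    # Empty bins produce 0/0 here; silence the warning and keep the NaN
    with np.errstate(divide="ignore", invalid="ignore"):
        mean = sy / n
        var = sy2 / n - mean * mean

    # Rounding can make the variance slightly negative; clamp it at zero
    std = np.sqrt(np.clip(var, 0.0, None))

    centers = 0.5 * (edges[1:] + edges[:-1])
    return centers, mean, std
```

The three returned arrays can be passed straight to plt.errorbar(centers, mean, yerr=std, fmt='r-'); matplotlib simply leaves a gap where a bin is NaN.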


Jaime answered Oct 09 '22 17:10