Accounting for errors when creating a histogram

Question

I have a set of N observations distributed as (x[i], y[i]), i=0..N points in a 2D space. Each point has associated errors in both coordinates (e_x[i], e_y[i], i=0..N) and also a weight attached to it (w[i], i=0..N).

I'd like to generate a 2D histogram of these N points, accounting not only for the weights but also for the errors, which would cause each point to be spread possibly among many bins if the error values are large enough (assuming a standard Gaussian distribution for the errors, although other distributions could perhaps be considered).

I see that numpy.histogram2d has a weights parameter so that is taken care of. The issue would be how to account for the errors in each of the N observed points.

Is there a function that would let me do this? I'm open to anything in numpy and scipy.

Gabriel Gleizer · Accepted Answer

Building upon user1415946's comment, you can assume each point represents a bi-variate normal distribution with the covariance matrices given by [[e_x[i]**2,0][0,e_y[i]**2]]. However, the resulting distribution is not a normal distribution - you'll see, after running the example, how the histogram doesn't resemble a Gaussian at all, but instead a group of them.

To create a histogram out of this set of distributions, one way I see is by generating random samples out of each point using numpy.random.multivariate_normal. See the example code below with some artificial data.

import numpy as np
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.pyplot as plt


# This is a function I like to use for plotting histograms
def plotHistogram3d(hist, xedges, yedges):
    fig = plt.figure()
    ax = fig.add_subplot(111, projection='3d')
    hist = hist.transpose()
    # Transposing is done so that bar3d x and y match hist shape correctly
    dx = np.mean(np.diff(xedges))
    dy = np.mean(np.diff(yedges))

    # Computing the number of elements
    elements = (len(xedges) - 1) * (len(yedges) - 1)
    # Generating mesh grids.
    xpos, ypos = np.meshgrid(xedges[:-1]+dx/2.0, yedges[:-1]+dy/2.0)

    # Vectorizing matrices
    xpos = xpos.flatten()
    ypos = ypos.flatten()
    zpos = np.zeros(elements)
    dx = dx * np.ones_like(zpos) * 0.5  # 0.5 factor to give room between bars.
# Use 1.0 if you want all bars 'glued' to each other
    dy = dy * np.ones_like(zpos) * 0.5
    dz = hist.flatten()

    ax.bar3d(xpos, ypos, zpos, dx, dy, dz, color='b')
    ax.set_xlabel('x')
    ax.set_ylabel('y')
    ax.set_zlabel('Count')
    return

"""
INPUT DATA
"""
#                 x  y ex ey  w
data = np.array([[1, 2, 1, 1, 1],
                 [3, 0, 1, 1, 2],
                 [0, 1, 2, 1, 5],
                 [7, 7, 1, 3, 1]])

"""
Generate samples
"""
# Sample size (100 samples will be generated for each data point)
SAMPLE_SIZE = 100
# I want to fill in a table with columns [x, y, w]. Each data point generates SAMPLE_SIZE
# samples, so we have SAMPLE_SIZE * (number of data points) generated points
points = np.zeros((SAMPLE_SIZE * data.shape[0], 3))  # Initializing this matrix

for i, element in enumerate(data):  # For each row in the data set
    meanVector = element[:2]
    covarianceMatrix = np.diag(element[2:4]**2)  # Diagonal matrix with elements equal to error^2
    # For columns 0 and 1, add generated x and y samples
    points[SAMPLE_SIZE*i:SAMPLE_SIZE*(i+1), :2] = \
        np.random.multivariate_normal(meanVector, covarianceMatrix, SAMPLE_SIZE)
    # For column 2, simply copy original weight
    points[SAMPLE_SIZE*i:SAMPLE_SIZE*(i+1), 2] = element[4]  # weights

hist, xedges, yedges = np.histogram2d(points[:, 0], points[:, 1], weights=points[:, 2])
plotHistogram3d(hist, xedges, yedges)
plt.show()

Results plotted below:

enter image description here

Accounting for errors when creating a histogram

Tags:

python

numpy

scipy

histogram

histogram2d

Gabriel

1 Answers

Gabriel Gleizer

Recent Activity

Donate For Us

Accounting for errors when creating a histogram

Tags:

python

numpy

scipy

histogram

histogram2d

Gabriel

1 Answers

Gabriel Gleizer

Related questions

Recent Activity

Donate For Us