Weighted average using numpy.average

I have an array:

In [37]: bias_2e13 # our array
Out[37]: 
[1.7277990734072355,
 1.9718263893212737,
 2.469657573252167,
 2.869022991373125,
 3.314720313010104,
 4.232269039271717]

The error on each value in the array is:

In [38]: bias_error_2e13 # the error on each value
Out[38]: 
array([ 0.13271387,  0.06842465,  0.06937965,  0.23886647,  0.30458249,
        0.57906816])

Now I divide the error on each value by 2:

In [39]: error_half # error divided by 2
Out[39]: 
array([ 0.06635694,  0.03421232,  0.03468982,  0.11943323,  0.15229124,
        0.28953408])

Now I calculate the average of the array using numpy.average, but using the errors as weights.

First I am using the full error on the values, then I am using half the error, i.e. the error divided by 2.

In [40]: test = np.average(bias_2e13,weights=bias_error_2e13)

In [41]: test_2 = np.average(bias_2e13,weights=error_half)

How can both averages give the same result when one set of errors is half the other?

In [42]: test
Out[42]: 3.3604746813456936

In [43]: test_2
Out[43]: 3.3604746813456936
Asked Jul 07 '16 by Srivatsan

2 Answers

Because all of the errors have the same relative weight. Supplying a weights parameter does not change the actual values you are averaging; it only indicates the weight with which each value contributes to the average. In other words, after multiplying each value by its corresponding weight, np.average divides by the sum of the weights, so any uniform scaling of the weights cancels out.

>>> import numpy as np
>>> np.average([1, 2, 3], weights=[0.2, 0.2, 0.2])
2.0
>>> np.average([1, 2, 3])
2.0

Effectively, for an array x of n values with weights w, the average formula is

    avg = (x_1*w_1 + x_2*w_2 + ... + x_n*w_n) / (w_1 + w_2 + ... + w_n)

where each weight is assumed to be equal to 1 when not provided to numpy.average.
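To make the normalisation concrete, here is a short sketch (with arbitrary example numbers) comparing the formula above, np.average, and uniformly rescaled weights:

```python
import numpy as np

values = np.array([1.0, 2.0, 3.0])
weights = np.array([0.2, 0.3, 0.5])

# Manual weighted average: weighted sum divided by the sum of the weights
manual = np.sum(weights * values) / np.sum(weights)

# np.average applies exactly this normalisation...
same = np.average(values, weights=weights)

# ...so multiplying every weight by the same constant changes nothing
rescaled = np.average(values, weights=weights / 2.0)

print(manual, same, rescaled)  # all three agree
```

Only the *relative* sizes of the weights matter, which is why the original `bias_error_2e13` and `error_half` produce identical averages.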

Answered Oct 31 '22 by miradulo


My answer is late, but I hope this will be of use to others looking at this post in the future.

The above answers are spot on with respect to why the results are the same. However, there is a fundamental flaw in how you are calculating your weighted average. The uncertainties in your data ARE NOT the weights that numpy.average expects. You have to calculate your weights first and provide them to numpy.average. This can be done as:

weight = 1/(uncertainty)^2.

(see, for example, this description.)

Therefore, you would calculate your weighted average as:

wts_2e13 = 1. / np.power(bias_error_2e13, 2.)  # weights from the full errors
wts_half = 1. / np.power(error_half, 2.)       # weights from the half errors

test = np.average(bias_2e13, weights=wts_2e13)
test_2 = np.average(bias_2e13, weights=wts_half)

giving you the answer 2.2201767077906709 in both cases, for the reasons explained well in the answer above: halving every error multiplies every weight by the same factor of 4, which cancels in the normalisation.
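Putting the pieces together, here is a self-contained sketch using the numbers from the question (the arrays are copied from the original post):

```python
import numpy as np

bias_2e13 = np.array([1.7277990734072355, 1.9718263893212737, 2.469657573252167,
                      2.869022991373125, 3.314720313010104, 4.232269039271717])
bias_error_2e13 = np.array([0.13271387, 0.06842465, 0.06937965,
                            0.23886647, 0.30458249, 0.57906816])
error_half = bias_error_2e13 / 2.0

# Inverse-variance weights: w_i = 1 / sigma_i**2
wts_2e13 = 1.0 / bias_error_2e13**2
wts_half = 1.0 / error_half**2  # exactly 4 * wts_2e13, so same relative weights

test = np.average(bias_2e13, weights=wts_2e13)
test_2 = np.average(bias_2e13, weights=wts_half)
print(test, test_2)  # identical: the factor of 4 cancels in the normalisation
```

Note that the properly weighted average (about 2.22) is much lower than the unweighted-by-variance result of 3.36 in the question, because inverse-variance weighting favours the precise low values over the noisy high ones.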

Answered Oct 31 '22 by anon