Calculating Mean of arrays with different lengths

Is it possible to calculate the mean of multiple arrays, when they may have different lengths? I am using numpy. So let's say I have:

numpy.array([[1, 2, 3, 4, 8],    [3, 4, 5, 6, 0]])
numpy.array([[5, 6, 7, 8, 7, 8], [7, 8, 9, 10, 11, 12]])
numpy.array([[1, 2, 3, 4],       [5, 6, 7, 8]])

Now I want to calculate the mean, but ignoring elements that are 'missing'. (Naturally, I cannot just append zeros, as this would mess up the mean.)

Is there a way to do this without iterating through the arrays?

PS. These arrays are all 2-D, and within each array both rows always have the same number of elements, i.e. the 1st array is 5 and 5, the 2nd is 6 and 6, the 3rd is 4 and 4.

An example:

np.array([[1, 2],    [3, 4]])
np.array([[1, 2, 3], [3, 4, 5]])
np.array([[7],       [8]])

This must give

(1+1+7)/3  (2+2)/2   3/1
(3+3+8)/3  (4+4)/2   5/1

And graphically:

[1, 2]    [1, 2, 3]    [7]
[3, 4]    [3, 4, 5]    [8]

Now imagine that these 2-D arrays are placed on top of each other, aligned at the top-left corner, with every array that covers a given coordinate contributing to that coordinate's mean.

asked Apr 07 '12 20:04 by hjweide


1 Answer

numpy.ma.mean allows you to compute the mean of non-masked array elements. However, to use numpy.ma.mean, you have to first combine your three numpy arrays into one masked array:

import numpy as np
x = np.array([[1, 2], [3, 4]])
y = np.array([[1, 2, 3], [3, 4, 5]])
z = np.array([[7], [8]])

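# Allocate a masked array of shape (max rows, max columns, number of arrays).
arr = np.ma.empty((2,3,3))
# Mask every cell; assigning data below unmasks only the cells that get filled.
arr.mask = True
arr[:x.shape[0],:x.shape[1],0] = x
arr[:y.shape[0],:y.shape[1],1] = y
arr[:z.shape[0],:z.shape[1],2] = z
# Masked cells are ignored when averaging along the stacking axis.
print(arr.mean(axis = 2))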

yields

[[3.0 2.0 3.0]
 [4.66666666667 4.0 5.0]]
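
The same idea can be generalized to any number of 2-D arrays. The following is only a minimal sketch: the helper name mean_of_ragged is hypothetical (it is not part of numpy), and it assumes every input is 2-D and aligned at the top-left corner. It still loops once per array to copy the data in, but not over individual elements.

```python
import numpy as np

def mean_of_ragged(arrays):
    """Element-wise mean of 2-D arrays with differing shapes.

    Hypothetical helper for illustration; assumes all inputs are 2-D
    and overlap starting from the top-left corner.
    """
    rows = max(a.shape[0] for a in arrays)
    cols = max(a.shape[1] for a in arrays)
    # Start with every cell masked; only cells that receive data are unmasked.
    stacked = np.ma.masked_all((rows, cols, len(arrays)), dtype=float)
    for k, a in enumerate(arrays):
        stacked[:a.shape[0], :a.shape[1], k] = a
    # Masked (missing) cells do not contribute to the mean.
    return stacked.mean(axis=2)

x = np.array([[1, 2], [3, 4]])
y = np.array([[1, 2, 3], [3, 4, 5]])
z = np.array([[7], [8]])

result = mean_of_ragged([x, y, z])
print(result)
```

For the three example arrays this should reproduce the values shown above. A coordinate that no array covers would stay masked in the result rather than becoming 0.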

answered Oct 09 '22 03:10 by unutbu