
Using numpy to calculate mean?

Tags:

python

numpy

I have a 2D array which looks like this:

array = [[23, 89, 4, 3, 0], [12, 73, 3, 5, 1], [7, 9, 12, 11, 0]]

The last column of every row is always 0 or 1. I want to calculate two means for column 0: one over the rows where the last column is 0, and one over the rows where the last column is 1.

For example, for the sample array above: mean 1 is 15 (the mean of column 0 over the rows where the last column is 0), and mean 2 is 12 (the mean of column 0 over the rows where the last column is 1).

I have tried this (where train is my input array's name):

mean_c1_0 = np.mean(train[:, 0])
variance_c1_0 = np.var(train[:, 0])

This gets me the mean and variance of all the values in column 0.

I could always add another for loop and a couple of if conditions to check the last column and only then include the corresponding value from column 0, but I am looking for a more efficient approach. Since I am new to Python, I was hoping there is a numpy function that can do this.

Can you point me to any such documentation?

R_Moose asked Jan 26 '23 18:01


2 Answers

You can use numpy's array filtering (see "How can I slice a numpy array by the value of the ith field?"): build a boolean mask from the last column, keep only the matching rows, and take the mean of column 0. No loops needed.

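import numpy
x = numpy.array([[23, 89, 4, 3, 0], [12, 73, 3, 5, 1], [7, 9, 12, 11, 0]])
# Mean of column 0 over the rows whose last column is 1
numpy.mean(x[x[:, -1] == 1][:, 0])
# Mean of column 0 over the rows whose last column is 0
numpy.mean(x[x[:, -1] == 0][:, 0])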
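As a rough sketch of the same masking idea, the snippet below wraps it in a small helper that returns both the mean and the variance of one column for the rows matching a given label, since the question asks about both. The helper name `column_stats_by_label` and its parameters are hypothetical, introduced only for illustration. The variance is the population variance (NumPy's default `ddof=0`), matching the `np.var` call in the question.

```python
import numpy as np

def column_stats_by_label(data, label, value_col=0, label_col=-1):
    """Hypothetical helper: mean and variance of value_col over rows where label_col == label."""
    data = np.asarray(data)
    mask = data[:, label_col] == label   # boolean mask, one entry per row
    values = data[mask, value_col]       # column values from the matching rows only
    return values.mean(), values.var()   # var uses ddof=0 (population variance)

train = np.array([[23, 89, 4, 3, 0],
                  [12, 73, 3, 5, 1],
                  [7, 9, 12, 11, 0]])

mean_0, var_0 = column_stats_by_label(train, 0)
mean_1, var_1 = column_stats_by_label(train, 1)
print(mean_0, var_0)  # 15.0 64.0
print(mean_1, var_1)  # 12.0 0.0
```

If no row carries the requested label, the selection is empty and NumPy returns `nan` with a runtime warning, so a caller may want to check for that case.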
hoodakaushal answered Feb 08 '23 23:02


You can try this.

# np.where picks the rows; [:, 0] then restricts the mean to column 0
mean_of_zeros = np.mean(numpy_array[np.where(numpy_array[:, -1] == 0)][:, 0])

mean_of_ones = np.mean(numpy_array[np.where(numpy_array[:, -1] == 1)][:, 0])
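One thing worth checking with this approach is which values end up in the mean. Indexing with `np.where` on the last column selects whole rows, so column 0 has to be picked out before averaging, otherwise every element of those rows is included. A small sketch, assuming the sample array from the question (the intermediate variable names are illustrative):

```python
import numpy as np

numpy_array = np.array([[23, 89, 4, 3, 0],
                        [12, 73, 3, 5, 1],
                        [7, 9, 12, 11, 0]])

# Full rows whose last column is 0; the result has shape (2, 5)
rows_with_zero = numpy_array[np.where(numpy_array[:, -1] == 0)]

mean_all_columns = np.mean(rows_with_zero)      # averages every element of those rows
mean_column_0 = np.mean(rows_with_zero[:, 0])   # averages column 0 only

print(rows_with_zero.shape)
print(mean_all_columns)
print(mean_column_0)
```

For this sample the two results differ (about 15.8 over all columns versus 15.0 for column 0), and only the second matches the mean asked for in the question.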
bumblebee answered Feb 09 '23 00:02