I have a 2D array which looks like this:
array = [[23, 89, 4, 3, 0], [12, 73, 3, 5, 1], [7, 9, 12, 11, 0]]
The last column is always 0 or 1 in every row. I want to calculate two means for column 0: one over the rows where the last column is 0, and one over the rows where the last column is 1.
For example, for the sample array above:
mean 1: 15 (mean of column 0 over the rows where the last column is 0)
mean 2: 12 (mean of column 0 over the rows where the last column is 1)
I have tried this (where train is my input array's name):
mean_c1_0 = np.mean(train[:, 0])
variance_c1_0 = np.var(train[:, 0])
This gives me the mean and variance over all the values in column 0.
I could add a for loop and a couple of if conditions that check the last column and only then accumulate the corresponding value from column 0, but I am looking for a more efficient approach. Since I am new to Python, I was hoping there is a numpy function that can do this.
Can you point me to any such documentation?
You can use numpy's boolean array indexing to filter the rows (see "How can I slice a numpy array by the value of the ith field?") and take the mean of what is left. No loops needed.
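import numpy
x = numpy.array([[23, 89, 4, 3, 0], [12, 73, 3, 5, 1], [7, 9, 12, 11, 0]])
# Mean of column 0 over the rows whose last column is 1 (12.0 here)
numpy.mean(x[x[:, -1] == 1][:, 0])
# Mean of column 0 over the rows whose last column is 0 (15.0 here)
numpy.mean(x[x[:, -1] == 0][:, 0])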
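Since the question also asks about variance, here is a minimal sketch of the same boolean-mask idea applied to both statistics. The names `labels` and `col0` are illustrative only, and `train` is assumed to hold the sample array from the question. Note that `var()` returns the population variance by default (`ddof=0`).

```python
import numpy as np

# Assumed input: the sample array from the question.
train = np.array([[23, 89, 4, 3, 0],
                  [12, 73, 3, 5, 1],
                  [7, 9, 12, 11, 0]])

labels = train[:, -1]  # last column, always 0 or 1
col0 = train[:, 0]     # the column we want statistics for

# Boolean masks pick out the matching rows without an explicit loop.
mean_c1_0 = col0[labels == 0].mean()
variance_c1_0 = col0[labels == 0].var()  # population variance (ddof=0)
mean_c1_1 = col0[labels == 1].mean()
variance_c1_1 = col0[labels == 1].var()

print(mean_c1_0, variance_c1_0)  # 15.0 64.0
print(mean_c1_1, variance_c1_1)  # 12.0 0.0
```

Pass `ddof=1` to `var()` if you want the sample variance instead; with a single matching row that yields `nan`.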
You can try this, using np.where to pick the matching rows and then selecting column 0:
mean_of_zeros = np.mean(numpy_array[np.where(numpy_array[:, -1] == 0)][:, 0])
mean_of_ones = np.mean(numpy_array[np.where(numpy_array[:, -1] == 1)][:, 0])
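If you need the statistics for every feature column rather than just column 0, the same filtering can be done once per label value. The following is only a sketch under that assumption; `features` and `stats` are hypothetical names, and the loop runs over the distinct labels (two here), not over the rows.

```python
import numpy as np

# Assumed input: the sample array from the question.
numpy_array = np.array([[23, 89, 4, 3, 0],
                        [12, 73, 3, 5, 1],
                        [7, 9, 12, 11, 0]])

labels = numpy_array[:, -1]     # last column holds the 0/1 label
features = numpy_array[:, :-1]  # every column except the label

stats = {}
for label in np.unique(labels):
    rows = features[labels == label]  # rows belonging to this label
    # axis=0 gives one mean and one variance per column
    stats[int(label)] = (rows.mean(axis=0), rows.var(axis=0))

print(stats[0][0])  # per-column means for label 0: [15. 49.  8.  7.]
print(stats[1][0])  # per-column means for label 1: [12. 73.  3.  5.]
```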