
Select rows based on a condition in numpy/python [duplicate]

I generate a 4x4 random matrix drawn from a normal distribution, and then I need to select the rows whose sum is greater than 0.

When I write the code using 2D indexing, the output doesn't seem right:

a = np.random.randn(4, 4)
a[a[:, 0] > 0]

What am I missing?

Andrewgorn asked Oct 23 '25 09:10



1 Answer

import numpy as np
a = np.random.randn(4, 4)
print(a)

which in this case gives:

[[-0.73576686 -0.34940161 -0.87025271 -0.61287421]
 [ 1.2738813  -0.3855836  -1.55570664  0.43841268]
 [-1.63614248  1.4127681   0.37276815 -0.35188628]
 [ 0.18570751 -0.31197874 -2.05487768 -0.05619158]]

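and then apply the condition to the row sums:

a[np.sum(a, axis=1) > 0, :]

For the matrix printed above, every row sums to a negative value, so the result here is an empty array of shape (0, 4).

Edit: For a bit of explanation, np.sum(a, axis=1) > 0 creates a 1D Boolean mask with one entry per row. axis=1 sums across the columns of each row, whereas axis=0 would give one sum per column; on a square matrix the column mask still has a valid length, so it silently selects the wrong rows. We then apply the mask to the rows of a using Boolean indexing as a[np.sum(a, axis=1) > 0, :].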
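To make the role of the axis argument concrete, here is a minimal sketch using a small hand-written 3x4 matrix. The matrix b and its values are assumptions chosen for illustration (they are not from the question), so that the result is reproducible and the row and column counts differ.

```python
import numpy as np

# Hypothetical fixed matrix (not from the original post), 3 rows x 4 columns.
b = np.array([
    [ 1.0, -2.0,  0.5, -1.0 ],   # row sum = -1.5
    [ 2.0,  1.0, -0.5,  0.5 ],   # row sum =  3.0
    [-1.0,  0.5,  1.0,  0.25],   # row sum =  0.75
])

row_sums = b.sum(axis=1)   # one value per row, shape (3,)
col_sums = b.sum(axis=0)   # one value per column, shape (4,)

mask = row_sums > 0        # 1D Boolean mask over the rows
selected = b[mask]         # same as b[mask, :]; keeps rows with a positive sum

print(row_sums)
print(mask)
print(selected)
```

Here the mask keeps the second and third rows. Because b is not square, indexing the rows with col_sums > 0 would raise an IndexError (the mask has 4 entries but there are only 3 rows), which is a quick way to notice that the wrong axis was used.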

havingaball answered Oct 27 '25 02:10
