Python Numpy nonzero

Question

I have a NumPy array of shape (31641600, 2) that contains some, possibly many, zero values.

Let's call the array X.

Doing:

print len(X)
>>> 31641600

But then doing:

X = X[np.nonzero(X)]
print len(X)
>>> 31919809

I don't understand why the second length is larger. The documentation says that indexing this way returns only the non-zero values, so I expected the length of X to be smaller.

Any ideas? Thank you.

Lev Levitsky · Accepted Answer

This happens because len(X) returns only X's length along the first axis. When you do

X = X[np.nonzero(X)]

you get a 1D array containing every non-zero element of X, so if fewer than 50% of the elements of X are zero, len(X) will increase.

Consider:

In [1]: import numpy as np

In [2]: X = np.zeros((42, 2))

In [3]: X[:, 0] = 1

In [4]: X[0, 1] = 1

In [5]: len(X)
Out[5]: 42

In [6]: len(X[np.nonzero(X)])
Out[6]: 43

That's because X[np.nonzero(X)] is a 1D array of 43 ones:

In [7]: X[np.nonzero(X)].shape
Out[7]: (43,)
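
To see why the result is flattened, it may help to look at what np.nonzero returns: a tuple with one index array per axis. Indexing X with that tuple picks out individual elements rather than whole rows. The following is a minimal sketch on the same 42x2 example; the names rows, cols and flat are only illustrative.

```python
import numpy as np

X = np.zeros((42, 2))
X[:, 0] = 1
X[0, 1] = 1

# np.nonzero returns one index array per axis (here: rows and columns).
rows, cols = np.nonzero(X)

# Indexing with both arrays selects single elements, so the result is 1D.
flat = X[rows, cols]  # equivalent to X[np.nonzero(X)]

print(len(X))               # number of rows in X
print(flat.shape)           # one entry per non-zero element
print(np.count_nonzero(X))  # same count, without building the index arrays
```

If only the number of non-zero elements is needed, np.count_nonzero avoids building the index arrays at all.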

Update in response to comment: if in fact you want all pairs where the first element is non-zero, you can do:

X = X[X[:, 0] != 0]
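
A boolean mask with one entry per row keeps the array two-dimensional, which is the key difference from indexing with np.nonzero. Here is a small sketch on made-up data; the second variant, which keeps rows with at least one non-zero element, is an assumption about what might also be wanted and is not part of the original question.

```python
import numpy as np

# Small illustrative array; the values are made up for this example.
X = np.array([[1.0, 0.0],
              [0.0, 0.0],
              [0.0, 3.0],
              [2.0, 5.0]])

# Keep rows whose first element is non-zero (the approach from the answer).
first_nonzero = X[X[:, 0] != 0]

# Hypothetical variant: keep rows where at least one element is non-zero.
any_nonzero = X[np.any(X != 0, axis=1)]

print(first_nonzero.shape)
print(any_nonzero.shape)
```

In both cases the mask has length len(X), so whole rows are selected and the second dimension of 2 is preserved.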

Tags:

python

arrays

numpy

Claudiu S
