
python: Combined masking in numpy

Tags: python, nan, numpy

In a numpy array I want to replace all nan and inf values with a fixed number. Can I do that in one step to save computation time (the arrays are really big)?

import numpy as np

a = np.arange(10.0)
a[3] = np.nan
a[5] = np.inf
a[7] = -np.inf
# a: [  0.   1.   2.  nan   4.  inf   6. -inf   8.   9.]

a[np.isnan(a)] = -999
a[np.isinf(a)] = -999
# a: [  0.   1.   2.  -999.   4.  -999.   6. -999.   8.   9.]

The code above works fine. But I am looking for something like:

a[np.isnan(a) or np.isinf(a)] = -999

This does not work, and I can see why. I just think it might be better if every item of a were only checked once.

asked Dec 11 '25 17:12 by offeltoffel


1 Answer

NumPy comes with its own vectorized version of or:

a[np.logical_or(np.isnan(a), np.isinf(a))] = -999
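To make the difference from the plain or in the question concrete, here is a minimal sketch reusing the question's array. The variable names (or_failed, mask_func, mask_op) are only illustrative and not part of the original post.

```python
import numpy as np

a = np.arange(10.0)
a[3] = np.nan
a[5] = np.inf
a[7] = -np.inf

# Python's `or` needs a single truth value, which a multi-element
# array cannot provide, so NumPy raises ValueError here.
try:
    a[np.isnan(a) or np.isinf(a)] = -999
    or_failed = False
except ValueError:
    or_failed = True

# Element-wise alternatives: np.logical_or, or the | operator
# applied to the two boolean arrays.
mask_func = np.logical_or(np.isnan(a), np.isinf(a))
mask_op = np.isnan(a) | np.isinf(a)

# Replace nan, inf and -inf in a single assignment.
a[mask_func] = -999
```

Both element-wise forms build the same boolean mask, so either can be used as the index.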

While the np.logical_or version is clear and easy to understand, there is a faster one, which is a bit odd:

a[np.isnan(a-a)] = -999

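The idea behind this is that np.inf - np.inf = np.nan (and np.nan - np.nan is np.nan as well), so a - a is nan exactly where a is nan, inf or -inf, and 0 everywhere else. Note that the subtraction triggers a RuntimeWarning when a contains infinities.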
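A small sketch of that idea, again on the question's array. It also compares the result with np.isfinite, which is not mentioned in the original answer; I am adding it here only as a more readable way to get the same mask. The names diff, mask_trick and mask_isfinite are illustrative.

```python
import numpy as np

a = np.arange(10.0)
a[3] = np.nan
a[5] = np.inf
a[7] = -np.inf

# inf - inf is an invalid floating-point operation, so NumPy emits a
# RuntimeWarning; np.errstate silences it inside this block only.
with np.errstate(invalid="ignore"):
    diff = a - a  # 0.0 for finite entries, nan for nan and +/-inf

mask_trick = np.isnan(diff)

# np.isfinite is False for nan, inf and -inf, so its negation
# should give the same mask in a single pass over a.
mask_isfinite = ~np.isfinite(a)

same_mask = np.array_equal(mask_trick, mask_isfinite)
a[mask_isfinite] = -999
```

If readability matters more than squeezing out the last microsecond, the ~np.isfinite(a) form is probably the easier one to maintain.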

%timeit a[np.isnan(a-a)] = -999
# 100000 loops, best of 3: 11.7 µs per loop
%timeit a[np.isnan(a) | np.isinf(a)] = -999
# 10000 loops, best of 3: 51.4 µs per loop
%timeit a[np.logical_or(np.isnan(a), np.isinf(a))] = -999
# 10000 loops, best of 3: 51.4 µs per loop

Hence the | and np.logical_or versions seem to be internally equivalent.
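One caveat about the timings above: the assignment modifies a in place, so after the first loop there is nothing left to replace. Below is a rough sketch of how the comparison could be rerun outside IPython with the standard timeit module, timing only the mask construction on unchanged input. The array size, the random data and the function names are my own assumptions, and the numbers will depend on your machine and NumPy version.

```python
import timeit

import numpy as np

# Hypothetical test data: 100,000 floats with some nan and inf mixed in.
rng = np.random.default_rng(0)
base = rng.random(100_000)
base[::10] = np.nan
base[5::10] = np.inf


def mask_trick(x):
    # nan - nan and inf - inf are both nan; finite - finite is 0.0
    with np.errstate(invalid="ignore"):
        return np.isnan(x - x)


def mask_pipe(x):
    return np.isnan(x) | np.isinf(x)


def mask_logical_or(x):
    return np.logical_or(np.isnan(x), np.isinf(x))


# Best of 3 repeats, 20 calls each; base is never modified,
# so every call sees the same input.
timings = {
    f.__name__: min(timeit.repeat(lambda: f(base), number=20, repeat=3))
    for f in (mask_trick, mask_pipe, mask_logical_or)
}

for name, seconds in timings.items():
    print(f"{name}: {seconds:.6f} s for 20 calls")
```

All three functions return the same mask, so whichever is fastest on your data can be dropped in as the index.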

answered Dec 13 '25 07:12 by Jürg Merlin Spaak


