
Why is creating a masked numpy array so slow with mask=None or mask=0

Today I profiled a function and found a (at least to me) weird bottleneck: creating a masked array with mask=None or mask=0, to initialize an all-zeros mask with the same shape as the data, is very slow:

>>> import numpy as np
>>> data = np.ones((100, 100, 100))

>>> %timeit ma_array = np.ma.array(data, mask=None, copy=False)
1 loop, best of 3: 803 ms per loop

>>> %timeit ma_array = np.ma.array(data, mask=0, copy=False)
1 loop, best of 3: 807 ms per loop

On the other hand, using mask=False or creating the mask by hand is much faster:

>>> %timeit ma_array = np.ma.array(data, mask=False, copy=False)
1000 loops, best of 3: 438 µs per loop

>>> %timeit ma_array = np.ma.array(data, mask=np.zeros(data.shape, dtype=bool), copy=False)
1000 loops, best of 3: 453 µs per loop

Why is giving None or 0 as the mask parameter almost 2000 times slower than False or np.zeros(data.shape)? The function docs only say that it:

Must be convertible to an array of booleans with the same shape as data. True indicates a masked (i.e. invalid) data.

I'm using Python 3.5 and NumPy 1.11.0 on Windows 10.

asked May 26 '16 by MSeifert


2 Answers

mask=False (like mask=True) is special-cased in the NumPy 1.11.0 source code:

if mask is True and mdtype == MaskType:
    mask = np.ones(_data.shape, dtype=mdtype)
elif mask is False and mdtype == MaskType:
    mask = np.zeros(_data.shape, dtype=mdtype)

mask=0 and mask=None don't hit either special case, so they take the slow path: the argument is first converted to a 0-dimensional boolean array, which is then expanded to data.shape via np.resize.
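The cost of that slow path can be reproduced outside np.ma entirely. A minimal sketch (the variable names here are illustrative, not NumPy internals):

```python
import numpy as np

data = np.ones((100, 100, 100))

# Slow path taken for mask=None / mask=0: the scalar mask becomes a
# 0-dimensional boolean array, which np.resize expands to data.shape
# by repeatedly concatenating copies of the flattened array.
mask_scalar = np.array(0, dtype=bool)        # 0-d array(False)
mask_slow = np.resize(mask_scalar, data.shape)

# Fast path taken for mask=False: a single zero-filled allocation.
mask_fast = np.zeros(data.shape, dtype=bool)

# Both produce the same all-False, full-shape mask.
assert mask_slow.shape == mask_fast.shape == data.shape
assert not mask_slow.any() and not mask_fast.any()
```

Timing the `np.resize` line versus the `np.zeros` line shows roughly the same three-orders-of-magnitude gap as in the question.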

answered Sep 20 '22 by user2357112 supports Monica


I believe @user2357112 has the explanation. I profiled both cases, here are the results:

In [14]: q.run('q.np.ma.array(q.data, mask=None, copy=False)')
         49 function calls in 0.161 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        3    0.000    0.000    0.000    0.000 :0(array)
        1    0.154    0.154    0.154    0.154 :0(concatenate)
        1    0.000    0.000    0.161    0.161 :0(exec)
       11    0.000    0.000    0.000    0.000 :0(getattr)
        1    0.000    0.000    0.000    0.000 :0(hasattr)
        7    0.000    0.000    0.000    0.000 :0(isinstance)
        1    0.000    0.000    0.000    0.000 :0(len)
        1    0.000    0.000    0.000    0.000 :0(ravel)
        1    0.000    0.000    0.000    0.000 :0(reduce)
        1    0.000    0.000    0.000    0.000 :0(reshape)
        1    0.000    0.000    0.000    0.000 :0(setprofile)
        5    0.000    0.000    0.000    0.000 :0(update)
        1    0.000    0.000    0.161    0.161 <string>:1(<module>)
        1    0.000    0.000    0.161    0.161 core.py:2704(__new__)
        1    0.000    0.000    0.000    0.000 core.py:2838(_update_from)
        1    0.000    0.000    0.000    0.000 core.py:2864(__array_finalize__)
        5    0.000    0.000    0.000    0.000 core.py:3264(__setattr__)
        1    0.000    0.000    0.161    0.161 core.py:6119(array)
        1    0.007    0.007    0.161    0.161 fromnumeric.py:1097(resize)
        1    0.000    0.000    0.000    0.000 fromnumeric.py:128(reshape)
        1    0.000    0.000    0.000    0.000 fromnumeric.py:1383(ravel)
        1    0.000    0.000    0.000    0.000 numeric.py:484(asanyarray)
        0    0.000             0.000          profile:0(profiler)
        1    0.000    0.000    0.161    0.161 profile:0(q.np.ma.array(q.data, mask=None, copy=False))

In [15]: q.run('q.np.ma.array(q.data, mask=False, copy=False)')
         37 function calls in 0.000 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000    0.000    0.000 :0(array)
        1    0.000    0.000    0.000    0.000 :0(exec)
       11    0.000    0.000    0.000    0.000 :0(getattr)
        1    0.000    0.000    0.000    0.000 :0(hasattr)
        5    0.000    0.000    0.000    0.000 :0(isinstance)
        1    0.000    0.000    0.000    0.000 :0(setprofile)
        5    0.000    0.000    0.000    0.000 :0(update)
        1    0.000    0.000    0.000    0.000 :0(zeros)
        1    0.000    0.000    0.000    0.000 <string>:1(<module>)
        1    0.000    0.000    0.000    0.000 core.py:2704(__new__)
        1    0.000    0.000    0.000    0.000 core.py:2838(_update_from)
        1    0.000    0.000    0.000    0.000 core.py:2864(__array_finalize__)
        5    0.000    0.000    0.000    0.000 core.py:3264(__setattr__)
        1    0.000    0.000    0.000    0.000 core.py:6119(array)
        0    0.000             0.000          profile:0(profiler)
        1    0.000    0.000    0.000    0.000 profile:0(q.np.ma.array(q.data, mask=False, copy=False))

So the np.concatenate call made inside np.resize (0.154 s of the 0.161 s total) accounts for nearly all of the slowdown; the mask=False path never calls resize at all.
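In practice, then, passing mask=False or a preallocated boolean array sidesteps the resize entirely while producing the same masked array. A quick sanity check of that equivalence:

```python
import numpy as np

data = np.ones((100, 100, 100))

# Special-cased: np.ma allocates the full-shape mask with one np.zeros call.
a = np.ma.array(data, mask=False, copy=False)

# Explicit preallocated mask: equivalent result, also fast.
b = np.ma.array(data, mask=np.zeros(data.shape, dtype=bool), copy=False)

assert a.mask.shape == data.shape
assert not a.mask.any() and not b.mask.any()
```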

answered Sep 22 '22 by hilberts_drinking_problem