Python vectorizing nested for loops

I'd appreciate some help in finding and understanding a pythonic way to optimize the following array manipulations in nested for loops:

    import numpy
    from scipy.spatial import distance  # assumed source of distance.euclidean

    def _func(a, b, radius):
        "Return 1 if a is closer than radius to b, otherwise return 0"
        if distance.euclidean(a, b) < radius:
            return 1
        else:
            return 0

    def _make_mask(volume, roi, radius):
        mask = numpy.zeros(volume.shape)
        for x in range(volume.shape[0]):
            for y in range(volume.shape[1]):
                for z in range(volume.shape[2]):
                    mask[x, y, z] = _func((x, y, z), roi, radius)
        return mask

Here volume and roi are both ndarrays, with volume.shape == (182, 218, 200) and roi.shape == (3,), and radius is an int.

asked Sep 23 '16 18:09

Kambiz

2 Answers

Approach #1

Here's a vectorized approach -

    m,n,r = volume.shape
    x,y,z = np.mgrid[0:m,0:n,0:r]
    X = x - roi[0]
    Y = y - roi[1]
    Z = z - roi[2]
    mask = X**2 + Y**2 + Z**2 < radius**2

Possible improvement: we can probably speed up the last step with the numexpr module -

    import numexpr as ne

    mask = ne.evaluate('X**2 + Y**2 + Z**2 < radius**2')
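A quick way to convince yourself that Approach #1 matches the original loop is to compare both on a tiny volume. The following is only a sketch: the helper names `loop_mask` and `mgrid_mask`, the toy shape, and the integer-valued `roi` are assumptions made for illustration, and the loop uses plain NumPy for the distance instead of `distance.euclidean`.

```python
import numpy as np

def loop_mask(shape, roi, radius):
    """Reference version: the triple loop from the question (hypothetical helper)."""
    mask = np.zeros(shape)
    for x in range(shape[0]):
        for y in range(shape[1]):
            for z in range(shape[2]):
                # Euclidean distance from voxel (x, y, z) to roi
                d = np.sqrt(((np.array((x, y, z)) - roi) ** 2).sum())
                mask[x, y, z] = 1 if d < radius else 0
    return mask

def mgrid_mask(shape, roi, radius):
    """Approach #1: dense coordinate grids from np.mgrid (hypothetical helper)."""
    m, n, r = shape
    x, y, z = np.mgrid[0:m, 0:n, 0:r]
    # Compare squared distances so no square root is needed
    return (x - roi[0]) ** 2 + (y - roi[1]) ** 2 + (z - roi[2]) ** 2 < radius ** 2

shape = (6, 7, 8)                  # toy size, small enough for the loop
roi = np.array([2.0, 3.0, 4.0])    # illustrative centre
radius = 2

expected = loop_mask(shape, roi, radius)
result = mgrid_mask(shape, roi, radius)

# The loop returns floats (0.0 / 1.0); the vectorized version returns booleans.
same = np.array_equal(expected.astype(bool), result)
print(same, result.dtype)
```

Note the dtype difference: if downstream code needs a 0/1 float array like the original, `result.astype(float)` should do.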

Approach #2

We can also build just the three 1D ranges corresponding to the shape parameters and subtract the matching element of roi from each on the fly, without actually creating the full meshes as done earlier with np.mgrid. Broadcasting then combines the three 1D results into the 3D array, which is where the efficiency comes from. The implementation would look like this -

    m,n,r = volume.shape
    vals = ((np.arange(m)-roi[0])**2)[:,None,None] + \
           ((np.arange(n)-roi[1])**2)[:,None] + ((np.arange(r)-roi[2])**2)
    mask = vals < radius**2

Simplified version: thanks to @Bi Rico for suggesting an improvement here. We can use np.ogrid to perform those operations a bit more concisely, like so -

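    m,n,r = volume.shape
    x,y,z = np.ogrid[0:m,0:n,0:r]
    # Subtract roi per axis: `np.ogrid[...] - roi` needs NumPy to coerce the
    # differently shaped open grids into one array, which recent versions reject.
    mask = ((x-roi[0])**2 + (y-roi[1])**2 + (z-roi[2])**2) < radius**2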
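To see why Approach #2 avoids the meshes, it can help to print the shapes of the pieces before they are added. This is a small sketch with made-up sizes; the names `dx`, `dy`, `dz`, `dense` and `sparse` are illustrative and not part of the answer.

```python
import numpy as np

m, n, r = 4, 5, 6                  # toy sizes for illustration
roi = np.array([1.0, 2.0, 3.0])

# One squared-offset vector per axis, reshaped so the axes line up
dx = ((np.arange(m) - roi[0]) ** 2)[:, None, None]   # shape (m, 1, 1)
dy = ((np.arange(n) - roi[1]) ** 2)[:, None]         # shape (n, 1)
dz = (np.arange(r) - roi[2]) ** 2                    # shape (r,)

# Broadcasting expands the sum to (m, n, r) without building coordinate grids
vals = dx + dy + dz
print(dx.shape, dy.shape, dz.shape, vals.shape)

# np.mgrid stores every coordinate of every voxel; np.ogrid stores one axis each
dense = np.mgrid[0:m, 0:n, 0:r]
sparse = np.ogrid[0:m, 0:n, 0:r]
dense_elems = dense.size
sparse_elems = sum(a.size for a in sparse)
print(dense_elems, sparse_elems)
```

The only full-size array here is `vals` itself, whereas the mgrid route allocates three full-size coordinate arrays first and then three more for the offsets.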

Runtime test

Function definitions -

    import numpy as np
    import numexpr as ne

    def vectorized_app1(volume, roi, radius):
        m,n,r = volume.shape
        x,y,z = np.mgrid[0:m,0:n,0:r]
        X = x - roi[0]
        Y = y - roi[1]
        Z = z - roi[2]
        return X**2 + Y**2 + Z**2 < radius**2

    def vectorized_app1_improved(volume, roi, radius):
        m,n,r = volume.shape
        x,y,z = np.mgrid[0:m,0:n,0:r]
        X = x - roi[0]
        Y = y - roi[1]
        Z = z - roi[2]
        return ne.evaluate('X**2 + Y**2 + Z**2 < radius**2')

    def vectorized_app2(volume, roi, radius):
        m,n,r = volume.shape
        vals = ((np.arange(m)-roi[0])**2)[:,None,None] + \
               ((np.arange(n)-roi[1])**2)[:,None] + ((np.arange(r)-roi[2])**2)
        return vals < radius**2

    def vectorized_app2_simplified(volume, roi, radius):
        m,n,r = volume.shape
        x,y,z = np.ogrid[0:m,0:n,0:r]
        # roi is subtracted per axis; `np.ogrid[...] - roi` is rejected by recent NumPy
        return ((x-roi[0])**2 + (y-roi[1])**2 + (z-roi[2])**2) < radius**2

Timings -

    In [106]: # Setup input arrays
         ...: volume = np.random.rand(90,110,100) # Half of original input sizes
         ...: roi = np.random.rand(3)
         ...: radius = 3.4
         ...:

    In [107]: %timeit _make_mask(volume, roi, radius)
    1 loops, best of 3: 41.4 s per loop

    In [108]: %timeit vectorized_app1(volume, roi, radius)
    10 loops, best of 3: 62.3 ms per loop

    In [109]: %timeit vectorized_app1_improved(volume, roi, radius)
    10 loops, best of 3: 47 ms per loop

    In [110]: %timeit vectorized_app2(volume, roi, radius)
    100 loops, best of 3: 4.26 ms per loop

    In [139]: %timeit vectorized_app2_simplified(volume, roi, radius)
    100 loops, best of 3: 4.36 ms per loop

So, as always, broadcasting shows its magic: an almost 10,000x speedup over the original code (41.4 s vs. 4.26 ms), and more than 10x over creating the meshes (62.3 ms vs. 4.26 ms), thanks to the on-the-fly broadcasted operations!
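If you are not working in IPython, the mesh-versus-broadcast comparison can be approximated with the standard `timeit` module. The sketch below is an assumption-laden stand-in for the session above: `mesh_mask` and `broadcast_mask` are illustrative names, the slow loop version is left out, and the numbers you get will depend on your machine and NumPy version.

```python
import timeit
import numpy as np

def mesh_mask(shape, roi, radius):
    """Dense grids via np.mgrid (hypothetical helper)."""
    m, n, r = shape
    x, y, z = np.mgrid[0:m, 0:n, 0:r]
    return (x - roi[0]) ** 2 + (y - roi[1]) ** 2 + (z - roi[2]) ** 2 < radius ** 2

def broadcast_mask(shape, roi, radius):
    """Open grids via np.ogrid, combined by broadcasting (hypothetical helper)."""
    m, n, r = shape
    x, y, z = np.ogrid[0:m, 0:n, 0:r]
    return (x - roi[0]) ** 2 + (y - roi[1]) ** 2 + (z - roi[2]) ** 2 < radius ** 2

shape = (90, 110, 100)             # same sizes as the timing session
rng = np.random.default_rng(0)     # fixed seed, chosen arbitrarily
roi = rng.random(3)
radius = 3.4

# Check that both variants agree before comparing their speed
agree = np.array_equal(mesh_mask(shape, roi, radius),
                       broadcast_mask(shape, roi, radius))

# Best of 3 repeats, 3 calls each, reported per call
t_mesh = min(timeit.repeat(lambda: mesh_mask(shape, roi, radius),
                           number=3, repeat=3)) / 3
t_bcast = min(timeit.repeat(lambda: broadcast_mask(shape, roi, radius),
                            number=3, repeat=3)) / 3
print(f"agree: {agree}, mesh: {t_mesh * 1e3:.2f} ms, broadcast: {t_bcast * 1e3:.2f} ms")
```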

answered Sep 19 '22 02:09

Divakar

Say you first build an xyz array of all the voxel coordinates:

    import itertools

    xyz = [np.array(p) for p in itertools.product(
        range(volume.shape[0]), range(volume.shape[1]), range(volume.shape[2]))]

Now, using numpy.linalg.norm,

np.linalg.norm(xyz - roi, axis=1) < radius

checks whether the distance for each tuple from roi is smaller than radius.

Finally, just reshape the result to the dimensions you need.
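Put together, the steps above might look like the following sketch. The toy `volume`, `roi` and `radius` values are assumptions for illustration, and the coordinates are collected into a single (N, 3) array instead of a list of small arrays.

```python
import itertools
import numpy as np

volume = np.zeros((4, 5, 6))       # toy volume, only its shape is used
roi = np.array([1.0, 2.0, 3.0])    # illustrative centre
radius = 2

# All (x, y, z) index triples; the last index varies fastest (C order)
xyz = np.array(list(itertools.product(*(range(s) for s in volume.shape))))

# Distance of every voxel from roi, compared against the radius
inside = np.linalg.norm(xyz - roi, axis=1) < radius

# Because the triples are in C order, a plain reshape restores the 3D layout
mask = inside.reshape(volume.shape)
print(xyz.shape, mask.shape, int(mask.sum()))
```

This still builds every coordinate triple in Python, so for large volumes it is likely to be noticeably slower and more memory-hungry than the broadcasting approach.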

answered Sep 20 '22 02:09

Ami Tavory

Tags: python, optimization, for-loop, vectorization, numpy