numpy - efficiently copy values from matrix to matrix using some precalculated map

Tags:

I have an input matrix A of size I*J

And an output matrix B of size N*M

And some precalculated map of size N*M*2 that dictates for each coordinate in B, which coordinate in A to take. The map has no specific rule or linearity that I can use. Just a map that seems random.

The matrices are pretty big (~5000*~3000) so creating a mapping matrix is out of the question (5000*3000*5000*3000)

I managed to do it using a simple map and loop:

for i in range(N):
    for j in range(M):
        B[i, j] = A[mapping[i, j, 0], mapping[i, j, 1]]

And I managed to do it using indexing:

B[coords_y, coords_x] = A[some_mapping[:, 0], some_mapping[:, 1]]
# Where coords_x, coords_y are defined as all of the coordinates:
# [[0,0],[0,1]..[0,M-1],[1,0],[1,1]...[N-1,M-1]]

This works much better, but still kind of slow.

I have infinite time in advance to calculate the mapping or any other utility calculation. But after these precalculations, this mapping should happen as fast as possible.

Currently, the only other option that I see is just to reimplement this in C or something faster...

(Just to make it clear if someone is curious, I'm creating an image out of some other, differently shaped and oriented image with some encoding. But its' mapping is very complicated and not something simple or linear that can be used)

473

asked Jul 08 '18 16:07

user972014

2 Answers

If you have infinity time for precomputing you can get a slight speedup by going to flat indexing:

map_f = np.ravel_multi_index((*np.moveaxis(mapping, 2, 0),), A.shape)

Then simply do:

A.ravel()[map_f]

Please note that this speedup is on top of the large speedup we get from fancy indexing. For example:

>>> A = np.random.random((5000, 3000))
>>> mapping = np.random.randint(0, 15000, (5000, 3000, 2)) % [5000, 3000]
>>> 
>>> map_f = np.ravel_multi_index((*np.moveaxis(mapping, 2, 0),), A.shape)
>>> 
>>> np.all(A.ravel()[map_f] == A[mapping[..., 0], mapping[..., 1]])
True
>>> 
>>> timeit('A[mapping[:, :, 0], mappping[:, :, 1]]', globals=globals(), number=10)
4.101239089999581
>>> timeit('A.ravel()[map_f]', globals=globals(), number=10)
2.7831342950012186

If we were to compare to the original loopy code, the speedup would be more like ~40x.

Finally, note that this solution does not only avoid the additional dependency and potential installation nightmare that is numba, but is also simpler, shorter and faster:

numba:
precomp:    132.957 ms
main        238.359 ms

flat indexing:
precomp:     76.223 ms
main:       219.910 ms

Code:

import numpy as np
from numba import jit

@jit
def fast(A, B, mapping):
    N, M = B.shape
    for i in range(N):
        for j in range(M):
            B[i, j] = A[mapping[i, j, 0], mapping[i, j, 1]]
    return B

from timeit import timeit

A = np.random.random((5000, 3000))
mapping = np.random.randint(0, 15000, (5000, 3000, 2)) % [5000, 3000]

a = np.random.random((5, 3))
m = np.random.randint(0, 15, (5, 3, 2)) % [5, 3]

print('numba:')
print(f"precomp: {timeit('b = fast(a, np.empty_like(a), m)', globals=globals(), number=1)*1000:10.3f} ms")
print(f"main     {timeit('B = fast(A, np.empty_like(A), mapping)', globals=globals(), number=10)*100:10.3f} ms")

print('\nflat indexing:')
print(f"precomp: {timeit('map_f = np.ravel_multi_index((*np.moveaxis(mapping, 2, 0),), A.shape)', globals=globals(), number=10)*100:10.3f} ms")
map_f = np.ravel_multi_index((*np.moveaxis(mapping, 2, 0),), A.shape)
print(f"main:    {timeit('B = A.ravel()[map_f]', globals=globals(), number=10)*100:10.3f} ms")

188

answered Oct 06 '22 19:10

Paul Panzer

One very nice solution to these types of performance critical problems is to keep it simple and utilize one of the high performance packages. The easiest might be Numba which provides the jit decorator that compiles array and loop heavy code to optimized LLVM. Below is a full example:

from time import time
import numpy as np
from numba import jit

# Function doing the computation
def normal(A, B, mapping):
    N, M = B.shape
    for i in range(N):
        for j in range(M):
            B[i, j] = A[mapping[i, j, 0], mapping[i, j, 1]]
    return B

# The same exact function, but with the Numba jit decorator
@jit
def fast(A, B, mapping):
    N, M = B.shape
    for i in range(N):
        for j in range(M):
            B[i, j] = A[mapping[i, j, 0], mapping[i, j, 1]]
    return B

# Create sample data
def create_sample_data(I, J, N, M):
    A = np.random.random((I, J))
    B = np.empty((N, M))
    mapping = np.asarray(np.stack((
        np.random.random((N, M))*I,
        np.random.random((N, M))*J,
        ), axis=2), dtype=int)
    return A, B, mapping
A, B, mapping = create_sample_data(500, 600, 700, 800)

# Run normally
t0 = time()
B = normal(A, B, mapping)
t1 = time()
print('normal took', t1 - t0, 'seconds')

# Run using Numba.
# First we should run the function with smaller arrays,
# just to compile the code.
fast(*create_sample_data(5, 6, 7, 8))
# Now, run with real data
t0 = time()
B = fast(A, B, mapping)
t1 = time()
print('fast took', t1 - t0, 'seconds')

This uses your own looping solution, which is inherently slow using standard Python, but as fast as C when using Numba. On my machine the normal function executes in 0.270 seconds, while the fast function executes in 0.00248 seconds. That is, Numba gives us a 109x speedup (!) pretty much for free.

Note that the fast Numba function is called twice, first with small input arrays and only then with the real data. This is a critical step which is often neglected. Without it, you will find that the performance increase is not nearly as good, as the first call is used to compile the code. The types and dimensions of the input arrays should be the same in this initial call, but the size in each dimension is not important.

I create B outside of the function(s) and passed it as an argument (to be "filled with values"). You might just as well allocate B inside of the function, Numba does not care.

The easiest way to get Numba is properly via the Anaconda distribution.

answered Oct 06 '22 19:10

jmd_dk

Related questions
                            
                                How to send OpenCV output to browser with python?
                            
                                RxPy: Sort hot observable between (slow) scan executions
                            
                                How to select range of rows in Pandas?
                            
                                Django python paypalrestsdk - No 'Access-Control-Allow-Origin' and ppxo_unhandled_error error
                            
                                python: how to add a new key and a value in yaml file
                            
                                Catching specific error messages in try / except
                            
                                pyspark.sql.types.Row to list
                            
                                deploy django application with pipenv on apache
                            
                                Keras: Method on_batch_end() is slow but I have no callbacks?
                            
                                Add a custom wheel file as a dependency in setup.py?
                            
                                django.db.utils.OperationalError: (1052, "Column 'name' in field list is ambiguous")
                            
                                GridsearchCV: can't pickle function error when trying to pass lambda in parameter
                            
                                Pandas backfill specific value
                            
                                aws eb cli Windows get version error on colorama
                            
                                Is there a better/more efficient way to do this (vectorised)? Very slow performance with Pandas apply
                            
                                How to install Python Development tools on MSYS2
                            
                                How to send a CTRL-C signal to individual threads in Python?
                            
                                TypeError: '<' not supported between instances of 'NoneType' and 'str'
                            
                                Skip converting entities while loading a yaml string (using PyYAML)
                            
                                Initializing numpy array from np.empty

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

numpy - efficiently copy values from matrix to matrix using some precalculated map

Tags:

python

optimization

python-3.x

matrix

numpy

user972014

People also ask

2 Answers

Paul Panzer

jmd_dk

Recent Activity

Donate For Us