Vectorized way of calculating row-wise dot product two matrices with Scipy

Tags:

I want to calculate the row-wise dot product of two matrices of the same dimension as fast as possible. This is the way I am doing it:

import numpy as np
a = np.array([[1,2,3], [3,4,5]])
b = np.array([[1,2,3], [1,2,3]])
result = np.array([])
for row1, row2 in a, b:
    result = np.append(result, np.dot(row1, row2))
print result

and of course the output is:

[ 26.  14.]

485

asked Mar 25 '13 13:03

3 Answers

Straightforward way to do that is:

import numpy as np
a=np.array([[1,2,3],[3,4,5]])
b=np.array([[1,2,3],[1,2,3]])
np.sum(a*b, axis=1)

which avoids the python loop and is faster in cases like:

def npsumdot(x, y):
    return np.sum(x*y, axis=1)

def loopdot(x, y):
    result = np.empty((x.shape[0]))
    for i in range(x.shape[0]):
        result[i] = np.dot(x[i], y[i])
    return result

timeit npsumdot(np.random.rand(500000,50),np.random.rand(500000,50))
# 1 loops, best of 3: 861 ms per loop
timeit loopdot(np.random.rand(500000,50),np.random.rand(500000,50))
# 1 loops, best of 3: 1.58 s per loop

168

answered Oct 19 '22 06:10

sim

Check out numpy.einsum for another method:

In [52]: a
Out[52]: 
array([[1, 2, 3],
       [3, 4, 5]])

In [53]: b
Out[53]: 
array([[1, 2, 3],
       [1, 2, 3]])

In [54]: einsum('ij,ij->i', a, b)
Out[54]: array([14, 26])

Looks like einsum is a bit faster than inner1d:

In [94]: %timeit inner1d(a,b)
1000000 loops, best of 3: 1.8 us per loop

In [95]: %timeit einsum('ij,ij->i', a, b)
1000000 loops, best of 3: 1.6 us per loop

In [96]: a = random.randn(10, 100)

In [97]: b = random.randn(10, 100)

In [98]: %timeit inner1d(a,b)
100000 loops, best of 3: 2.89 us per loop

In [99]: %timeit einsum('ij,ij->i', a, b)
100000 loops, best of 3: 2.03 us per loop

Note: NumPy is constantly evolving and improving; the relative performance of the functions shown above has probably changed over the years. If performance is important to you, run your own tests with the version of NumPy that you will be using.

answered Oct 19 '22 06:10

Warren Weckesser

Played around with this and found inner1d the fastest. That function however is internal, so a more robust approach is to use

numpy.einsum("ij,ij->i", a, b)

Even better is to align your memory such that the summation happens in the first dimension, e.g.,

a = numpy.random.rand(3, n)
b = numpy.random.rand(3, n)

numpy.einsum("ij,ij->j", a, b)

For 10 ** 3 <= n <= 10 ** 6, this is the fastest method, and up to twice as fast as its untransposed equivalent. The maximum occurs when the level-2 cache is maxed out, at about 2 * 10 ** 4.

Note also that the transposed summation is much faster than its untransposed equivalent.

enter image description here

The plot was created with perfplot (a small project of mine)

import numpy
from numpy.core.umath_tests import inner1d
import perfplot


def setup(n):
    a = numpy.random.rand(n, 3)
    b = numpy.random.rand(n, 3)
    aT = numpy.ascontiguousarray(a.T)
    bT = numpy.ascontiguousarray(b.T)
    return (a, b), (aT, bT)


b = perfplot.bench(
    setup=setup,
    n_range=[2 ** k for k in range(1, 25)],
    kernels=[
        lambda data: numpy.sum(data[0][0] * data[0][1], axis=1),
        lambda data: numpy.einsum("ij, ij->i", data[0][0], data[0][1]),
        lambda data: numpy.sum(data[1][0] * data[1][1], axis=0),
        lambda data: numpy.einsum("ij, ij->j", data[1][0], data[1][1]),
        lambda data: inner1d(data[0][0], data[0][1]),
    ],
    labels=["sum", "einsum", "sum.T", "einsum.T", "inner1d"],
    xlabel="len(a), len(b)",
)

b.save("out1.png")
b.save("out2.png", relative_to=3)

answered Oct 19 '22 06:10

Nico Schlömer

Related questions
                            
                                Is there a Python equivalent of range(n) for multidimensional ranges?
                            
                                Pandas in AWS lambda gives numpy error
                            
                                Numpy integer nan [duplicate]
                            
                                Python: slicing a multi-dimensional array
                            
                                Suppressing scientific notation in pandas?
                            
                                Numpy minimum in (row, column) format
                            
                                Cannot install numpy from wheel format
                            
                                numpy.unique with order preserved
                            
                                1D numpy concatenate: TypeError: only integer scalar arrays can be converted to a scalar index [duplicate]
                            
                                Conditionally fill column values based on another columns value in pandas
                            
                                Normalize numpy array columns in python
                            
                                find length of sequences of identical values in a numpy array (run length encoding)
                            
                                SQL-like window functions in PANDAS: Row Numbering in Python Pandas Dataframe
                            
                                TypeError: unhashable type: 'numpy.ndarray'
                            
                                How to extend an array in-place in Numpy?
                            
                                check if numpy array is multidimensional or not
                            
                                Extracting first n columns of a numpy matrix
                            
                                How to understand numpy strides for layman?
                            
                                What's the simplest way to extend a numpy array in 2 dimensions?
                            
                                Numpy: For every element in one array, find the index in another array

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Vectorized way of calculating row-wise dot product two matrices with Scipy

Tags:

vectorization

numpy

scipy

matrix-multiplication

dot-product

Cupitor

People also ask

3 Answers

sim

Warren Weckesser

Nico Schlömer

Recent Activity

Donate For Us