Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is there a numpy/scipy dot product, calculating only the diagonal entries of the result?

Imagine having 2 numpy arrays:

> A, A.shape = (n,p) > B, B.shape = (p,p) 

Typically p is a smaller number (p <= 200), while n can be arbitrarily large.

I am doing the following:

result = np.diag(A.dot(B).dot(A.T)) 

As you can see, I am keeping only the n diagonal entries, however there is an intermediate (n x n) array calculated from which only the diagonal entries are kept.

I wish for a function like diag_dot(), which only calculates the diagonal entries of the result and does not allocate the complete memory.

A result would be:

> result = diag_dot(A.dot(B), A.T) 

Is there a premade functionality like this and can this be done efficiently without the need for allocating the intermediate (n x n) array?

like image 899
user2051916 Avatar asked Feb 07 '13 18:02

user2051916


People also ask

What does NumPy diagonal do?

diag() function. The diag() function is used to extract a diagonal or construct a diagonal array. If v is a 2-D array, return a copy of its k-th diagonal.

How to calculate dot products for NumPy array?

Steps to calculate dot products for Numpy Array 1 Import all the necessary libraries. Here in this tutorial, I am using only the NumPy array. Let’s import them. import numpy as np 2 Create a Numpy array Let’s create both the one dimensional and two- dimensional NumPy array to perform dot product on it. ... 3 Calculate Numpy dot product of Array

How to find the dot product of two sequences in Python?

Python code to find the dot product. Python provides an efficient way to find the dot product of two sequences which is numpy.dot() method of numpy library. Numpy.dot() is a method that takes the two sequences as arguments, whether it be vectors or multidimensional arrays, and prints the result i.e., dot product.

What does N and p mean in NumPy?

Here n is the number of columns of the matrix or array1 and p is the number of rows of the matrix or array 2. The above examples were calculating products using the same 1D and 2D Numpy array.

How to pass two arrays to dot product in Python?

To do so you have to pass two arrays inside the dot () method. Let’s see them You have to just pass both 1D NumPy arrays inside the dot () method. It will return a single result. Here you have to be careful. It is a 2D array and you have to follow rules on dot product.


2 Answers

I think i got it on my own, but nevertheless will share the solution:

since getting only the diagonals of a matrix multiplication

> Z = N.diag(X.dot(Y)) 

is equivalent to the individual sum of the scalar product of rows of X and columns of Y, the previous statement is equivalent to:

> Z = (X * Y.T).sum(-1) 

For the original variables this means:

> result = (A.dot(B) * A).sum(-1) 

Please correct me if I am wrong but this should be it ...

like image 81
user2051916 Avatar answered Sep 21 '22 19:09

user2051916


You can get almost anything you ever dreamed of with numpy.einsum. Until you start getting the hang of it, it basically seems like black voodoo...

>>> a = np.arange(15).reshape(5, 3) >>> b = np.arange(9).reshape(3, 3)  >>> np.diag(np.dot(np.dot(a, b), a.T)) array([  60,  672, 1932, 3840, 6396]) >>> np.einsum('ij,ji->i', np.dot(a, b), a.T) array([  60,  672, 1932, 3840, 6396]) >>> np.einsum('ij,ij->i', np.dot(a, b), a) array([  60,  672, 1932, 3840, 6396]) 

EDIT You can actually get the whole thing in a single shot, it's ridiculous...

>>> np.einsum('ij,jk,ki->i', a, b, a.T) array([  60,  672, 1932, 3840, 6396]) >>> np.einsum('ij,jk,ik->i', a, b, a) array([  60,  672, 1932, 3840, 6396]) 

EDIT You don't want to let it figure too much on its own though... Added the OP's answer to its own question for comparison also.

n, p = 10000, 200 a = np.random.rand(n, p) b = np.random.rand(p, p)  In [2]: %timeit np.einsum('ij,jk,ki->i', a, b, a.T) 1 loops, best of 3: 1.3 s per loop  In [3]: %timeit np.einsum('ij,ij->i', np.dot(a, b), a) 10 loops, best of 3: 105 ms per loop  In [4]: %timeit np.diag(np.dot(np.dot(a, b), a.T)) 1 loops, best of 3: 5.73 s per loop  In [5]: %timeit (a.dot(b) * a).sum(-1) 10 loops, best of 3: 115 ms per loop 
like image 23
Jaime Avatar answered Sep 18 '22 19:09

Jaime