Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Sum over squared array

As part of a batch Euclidean distance computation, I'm computing

(X * X).sum(axis=1)

where X is a rather large 2-d array. This works fine, but it constructs a temporary array of the same size as X. Is there any way to get rid of this temporary, but retain the efficiency of a vectorized operation?

The obvious candidate,

np.array([np.dot(row, row) for row in X])

works, but uses a Python list as a temporary, making it rather slow.

Without the axis, the memory-efficient form would be

(X * X).sum()  =>  np.dot(X.ravel(), X.ravel())

and I know that, when axis=1, it's equivalent to

np.diag(np.dot(X, X.T))

which got me looking into generalizations of dot such as np.inner, np.tensordot and np.einsum, but I can't figure out how they would solve my problem.

like image 735
Fred Foo Avatar asked Sep 30 '13 12:09

Fred Foo


People also ask

What is the sum of squared differences?

The sum of squares is a statistical measure of deviation from the mean. It is also known as variation. It is calculated by adding together the squared differences of each data point. To determine the sum of squares, square the distance between each data point and the line of best fit, then add them together.

How do you sum parts of an array?

Approach to Find the Sum of All Elements in an ArrayInitialize a variable sum to store the total sum of all elements of the array. Traverse the array and add each element of the array with the sum variable. Finally, return the sum variable.

How do you do sum of squares in Matlab?

[ s , n ] = sumsqr( x ) takes a matrix or cell array of matrices, x , and returns the sum, s , of all squared finite values in x , and the number of finite values, n . If x does not contain finite values, the sum returned is 0.


1 Answers

The einsum equivalent is:

np.einsum('ij,ij->i', X, X)

Though I am not sure how this works internally so it may or may not not solve your problem.

like image 161
YXD Avatar answered Oct 02 '22 18:10

YXD