
Why do Python/Numpy require a row vector for matrix/vector dot product?

Assume we want to compute the dot product of a matrix and a column vector:

[Figure: matrix dot vector]

So in Numpy/Python here we go:

import numpy

a=numpy.asarray([[1,2,3], [4,5,6], [7,8,9]])
b=numpy.asarray([[2],[1],[3]])
a.dot(b)

Results in:

array([[13], [31], [49]])

So far, so good. But why does this also work?

b=numpy.asarray([2,1,3])
a.dot(b)

Results in:

array([13, 31, 49])

I would expect [2,1,3] to be a row vector (which would need a transpose before the dot product can be applied), but NumPy seems to treat 1-dimensional arrays as column vectors by default (at least for matrix multiplication)?

How does this work?

EDIT:

And why is:

b=numpy.asarray([2,1,3])
b.transpose()==b

So the matrix-vector dot product works (NumPy apparently treats the array as a column vector there), but other operations such as transpose have no effect. That is not really a consistent design, is it?

asked Jan 05 '16 08:01 by robert




1 Answer

Let's first understand how the dot operation is defined in numpy.

(Leaving broadcasting rules out of the discussion, for simplicity.) You can perform dot(A,B) if the last dimension of A (i.e. A.shape[-1]) equals the next-to-last dimension of B (i.e. B.shape[-2]) when B.ndim>=2, or the only dimension of B (i.e. B.shape[0]) when B.ndim==1.

In other words, if A.shape=(N1,...,Nk,X) and B.shape=(M1,...,M(j-1),X,Mj) (note the common X), the resulting array has the shape (N1,...,Nk,M1,...,Mj) (note that X was dropped).

Or, if A.shape=(N1,...,Nk,X) and B.shape=(X,), the resulting array has the shape (N1,...,Nk) (again, X was dropped).
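The two shape rules above can be checked directly. The following is a minimal sketch, not part of numpy: dot_result_shape is a hypothetical helper written only for illustration, and it assumes both inputs have at least one dimension.

```python
import numpy as np

def dot_result_shape(a_shape, b_shape):
    """Hypothetical helper: predict the shape of np.dot(A, B) from the input shapes.

    Assumes both arrays have ndim >= 1 (scalars are not handled).
    """
    if len(b_shape) == 1:
        # Second rule: B is 1-D, so X must match B's only dimension.
        if a_shape[-1] != b_shape[0]:
            raise ValueError("shapes not aligned")
        return tuple(a_shape[:-1])
    # First rule: X must match B's next-to-last dimension.
    if a_shape[-1] != b_shape[-2]:
        raise ValueError("shapes not aligned")
    return tuple(a_shape[:-1]) + tuple(b_shape[:-2]) + tuple(b_shape[-1:])

A = np.ones((2, 4, 3))   # (N1, N2, X) with X = 3
B = np.ones((5, 3, 6))   # (M1, X, M2)
v = np.ones(3)           # (X,)

print(np.dot(A, B).shape, dot_result_shape(A.shape, B.shape))  # (2, 4, 5, 6) twice
print(np.dot(A, v).shape, dot_result_shape(A.shape, v.shape))  # (2, 4) twice
```

In both cases only the shared dimension X disappears; nothing in the rule refers to rows or columns.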

Your examples work because they satisfy these rules (the first example satisfies the first rule, the second example satisfies the second):

a=numpy.asarray([[1,2,3], [4,5,6], [7,8,9]])
b=numpy.asarray([[2],[1],[3]])
a.shape, b.shape, '->', a.dot(b).shape  # X=3
=> ((3, 3), (3, 1), '->', (3, 1))

b=numpy.asarray([2,1,3])
a.shape, b.shape, '->', a.dot(b).shape  # X=3
=> ((3, 3), (3,), '->', (3,))

My recommendation: when using numpy, don't think in terms of "row/column vectors", and if possible don't think in terms of "vectors" at all, but in terms of "an array with shape S". A 1dim array of shape (X,) is neither a row vector nor a column vector; it is simply a 1dim array. As far as numpy is concerned, a "row" [2,1,3] and a "column" [2,1,3] stored this way are one and the same.

This should also make it clear why in your case b.transpose() is the same as b. Since b is a 1dim array, it remains a 1dim array when transposed. Transpose doesn't affect 1dim arrays.
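If an explicit row or column is really needed, one option is to give the array two dimensions yourself. The sketch below reuses a and b from the question; the names col, row and mismatch_raised are introduced here only for illustration.

```python
import numpy as np

a = np.asarray([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
b = np.asarray([2, 1, 3])

# Transposing a 1-D array changes nothing: there is only one axis to permute.
print(b.shape, b.T.shape)        # (3,) (3,)

# Explicit 2-D versions do have distinct row and column forms.
col = b.reshape(-1, 1)           # shape (3, 1)
row = b.reshape(1, -1)           # shape (1, 3)
print(col.T.shape)               # (1, 3)

# The 1-D array works on either side of dot, because only the shared
# dimension X=3 has to match.
print(a.dot(b))                  # [13 31 49]
print(b.dot(a))                  # [27 33 39]

# With the explicit 2-D row, the shape rule is no longer satisfied:
# a.shape[-1] is 3 but row.shape[-2] is 1.
try:
    a.dot(row)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)           # True
```

So the 1-D b behaves like a column on the right of a and like a row on the left of it, which is just the shape rule applied twice rather than any special treatment of vectors.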

answered Sep 21 '22 21:09 by shx2