
Why does numpy Dot product of 2d array with 1d array produce 1d array?

Tags: python, numpy

I ran the code below:

>>> import numpy as np
>>> A = np.array([[1,2], [3,4], [5,6]])
>>> A.shape
(3, 2)
>>> B = np.array([7,8])
>>> B.shape
(2,)
>>> np.dot(A,B)
array([23, 53, 83])

I thought the shape of np.dot(A,B) should be (3,1), not (3,).

I expected the matrix product to return a column:

array([[23],[53],[83]])

23
53
83

not

array([23,53,83])

23 53 83

Why does this result occur?

Champer Wu asked Dec 10 '22 03:12


1 Answer

As its name suggests, the primary purpose of the numpy.dot() function is to deliver a scalar result by performing a traditional linear algebra dot product on two arrays of identical shape (m,).

Given this primary purpose, the documentation of numpy.dot() lists this scenario first (item 1 below):

numpy.dot(a, b, out=None)

 1. If both a and b are 1-D arrays, it is inner product of vectors (without complex conjugation).
 2. If both a and b are 2-D arrays, it is matrix multiplication, but using matmul or a @ b is preferred.
 3. If either a or b is 0-D (scalar), it is equivalent to multiply and using numpy.multiply(a, b) or a * b is preferred.
 4. If a is an N-D array and b is a 1-D array, it is a sum product over the last axis of a and b.
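
As a rough illustration of the four cases listed above, here is a minimal sketch. The array names v, w, M and N are made up for this example and do not come from the question:

```python
import numpy as np

# Hypothetical example arrays, chosen only to illustrate each case.
v = np.array([1, 2])                      # shape (2,)
w = np.array([3, 4])                      # shape (2,)
M = np.array([[1, 2], [3, 4], [5, 6]])    # shape (3, 2)
N = np.array([[1, 0], [0, 1]])            # shape (2, 2)

case1 = np.dot(v, w)   # 1-D with 1-D: inner product, a scalar
case2 = np.dot(M, N)   # 2-D with 2-D: matrix multiplication
case3 = np.dot(M, 2)   # array with 0-D scalar: same as M * 2
case4 = np.dot(M, v)   # N-D with 1-D: sum product over the last axis of M

print(np.shape(case1), case2.shape, case3.shape, case4.shape)
```

Only the fourth case drops an axis from the 2-D input, which is the situation in the question.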

Your case is covered by the 4th item above (as pointed out by @hpaulj in his comments). But that still does not fully answer your question of why the result has shape (3,), and not (3,1) as you expected.

You would be justified in expecting a result shape of (3,1) only if B had shape (2,1). In that case, A has shape (3,2) and B has shape (2,1), so the usual rules of matrix multiplication give a result of shape (3,1).
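
If a (3,1) column is what you want, one possible approach (a sketch, not the only way) is to give B an explicit second axis before calling np.dot. The names B_col, col and flat are introduced here purely for illustration:

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])   # shape (3, 2)
B = np.array([7, 8])                     # shape (2,)

# Reshape B into a column so that ordinary matrix-multiplication rules apply.
B_col = B.reshape(2, 1)                  # shape (2, 1)

col = np.dot(A, B_col)                   # (3, 2) times (2, 1) gives (3, 1)
flat = np.dot(A, B)                      # (3, 2) with (2,) gives (3,)

print(col.shape, flat.shape)
```

Adding the axis afterwards, for example with np.dot(A, B)[:, np.newaxis], should produce the same (3,1) array.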

But here, B has a shape of (2,), not (2,1), so we are outside the jurisdiction of the usual rules of matrix multiplication. How the result turns out is therefore up to the designers of the numpy.dot() function. They could have chosen to treat this as an error ("dimension mismatch"). Instead, they chose to handle this scenario as described in this answer.

I'm quoting that answer, with some modifications to relate your code:

According to numpy, a 1D array has only 1 dimension and all checks are done against that dimension. Because of this we find that np.dot(A,B) checks the second dimension of A against the one dimension of B.

So, the check would succeed, and numpy wouldn't treat this as an error.
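
To see the check in action, here is a small sketch. The three-element array and the flag mismatch_raised are assumptions made up for this example:

```python
import numpy as np

A = np.array([[1, 2], [3, 4], [5, 6]])   # shape (3, 2)

# Last axis of A has length 2, and B has length 2: the check passes.
ok = np.dot(A, np.array([7, 8]))

# A 1-D array of length 3 does not match the last axis of A (length 2).
try:
    np.dot(A, np.array([7, 8, 9]))
    mismatch_raised = False
except ValueError as exc:
    mismatch_raised = True
    print("ValueError:", exc)
```

So only the length of A's last axis has to agree with the length of B.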

Now, the only remaining question is why the result shape is (3,) and not (3,1) or (1,3).

The answer is this: in A, which has shape (3,2), we have consumed the last axis (of length 2) to perform the sum product. The unconsumed part of A's shape is (3,), and hence the shape of the result of np.dot(A,B) is (3,). To understand this further, take a different example in which A has a shape of (3,4,2) instead of (3,2). The unconsumed part of A's shape would then be (3,4), and the result of np.dot(A,B) would have shape (3,4) instead of the (3,) that your example produced.

Here's the code for you to verify:

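import numpy as np

# A has shape (3, 4, 2); its last axis (length 2) matches B's only axis.
A = np.arange(24).reshape(3, 4, 2)
print("A is:\n", A, ", and its shape is:", A.shape)
B = np.array([7, 8])
print("B is:\n", B, ", and its shape is:", B.shape)
# The last axis of A is summed away, leaving shape (3, 4).
C = np.dot(A, B)
print("C is:\n", C, ", and its shape is:", C.shape)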

The output of this is:

A is:
 [[[ 0  1]
  [ 2  3]
  [ 4  5]
  [ 6  7]]

 [[ 8  9]
  [10 11]
  [12 13]
  [14 15]]

 [[16 17]
  [18 19]
  [20 21]
  [22 23]]] , and its shape is: (3, 4, 2)
B is:
 [7 8] , and its shape is: (2,)
C is:
 [[  8  38  68  98]
 [128 158 188 218]
 [248 278 308 338]] , and its shape is: (3, 4)

Another helpful perspective to understand the behavior in this example is below:

The array A of shape (3,4,2) can be conceptually visualized as an outer array of inner arrays, where the outer array has shape (3,4), and each inner array has shape (2,). The traditional dot product is performed between each of these inner arrays and the array B (which has shape (2,)), and each resulting scalar is left in the position of the inner array it came from. Together these scalars form the (3,4) shape of the outer array, so the overall result of numpy.dot(A,B) has shape (3,4).
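
This perspective can be sketched as an explicit loop. It is meant only as an illustration of the idea, not as a description of how numpy computes the result internally, and the names outer_shape and C_manual are made up for this example:

```python
import numpy as np

A = np.arange(24).reshape(3, 4, 2)
B = np.array([7, 8])

# The "outer" shape is everything except the last axis of A.
outer_shape = A.shape[:-1]               # (3, 4)
C_manual = np.empty(outer_shape, dtype=A.dtype)

# Visit each inner array of shape (2,) and store its dot product with B
# at the same position in the outer array.
for idx in np.ndindex(*outer_shape):
    C_manual[idx] = np.dot(A[idx], B)

print(C_manual.shape)
print(np.array_equal(C_manual, np.dot(A, B)))
```

The loop result should match np.dot(A,B) element for element.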

fountainhead answered Dec 12 '22 15:12