
Difference between these array shapes in numpy

What is the difference between two arrays whose shapes are

(442,1) and (442,)?

Printing both of them produces identical output, but when I compare them with ==, I get a 2D boolean array like this:

array([[ True, False, False, ..., False, False, False],
       [False,  True, False, ..., False, False, False],
       [False, False,  True, ..., False, False, False],
       ..., 
       [False, False, False, ...,  True, False, False],
       [False, False, False, ..., False,  True, False],
       [False, False, False, ..., False, False,  True]], dtype=bool)

Can someone explain the difference?

goelakash asked Dec 19 '14 17:12


1 Answer

An array of shape (442, 1) is 2-dimensional. It has 442 rows and 1 column.

An array of shape (442, ) is 1-dimensional and consists of 442 elements.

Note that their reprs should look different too: the number and placement of square brackets differ. Their shapes differ as well:

In [7]: np.array([1,2,3]).shape
Out[7]: (3,)

In [8]: np.array([[1],[2],[3]]).shape
Out[8]: (3, 1)
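
To make the difference in dimensionality and repr concrete, here is a minimal sketch. It uses 3-element arrays as a stand-in for the 442-element ones in the question; the names `a` and `b` are only illustrative.

```python
import numpy as np

a = np.array([1, 2, 3])        # 1-dimensional, shape (3,)
b = np.array([[1], [2], [3]])  # 2-dimensional, shape (3, 1)

# ndim is the number of axes, i.e. the length of the shape tuple.
print(a.ndim, b.ndim)

# The 1-D array has a single pair of square brackets; the 2-D array
# wraps each row in its own pair of brackets inside an outer pair.
print(repr(a))
print(repr(b))
```

If the two arrays in the question really print identically, it may be worth checking `.shape` on each one directly.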

Note that you could use np.squeeze to remove axes of length 1:

In [13]: np.squeeze(np.array([[1],[2],[3]])).shape
Out[13]: (3,)
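
np.squeeze removes every axis of length 1, which may be more than you want if other axes also happen to have length 1. As a sketch of some alternatives (the variable names are illustrative, not from the question), you can name the axis explicitly, or convert in the other direction by adding an axis:

```python
import numpy as np

col = np.array([[1], [2], [3]])     # shape (3, 1)

# Remove only axis 1; this raises an error if that axis is not of length 1.
flat = np.squeeze(col, axis=1)      # shape (3,)

# ravel() flattens to 1-D regardless of the original shape.
also_flat = col.ravel()             # shape (3,)

# Going the other way: turn a 1-D array into a single column.
back = flat[:, np.newaxis]          # shape (3, 1)
back_again = flat.reshape(-1, 1)    # shape (3, 1)

print(flat.shape, also_flat.shape, back.shape, back_again.shape)
```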

NumPy's broadcasting rules allow new axes of length 1 to be added automatically on the left when needed, so (442,) can broadcast to (1, 442). Axes of length 1 can in turn be stretched to any length. So when you test for equality between an array of shape (442, 1) and an array of shape (442,), the second array is treated as having shape (1, 442), and then each array's length-1 axis is stretched so that both behave as arrays of shape (442, 442). This is why the result of your equality test was a boolean array of shape (442, 442): entry [i, j] compares element i of the first array with element j of the second.

In [15]: np.array([1,2,3]) == np.array([[1],[2],[3]])
Out[15]: 
array([[ True, False, False],
       [False,  True, False],
       [False, False,  True]], dtype=bool)

In [16]: np.array([1,2,3]) == np.squeeze(np.array([[1],[2],[3]]))
Out[16]: array([ True,  True,  True], dtype=bool)
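
The same effect can be checked at the size from the question. This is a rough sketch, assuming stand-in data built with np.arange (the real arrays are not shown in the question); `x` and `y` are illustrative names.

```python
import numpy as np

x = np.arange(442)            # shape (442,)
y = x.reshape(-1, 1)          # shape (442, 1), same values as a column

# np.broadcast reports the shape the operands are broadcast to.
bshape = np.broadcast(x, y).shape
print(bshape)                 # (442, 442)

# The comparison is equivalent to making the new leading axis explicit.
full = (x == y)
explicit = (x[np.newaxis, :] == y)
print(np.array_equal(full, explicit))  # True

# For an element-by-element comparison, give both operands the same shape first.
elementwise = (x == y.ravel())
print(elementwise.shape, elementwise.all())  # (442,) True
```

With distinct values, as here, the broadcast comparison is True only on the diagonal, which matches the pattern in the output shown in the question.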

unutbu answered Oct 19 '22 08:10