Difference between these array shapes in numpy

Tags:

python

arrays

numpy

What is the difference between two arrays whose shapes are (442,1) and (442,)?

Printing both of these produces identical output, but when I check them for equality with ==, I get a 2D array like this:

array([[ True, False, False, ..., False, False, False],
       [False,  True, False, ..., False, False, False],
       [False, False,  True, ..., False, False, False],
       ..., 
       [False, False, False, ...,  True, False, False],
       [False, False, False, ..., False,  True, False],
       [False, False, False, ..., False, False,  True]], dtype=bool)

Can someone explain the difference?

588

asked Dec 19 '14 17:12

goelakash

1 Answer

An array of shape (442, 1) is 2-dimensional. It has 442 rows and 1 column.

An array of shape (442, ) is 1-dimensional and consists of 442 elements.

Note that their reprs should look different too: the number and placement of brackets differs. Their shapes differ as well:

In [7]: np.array([1,2,3]).shape
Out[7]: (3,)

In [8]: np.array([[1],[2],[3]]).shape
Out[8]: (3, 1)
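To make the difference visible, here is a minimal sketch that compares the repr, the number of dimensions, and the result of indexing for both kinds of array. The variable names `a` and `b` are only illustrative, and a small length of 3 stands in for 442.

```python
import numpy as np

a = np.arange(3)          # 1-dimensional, shape (3,)
b = a.reshape(3, 1)       # 2-dimensional, shape (3, 1): 3 rows, 1 column

print(repr(a))            # array([0, 1, 2])
print(repr(b))            # spans three lines, with one extra level of brackets

print(a.ndim, b.ndim)     # 1 2

# Indexing with a single integer behaves differently:
# a[0] is a scalar, while b[0] is still an array (the first row).
print(np.shape(a[0]))     # ()
print(np.shape(b[0]))     # (1,)
```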

Note that you could use np.squeeze to remove axes of length 1:

In [13]: np.squeeze(np.array([[1],[2],[3]])).shape
Out[13]: (3,)
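np.squeeze is not the only way to move between the two shapes. The sketch below shows a few alternatives that should give the same result for a single-column array, plus the reverse conversion; the names `col`, `vec` and so on are made up for the example.

```python
import numpy as np

col = np.array([[1], [2], [3]])       # shape (3, 1)

# Three ways to get a 1-dimensional array of shape (3,):
flat_a = np.squeeze(col, axis=1)      # drops only axis 1; raises if it is not length 1
flat_b = col.ravel()                  # flattens every axis into one
flat_c = col.reshape(-1)              # -1 lets NumPy infer the length

# Going the other way, from (3,) back to (3, 1):
vec = np.array([1, 2, 3])
col_again = vec[:, np.newaxis]        # same as vec.reshape(-1, 1)

print(flat_a.shape, flat_b.shape, flat_c.shape)   # (3,) (3,) (3,)
print(col_again.shape)                            # (3, 1)
```

Passing axis=1 to np.squeeze is a defensive choice: a plain np.squeeze would also silently drop any other length-1 axis.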

NumPy broadcasting rules allow new axes to be added automatically on the left when needed, so (442,) can broadcast to (1, 442). Axes of length 1 can then broadcast to any length.

So when you test for equality between an array of shape (442, 1) and an array of shape (442,), the second array is promoted to shape (1, 442), and then both arrays stretch their length-1 axes so that they are broadcast to shape (442, 442). Element [i, j] of the result compares element i of the first array with element j of the second. This is why your equality test returned a boolean array of shape (442, 442), with True along the diagonal where each element is compared with its own counterpart.

In [15]: np.array([1,2,3]) == np.array([[1],[2],[3]])
Out[15]: 
array([[ True, False, False],
       [False,  True, False],
       [False, False,  True]], dtype=bool)

In [16]: np.array([1,2,3]) == np.squeeze(np.array([[1],[2],[3]]))
Out[16]: array([ True,  True,  True], dtype=bool)
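The same thing can be checked at the original size. This is a sketch with made-up data (np.arange(442) stands in for your arrays, which I have not seen). It uses np.broadcast to ask for the broadcast shape up front, then flattens the column before comparing to get an element-by-element test.

```python
import numpy as np

col = np.arange(442).reshape(442, 1)   # shape (442, 1)
vec = np.arange(442)                   # shape (442,)

# The shape the comparison will produce, without computing it.
print(np.broadcast(col, vec).shape)    # (442, 442)

# result[i, j] is col[i, 0] == vec[j], so every pair of elements is compared.
result = col == vec
print(result.shape)                    # (442, 442)

# With distinct values, only the diagonal is True.
print(np.array_equal(result, np.eye(442, dtype=bool)))   # True

# For an element-by-element comparison, make the shapes match first.
elementwise = col.ravel() == vec
print(elementwise.shape, elementwise.all())               # (442,) True
```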

165

answered Oct 19 '22 08:10

unutbu
