
Why does numpy.broadcast "transpose" results of vstack and similar functions?

Tags: python, arrays, numpy, array-broadcasting

Observe:

In [1]: import numpy as np
In [2]: x = np.array([1, 2, 3])
In [3]: np.vstack([x, x])
Out[3]: 
array([[1, 2, 3],
       [1, 2, 3]])

In [4]: np.vstack(np.broadcast(x, x))
Out[4]: 
array([[1, 1],
       [2, 2],
       [3, 3]])

The same happens with column_stack and row_stack (hstack behaves differently here, but its result also changes when it is given a broadcast object). Why?

I'm after the logic behind that rather than finding a way of "repairing" this behavior (I'm just fine with it, it's just unintuitive).


asked Mar 27 '16 22:03

zegkljan

1 Answer

np.broadcast returns an instance of an iterator object that describes how the arrays should be broadcast together.¹ Among other things, it describes the shape and the number of dimensions that the resulting array will have.

Crucially, when you actually iterate over this object in Python you get back tuples of elements from each input array:

>>> b = np.broadcast(x, x)
>>> b.shape
(3,)
>>> b.ndim
1
>>> list(b)
[(1, 1), (2, 2), (3, 3)]

This tells us that if we were performing an actual operation on the arrays (say, x+x), NumPy would return a one-dimensional array of shape (3,) and would combine the elements of each tuple to produce the values in the final array (e.g. it would compute 1+1, 2+2, 3+3 for the addition).
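As a rough illustration of that description, the sketch below performs the addition by hand from the tuples that the broadcast object yields. The names `b` and `out` are illustrative only; this mirrors what x+x computes, not how NumPy implements it internally.

```python
import numpy as np

x = np.array([1, 2, 3])
b = np.broadcast(x, x)

# Allocate the result using the shape the broadcast object reports.
out = np.empty(b.shape, dtype=x.dtype)

# Iterating over b yields one tuple per output element; sum each pair.
out.flat = [u + v for u, v in b]

print(out)                         # same values as x + x
print(np.array_equal(out, x + x))  # True
```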

If you dig into the source of vstack, you find that all it does is make sure the elements of the iterable it has been given are at least two-dimensional, and then stack them along axis 0.

In the case of b = np.broadcast(x, x) this means that we get the following arrays to stack:

>>> [np.atleast_2d(_m) for _m in b]
[array([[1, 1]]), array([[2, 2]]), array([[3, 3]])]

These three small arrays are then stacked vertically, producing the output you observed.
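The two steps can be reproduced by hand. This is a minimal sketch of the logic described above, not vstack's actual source, and the names `pairs`, `rows` and `stacked` are illustrative. The broadcast object is converted to a list first because, as far as I know, newer NumPy releases expect the stacking functions to receive a sequence such as a list or tuple rather than an arbitrary iterable.

```python
import numpy as np

x = np.array([1, 2, 3])

# Materialise the tuples that the broadcast object yields.
pairs = list(np.broadcast(x, x))

# Step 1: promote each tuple to a 2-D array of shape (1, 2).
rows = [np.atleast_2d(p) for p in pairs]

# Step 2: join those arrays along axis 0.
stacked = np.concatenate(rows, axis=0)

print(stacked.shape)                                 # (3, 2)
print(np.array_equal(stacked, np.vstack(pairs)))     # True
print(np.array_equal(stacked, np.vstack([x, x]).T))  # True
```

So the result only looks transposed: vstack was handed three length-2 items to stack rather than two length-3 arrays.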

¹ Exactly how arrays of varying dimensions are iterated over in parallel is at the very heart of how NumPy's broadcasting works. The code can be found mostly in iterators.c. An interesting overview of NumPy's multidimensional iterator, written by Travis Oliphant himself, can be found in the Beautiful Code book.


answered Sep 23 '22 19:09

Alex Riley
