Observe:
In [1]: import numpy as np
In [2]: x = np.array([1, 2, 3])
In [3]: np.vstack([x, x])
Out[3]:
array([[1, 2, 3],
[1, 2, 3]])
In [4]: np.vstack(np.broadcast(x, x))
Out[4]:
array([[1, 1],
[2, 2],
[3, 3]])
Similarly for column_stack
and row_stack
(hstack
behaves differently in this case but it also differs when used with broadcast). Why?
I'm after the logic behind that rather than finding a way of "repairing" this behavior (I'm just fine with it, it's just unintuitive).
The transpose () function in the numpy library is mainly used to reverse or permute the axes of an array and then it will return the modified array. The main task of this function is to change the column elements into the row elements and the column elements into the row elements.
Both the arrays (original and transposed) have the same shape. For a 2d array, the transpose operation means to swap out the rows and columns of the array. Here’s an example – Here, we transpose a 2×3 array.
Because NumPy array are implemented as contiguous blocks in memory. When we index something like a [ [2,7,11]], the objects at the indices 2, 7 and 11 are stored in a non-contiguous manner. You can not have the elements of the new array lined up in a contiguous manner unless you make a copy.
Taking advantage of the way NumPy stores it's arrays, we can r e shape NumPy arrays without incurring any significant computational cost as it merely involves changing strides for the array. The array, which is stored in a contiguous manner in the memory, does not change.
np.broadcast
returns an instance of an iterator object that describes how the arrays should be broadcast together.1 Among other things, it describes the shape and the number of dimensions that the resulting array will have.
Crucially, when you actually iterate over this object in Python you get back tuples of elements from each input array:
>>> b = np.broadcast(x, x)
>>> b.shape
(3,)
>>> b.ndim
1
>>> list(b)
[(1, 1), (2, 2), (3, 3)]
This tells us that if we were performing an actual operation on the arrays (say, x+x
) NumPy would return an array of shape (3,)
, one dimension and combine the elements in the tuple to produce the values in the final array (e.g. it would perform 1+1
, 2+2
, 3+3
for the addition).
If you dig in to the source of vstack
you find that all it does is make sure the elements of the iterable that it has been given are at least two-dimensional, and then stack them along axis 0.
In the case of b = np.broadcast(x, x)
this means that we get the following arrays to stack:
>>> [np.atleast_2d(_m) for _m in b]
[array([[1, 1]]), array([[2, 2]]), array([[3, 3]])]
These three small arrays are then stacked vertically producing the output you note.
1 Exactly how arrays of varying dimensions are iterated over in parallel is at the very heart of how NumPy's broadcasting works. The code can be found mostly in iterators.c. An interesting overview of NumPy's multidimensional iterator, written by Travis Oliphant himself, can be found in the Beautiful Code book.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With