I have some trouble understanding what numpy's dstack function is actually doing. The documentation is rather sparse and just says:
Stack arrays in sequence depth wise (along third axis).
Takes a sequence of arrays and stack them along the third axis to make a single array. Rebuilds arrays divided by dsplit. This is a simple way to stack 2D arrays (images) into a single 3D array for processing.
So either I am really stupid and the meaning of this is obvious, or I have some misconception about the terms 'stacking', 'in sequence', 'depth wise' or 'along an axis'. However, I was under the impression that I understood these terms just fine in the context of vstack and hstack.
Let's take this example:
In [193]: a
Out[193]:
array([[0, 3],
       [1, 4],
       [2, 5]])

In [194]: b
Out[194]:
array([[ 6,  9],
       [ 7, 10],
       [ 8, 11]])

In [195]: dstack([a,b])
Out[195]:
array([[[ 0,  6],
        [ 3,  9]],

       [[ 1,  7],
        [ 4, 10]],

       [[ 2,  8],
        [ 5, 11]]])
First of all, a and b don't have a third axis, so how would I stack them along 'the third axis' to begin with? Second of all, assuming a and b are representations of 2D images, why do I end up with three 2D arrays in the result as opposed to two 2D arrays 'in sequence'?
The dstack() function stacks arrays in sequence depth wise (along the third axis). This is equivalent to concatenation along the third axis after 2-D arrays of shape (M,N) have been reshaped to (M,N,1) and 1-D arrays of shape (N,) have been reshaped to (1,N,1).
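The reshaping rule described above can be checked directly. The following is a minimal sketch; the arrays v and m and the names stacked_1d, manual_1d, stacked_2d and manual_2d are made up for illustration and do not come from the question.

```python
import numpy as np

v = np.array([1, 2, 3])          # hypothetical 1-D input, shape (3,)
m = np.arange(6).reshape(2, 3)   # hypothetical 2-D input, shape (2, 3)

# 1-D case: each (N,) array is treated as (1, N, 1) before concatenation.
stacked_1d = np.dstack((v, v * 10))
manual_1d = np.concatenate(
    (v.reshape(1, 3, 1), (v * 10).reshape(1, 3, 1)), axis=2
)

# 2-D case: each (M, N) array is treated as (M, N, 1) before concatenation.
stacked_2d = np.dstack((m, m + 100))
manual_2d = np.concatenate(
    (m.reshape(2, 3, 1), (m + 100).reshape(2, 3, 1)), axis=2
)

print(stacked_1d.shape)  # (1, 3, 2)
print(stacked_2d.shape)  # (2, 3, 2)
print(np.array_equal(stacked_1d, manual_1d))  # True
print(np.array_equal(stacked_2d, manual_2d))  # True
```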
The vstack() function is used to stack arrays in sequence vertically (row wise). This is equivalent to concatenation along the first axis after 1-D arrays of shape (N,) have been reshaped to (1,N). The arrays must have the same shape along all but the first axis. 1-D arrays must have the same length.
Simply put, numpy.newaxis increases the dimension of an existing array by one each time it is used. Thus, a 1D array becomes a 2D array.
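As a small illustration of that point, here is a sketch with a made-up vector vec. Where the new axis is placed in the index decides whether the result is a row or a column.

```python
import numpy as np

vec = np.arange(4)            # hypothetical 1-D input, shape (4,)

as_row = vec[np.newaxis, :]   # new axis in front
as_col = vec[:, np.newaxis]   # new axis at the end

print(as_row.shape)           # (1, 4)
print(as_col.shape)           # (4, 1)
print(np.newaxis is None)     # True: np.newaxis is an alias for None
```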
It's easier to understand what np.vstack, np.hstack and np.dstack* do by looking at the .shape attribute of the output array.
Using your two example arrays:
print(a.shape, b.shape) # (3, 2) (3, 2)
np.vstack concatenates along the first dimension...
print(np.vstack((a, b)).shape) # (6, 2)
np.hstack concatenates along the second dimension...
print(np.hstack((a, b)).shape) # (3, 4)
and np.dstack concatenates along the third dimension.
print(np.dstack((a, b)).shape) # (3, 2, 2)
Since a and b are both two dimensional, np.dstack expands them by inserting a third dimension of size 1. This is equivalent to indexing them in the third dimension with np.newaxis (or alternatively, None) like this:
print(a[:, :, np.newaxis].shape) # (3, 2, 1)
If c = np.dstack((a, b)), then c[:, :, 0] == a and c[:, :, 1] == b.
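To check this with the arrays from the question, a short sketch follows. It uses np.array_equal because == on arrays compares element by element rather than returning a single True or False. Printing c[0] also shows why the result displays as three 2x2 blocks: the printed output groups by the first axis, which has length 3, and each block pairs up the values of a and b from one row.

```python
import numpy as np

# The two arrays from the question.
a = np.array([[0, 3], [1, 4], [2, 5]])
b = np.array([[6, 9], [7, 10], [8, 11]])

c = np.dstack((a, b))

print(c.shape)                        # (3, 2, 2)
print(np.array_equal(c[:, :, 0], a))  # True
print(np.array_equal(c[:, :, 1], b))  # True

# First block of the printed result: row 0 of a paired with row 0 of b.
print(c[0].tolist())                  # [[0, 6], [3, 9]]
```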
You could do the same operation more explicitly using np.concatenate like this:
print(np.concatenate((a[..., None], b[..., None]), axis=2).shape) # (3, 2, 2)
* Importing the entire contents of a module into your global namespace using from numpy import * is considered bad practice for several reasons. The idiomatic way is to import numpy as np.
Let x = dstack([a, b]). Then x[:, :, 0] is identical to a, and x[:, :, 1] is identical to b. In general, when dstacking 2D arrays, dstack produces an output such that output[:, :, n] is identical to the nth input array (counting from zero).
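The general rule can be sketched with more than two inputs. The list layers and its fill values are made up for illustration.

```python
import numpy as np

# Three hypothetical 2-D inputs of shape (3, 2), each with a constant value.
layers = [np.full((3, 2), fill) for fill in (10, 20, 30)]

out = np.dstack(layers)
print(out.shape)  # (3, 2, 3)

# Slicing index n on the third axis gives back the nth input.
matches = [np.array_equal(out[:, :, n], layer) for n, layer in enumerate(layers)]
print(matches)    # [True, True, True]
```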
If we stack 3D arrays rather than 2D:
x = numpy.zeros([2, 2, 3])
y = numpy.ones([2, 2, 4])
z = numpy.dstack([x, y])
then z[:, :, :3] would be identical to x, and z[:, :, 3:7] would be identical to y.
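A quick way to confirm this, written as a self-contained sketch (it uses the np alias instead of the full numpy name):

```python
import numpy as np

x = np.zeros([2, 2, 3])
y = np.ones([2, 2, 4])
z = np.dstack([x, y])

# The third axis has length 3 + 4; the other axes are unchanged.
print(z.shape)                           # (2, 2, 7)
print(np.array_equal(z[:, :, :3], x))    # True
print(np.array_equal(z[:, :, 3:7], y))   # True
```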
As you can see, we have to take slices along the third axis to recover the inputs to dstack. That's why dstack behaves the way it does.
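The documentation quoted in the question mentions that dstack rebuilds arrays divided by dsplit. As a sketch of the reverse direction, np.dsplit can recover the inputs, provided you know where along the third axis to cut. The names x_back and y_back are made up for illustration.

```python
import numpy as np

x = np.zeros([2, 2, 3])
y = np.ones([2, 2, 4])
z = np.dstack([x, y])

# Split at index 3 on the third axis: z[:, :, :3] and z[:, :, 3:].
x_back, y_back = np.dsplit(z, [3])

print(x_back.shape, y_back.shape)  # (2, 2, 3) (2, 2, 4)
print(np.array_equal(x_back, x))   # True
print(np.array_equal(y_back, y))   # True
```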