What is the best way to convert a tensor from NHWC format to NCHW format, and vice versa?
Is there an op specifically that does this, or will I need to use some combination of the split/concat type operations?
Data format: NHWC (batch N, height H, width W, channels C) is the TensorFlow default, and NCHW is the optimal format to use with NVIDIA cuDNN. If TensorFlow is compiled with the Intel MKL optimizations, many operations are optimized and support NCHW. Otherwise, some operations are not supported on CPU when using NCHW.
NCHW stands for: batch N, channels C, height H, width W (a depth letter D appears only in 3-D variants such as NCDHW). It describes the order in which a multidimensional array is laid out in memory, which can itself be viewed as a 1-D array.
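To make the "1-D array" view concrete, here is a minimal sketch of where one element lands in a flat buffer under each layout. It assumes dense row-major storage with no padding or custom strides; the helper names offset_nchw and offset_nhwc are made up for illustration and are not part of any library.

```python
def offset_nchw(n, c, h, w, C, H, W):
    """Flat index of element (n, c, h, w) in a dense NCHW buffer."""
    return ((n * C + c) * H + h) * W + w


def offset_nhwc(n, h, w, c, H, W, C):
    """Flat index of element (n, h, w, c) in a dense NHWC buffer."""
    return ((n * H + h) * W + w) * C + c


# A tiny image batch: 1 image, 3 channels, 2 x 2 pixels.
C, H, W = 3, 2, 2

# The same logical element (image 0, channel 1, row 0, column 1)
# sits at a different flat position in each layout.
pos_nchw = offset_nchw(0, 1, 0, 1, C, H, W)
pos_nhwc = offset_nhwc(0, 0, 1, 1, H, W, C)
print(pos_nchw, pos_nhwc)  # 5 4
```

In NCHW the width index varies fastest, so each channel is stored as a contiguous plane; in NHWC the channel index varies fastest, so all channels of one pixel sit next to each other.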
All you need to do is permute the dimensions from NHWC to NCHW (or the reverse).
The meaning of each letter might help: N is the batch size, H the height, W the width, and C the number of channels.
From NHWC to NCHW
The image shape is (N, H, W, C) and we want the output to have shape (N, C, H, W). Therefore we need to apply tf.transpose with a well-chosen permutation perm.
The returned tensor's dimension i will correspond to the input dimension perm[i]:
perm[0] = 0 # output dimension 0 will be 'N', which was dimension 0 in the input
perm[1] = 3 # output dimension 1 will be 'C', which was dimension 3 in the input
perm[2] = 1 # output dimension 2 will be 'H', which was dimension 1 in the input
perm[3] = 2 # output dimension 3 will be 'W', which was dimension 2 in the input
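The rule "output dimension i comes from input dimension perm[i]" can be checked on shapes alone, without TensorFlow. The following is only an illustrative sketch; permute_shape is a hypothetical helper, not a TensorFlow function.

```python
def permute_shape(shape, perm):
    """Return the shape produced by a transpose with permutation perm.

    Output dimension i takes the size of input dimension perm[i].
    """
    if sorted(perm) != list(range(len(shape))):
        raise ValueError("perm must be a permutation of the axis indices")
    return tuple(shape[p] for p in perm)


nhwc_shape = (8, 200, 300, 3)  # (N, H, W, C)
nchw_shape = permute_shape(nhwc_shape, [0, 3, 1, 2])
print(nchw_shape)  # (8, 3, 200, 300), i.e. (N, C, H, W)
```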
In practice:
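import tensorflow as tf  # TensorFlow 1.x API; in 2.x, tf.placeholder lives under tf.compat.v1 (graph mode)
images_nhwc = tf.placeholder(tf.float32, [None, 200, 300, 3]) # input batch
out = tf.transpose(images_nhwc, [0, 3, 1, 2])
print(out.get_shape()) # the shape of out is [None, 3, 200, 300]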
From NCHW to NHWC
The image shape is (N, C, H, W) and we want the output to have shape (N, H, W, C). Therefore we need to apply tf.transpose with a well-chosen permutation perm.
The returned tensor's dimension i will correspond to the input dimension perm[i]:
perm[0] = 0 # output dimension 0 will be 'N', which was dimension 0 in the input
perm[1] = 2 # output dimension 1 will be 'H', which was dimension 2 in the input
perm[2] = 3 # output dimension 2 will be 'W', which was dimension 3 in the input
perm[3] = 1 # output dimension 3 will be 'C', which was dimension 1 in the input
In practice:
images_nchw = tf.placeholder(tf.float32, [None, 3, 200, 300]) # input batch
out = tf.transpose(images_nchw, [0, 2, 3, 1])
print(out.get_shape()) # the shape of out is [None, 200, 300, 3]
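The two permutations used in this answer are inverses of each other, so either one can be derived from the other. A small sketch in plain Python, where invert_perm is a hypothetical helper name:

```python
def invert_perm(perm):
    """Return the permutation that undoes perm."""
    inverse = [0] * len(perm)
    for out_axis, in_axis in enumerate(perm):
        # perm sends input axis in_axis to output axis out_axis,
        # so the inverse must send out_axis back to in_axis.
        inverse[in_axis] = out_axis
    return inverse


NHWC_TO_NCHW = [0, 3, 1, 2]
NCHW_TO_NHWC = invert_perm(NHWC_TO_NCHW)
print(NCHW_TO_NHWC)  # [0, 2, 3, 1]
```

Transposing with one permutation and then the other restores the original axis order.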