What is the output tensor of a Max Pooling 2D layer in TensorFlow?

Tags:

python

tensorflow

max-pooling

I was trying to understand some TensorFlow basics and got stuck while reading the documentation for the max pooling 2D layer: https://www.tensorflow.org/tutorials/layers#pooling_layer_1

This is how max_pooling2d is specified:

pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2)

where conv1 is a tensor with shape [batch_size, image_width, image_height, channels]; concretely, in this case it's [batch_size, 28, 28, 32].

So our input is a tensor with shape: [batch_size, 28, 28, 32].

My understanding of a max pooling 2D layer is that it applies a filter of size pool_size (2x2 in this case) and moves the sliding window by stride (also 2x2). This means that both the width and height of the image are halved, i.e. we end up with 14x14 pixels per channel (32 channels in total), so the output is a tensor with shape [batch_size, 14, 14, 32].
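
The shape arithmetic described above can be sketched in plain Python. This is only a minimal sketch: it assumes the default padding='valid' and that the two spatial dimensions sit in positions 1 and 2 of the shape. The helper name pooled_shape is hypothetical and is not part of TensorFlow.

```python
def pooled_shape(input_shape, pool_size, strides):
    """Return the output shape of 2D max pooling with 'valid' padding.

    input_shape is (batch, dim1, dim2, channels). Pooling only touches
    the two spatial dimensions, so batch and channels pass through.
    """
    batch, dim1, dim2, channels = input_shape
    # Number of window positions that fit fully inside each spatial dimension.
    out1 = (dim1 - pool_size[0]) // strides[0] + 1
    out2 = (dim2 - pool_size[1]) // strides[1] + 1
    return (batch, out1, out2, channels)


print(pooled_shape((None, 28, 28, 32), (2, 2), (2, 2)))  # (None, 14, 14, 32)
print(pooled_shape((None, 14, 14, 64), (2, 2), (2, 2)))  # (None, 7, 7, 64)
```

Under these assumptions the channel count never enters the calculation, which is why I would expect 32 rather than 1 in the last position.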

However, according to the above link, the shape of the output tensor is [batch_size, 14, 14, 1]:

Our output tensor produced by max_pooling2d() (pool1) has a shape of 
[batch_size, 14, 14, 1]: the 2x2 filter reduces width and height by 50%.

What am I missing here?

How was 32 converted to 1?

They apply the same logic later here: https://www.tensorflow.org/tutorials/layers#convolutional_layer_2_and_pooling_layer_2

but this time it's correct, i.e. [batch_size, 14, 14, 64] becomes [batch_size, 7, 7, 64] (number of channels is the same).

asked Apr 17 '17 14:04

Nikola Stojiljkovic

2 Answers

Yes, a 2x2 max pool with strides of 2 halves the width and height, and the output depth is unchanged. Here is my test code based on your snippet; the output shape is (?, 14, 14, 32), so perhaps the documentation is wrong?

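#!/usr/bin/env python

import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# Loading MNIST is not needed for the shape check; it only prints the "Extracting" lines.
mnist = input_data.read_data_sets('./MNIST_data/', one_hot=True)

# Placeholder standing in for the conv1 output: [batch_size, 28, 28, 32]
conv1 = tf.placeholder(tf.float32, [None, 28, 28, 32])
pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2)
print(pool1.get_shape())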

The output is:

Extracting ./MNIST_data/train-images-idx3-ubyte.gz
Extracting ./MNIST_data/train-labels-idx1-ubyte.gz
Extracting ./MNIST_data/t10k-images-idx3-ubyte.gz
Extracting ./MNIST_data/t10k-labels-idx1-ubyte.gz
(?, 14, 14, 32)
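
To see why the depth stays the same, it may help to write the pooling out by hand. The following is a plain-Python sketch of 2x2 max pooling with stride 2 on a single small image. It is not how TensorFlow implements the layer, and max_pool_2x2 is a hypothetical helper introduced only for illustration.

```python
def max_pool_2x2(image):
    """Apply 2x2 max pooling with stride 2 to one image.

    image is a nested list indexed as image[row][col][channel].
    Each channel is pooled independently of the others.
    """
    height, width, channels = len(image), len(image[0]), len(image[0][0])
    pooled = []
    for r in range(0, height - 1, 2):
        row = []
        for c in range(0, width - 1, 2):
            # Take the maximum over the 2x2 window, separately for each channel.
            row.append([
                max(image[r + dr][c + dc][ch] for dr in (0, 1) for dc in (0, 1))
                for ch in range(channels)
            ])
        pooled.append(row)
    return pooled


# A 4x4 image with 3 channels; channel ch holds (row * 4 + col) * (ch + 1).
image = [[[(r * 4 + c) * (ch + 1) for ch in range(3)] for c in range(4)]
         for r in range(4)]
pooled = max_pool_2x2(image)
print(len(pooled), len(pooled[0]), len(pooled[0][0]))  # 2 2 3
print(pooled[0][0])  # [5, 10, 15]
```

The window only slides over rows and columns, so the 4x4x3 input becomes 2x2x3: the spatial size is halved and the number of channels is untouched.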

answered Oct 13 '22 00:10

大宝剑

Nikola, it has been corrected as you thought.

Documentation fixes for TF Layers tutorial (see #8301)
Feedback on "A Guide to TF Layers: Building a Convolutional Neural Network" tutorial #8301

I came across this thread while learning the concepts of convolution and pooling. Thank you for your question, which led me to this informative documentation.

answered Oct 13 '22 01:10

Tora
