
What is tf.nn.max_pool's ksize parameter used for?


In the definition of tf.nn.max_pool, what is ksize used for?

tf.nn.max_pool(value, ksize, strides, padding, data_format='NHWC', name=None)

Performs the max pooling on the input.

Args:
  value: A 4-D Tensor with shape [batch, height, width, channels] and type tf.float32.
  ksize: A list of ints that has length >= 4. The size of the window for each dimension of the input tensor.

For instance, if the input value is a tensor of shape [1, 64, 64, 3] and ksize=3, what does that mean?

user288609 asked Jul 26 '16 23:07



1 Answer

The documentation states:

ksize: A list of ints that has length >= 4. The size of the window for each dimension of the input tensor.

For images, the input generally has shape [batch_size, height, width, channels], for example [batch_size, 64, 64, 3] for RGB images of 64x64 pixels.

The kernel size ksize gives the window size along each of those four dimensions. It will typically be [1, 2, 2, 1] if you have a 2x2 window over which you take the maximum; for a 3x3 window, as in the question, it would be [1, 3, 3, 1]. On the batch and channels dimensions, ksize is 1 because we don't want to take the maximum over multiple examples or over multiple channels.
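To make the window semantics concrete, here is a minimal pure-Python sketch of what a ksize of [1, 2, 2, 1] does to an NHWC input. It is not TensorFlow's implementation: the helper name max_pool_nhwc is hypothetical, it works on nested lists rather than tensors, and it assumes 'VALID' padding only.

```python
def max_pool_nhwc(value, ksize, strides):
    """Naive max pooling over a nested-list NHWC input ('VALID' padding only).

    Illustrative helper, not part of TensorFlow.
    """
    kb, kh, kw, kc = ksize      # window size per dimension
    sb, sh, sw, sc = strides    # step size per dimension
    batch = len(value)
    height = len(value[0])
    width = len(value[0][0])
    channels = len(value[0][0][0])

    out = []
    for b in range(0, batch - kb + 1, sb):
        rows = []
        for h in range(0, height - kh + 1, sh):
            cols = []
            for w in range(0, width - kw + 1, sw):
                chans = []
                for c in range(0, channels - kc + 1, sc):
                    # Maximum over a kb x kh x kw x kc window. With
                    # kb == kc == 1, examples and channels are never mixed.
                    chans.append(max(
                        value[b + db][h + dh][w + dw][c + dc]
                        for db in range(kb)
                        for dh in range(kh)
                        for dw in range(kw)
                        for dc in range(kc)
                    ))
                cols.append(chans)
            rows.append(cols)
        out.append(rows)
    return out


# A single 4x4 image with one channel, i.e. shape [1, 4, 4, 1].
grid = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
value = [[[[v] for v in row] for row in grid]]

pooled = max_pool_nhwc(value, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1])
print(pooled)  # [[[[6], [8]], [[14], [16]]]], shape [1, 2, 2, 1]
```

Each output value is the maximum of one 2x2 spatial patch, while the batch and channel dimensions pass through unchanged. By the same logic, a [1, 64, 64, 3] input pooled with ksize [1, 2, 2, 1] and strides [1, 2, 2, 1] comes out with shape [1, 32, 32, 3].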

Olivier Moindrot answered Sep 24 '22 13:09
