I have the following parameters defined for doing a max pool over the depth of the image (rgb) for compression before the dense layer and readout...and I am failing with an error that I cannot pool over depth and everything else:
sunset_poolmax_1x1x3_div_2x2x3_params = \
'padding': 'SAME'}
I changed the strides to [1,1,1,3] so that depth is the only dimension reduced by the pool, but it still doesn't work. I can't get good results with the tiny image I have to compress everything to in order to keep the colors.
Actual Error:
ValueError: Current implementation does not support pooling in the batch and depth dimensions.
Max pooling operation for 2D spatial data. Downsamples the input along its spatial dimensions (height and width) by taking the maximum value over an input window (of size defined by pool_size ) for each channel of the input. The window is shifted by strides along each dimension.
Max Pooling is a pooling operation that calculates the maximum value for patches of a feature map, and uses it to create a downsampled (pooled) feature map. It is usually used after a convolutional layer.
2-D average pooling layer: the dimensions that the layer pools over depend on the layer input. For 2-D image input (data with four dimensions corresponding to pixels in two spatial dimensions, the channels, and the observations), the layer pools over the spatial dimensions.
tf.nn.max_pool does not support pooling over the depth dimension which is why you get an error.
You can use a max reduction instead to achieve what you're looking for:
tf.reduce_max(input_tensor, axis=3, keepdims=True)
The keepdims parameter above ensures that the rank of the tensor is preserved, so the result has shape [batch, height, width, 1]. This makes the max reduction consistent with what tf.nn.max_pool would return if it supported pooling over the whole depth dimension. (Older TensorFlow releases spelled these arguments reduction_indices and keep_dims.)
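To make the effect of keeping the reduced axis concrete, here is a minimal NumPy sketch. NumPy's `max` stands in for `tf.reduce_max` here (an assumption made purely for illustration), and the `images` array is random placeholder data, not anything from the question.

```python
import numpy as np

# Placeholder batch: 2 images, 4x4 pixels, 3 channels (NHWC layout).
images = np.random.default_rng(0).random((2, 4, 4, 3))

# Max over the channel axis, keeping that axis with length 1.
pooled_keep = images.max(axis=3, keepdims=True)

# Same reduction without keepdims: the channel axis disappears.
pooled_drop = images.max(axis=3)

print(pooled_keep.shape)  # (2, 4, 4, 1)
print(pooled_drop.shape)  # (2, 4, 4)
```

With the axis kept, the result is still a rank-4 tensor, so it can be fed to layers that expect a [batch, height, width, channels] input.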
TensorFlow now supports depth-wise max pooling with tf.nn.max_pool(). For example, here is how to implement it using pooling kernel size 3, stride 3 and VALID padding:
import tensorflow as tf

# images: a float tensor of shape [batch, height, width, channels]
output = tf.nn.max_pool(images,
                        ksize=(1, 1, 1, 3),
                        strides=(1, 1, 1, 3),
                        padding="VALID")
You can use this in a Keras model by wrapping it in a Lambda layer:
from tensorflow import keras

depth_pool = keras.layers.Lambda(
    lambda X: tf.nn.max_pool(X,
                             ksize=(1, 1, 1, 3),
                             strides=(1, 1, 1, 3),
                             padding="VALID"))

model = keras.models.Sequential([
    ...,  # other layers
    depth_pool,
    ...  # other layers
])
Alternatively, you can write a custom Keras layer:
class DepthMaxPool(keras.layers.Layer):
    def __init__(self, pool_size, strides=None, padding="VALID", **kwargs):
        super().__init__(**kwargs)
        # Default to non-overlapping windows along the channel axis.
        if strides is None:
            strides = pool_size
        self.pool_size = pool_size
        self.strides = strides
        self.padding = padding

    def call(self, inputs):
        # Pool only over the last (channel) dimension.
        return tf.nn.max_pool(inputs,
                              ksize=(1, 1, 1, self.pool_size),
                              strides=(1, 1, 1, self.strides),
                              padding=self.padding)
You can then use it like any other layer:
model = keras.models.Sequential([
    ...,  # other layers
    DepthMaxPool(3),
    ...  # other layers
])
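For intuition about what pooling with a kernel of 3 and a stride of 3 along the channel axis computes, here is a rough NumPy sketch. The helper name `depth_max_pool` is made up for this illustration and is not a TensorFlow or Keras function. It assumes NHWC input, a stride equal to the pool size, and VALID-style behaviour (channels that do not fill a complete group are dropped).

```python
import numpy as np

def depth_max_pool(x, pool_size):
    """Max over non-overlapping groups of `pool_size` channels (hypothetical helper).

    x has shape (batch, height, width, channels).
    """
    n, h, w, c = x.shape
    groups = c // pool_size
    # VALID-style: ignore trailing channels that do not fill a whole group.
    trimmed = x[..., :groups * pool_size]
    # Split the channel axis into (groups, pool_size) and reduce each group.
    return trimmed.reshape(n, h, w, groups, pool_size).max(axis=-1)

# Six channels pooled in groups of three leaves two channels.
x = np.arange(2 * 2 * 2 * 6).reshape(2, 2, 2, 6)
y = depth_max_pool(x, 3)
print(y.shape)     # (2, 2, 2, 2)
print(y[0, 0, 0])  # [2 5]
```

For a 3-channel RGB image and a pool size of 3, the same helper collapses the colour channels into a single channel per pixel.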
Here is a brief example for the original question, using TensorFlow. I tested it on a stock RGB image of size 225 x 225 with 3 channels.
Import the standard libraries and enable eager execution to quickly view results:
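import tensorflow as tf
tf.enable_eager_execution()  # TensorFlow 1.x API; eager mode is the default in 2.x
from scipy.misc import imread  # removed in SciPy 1.2, so this needs an older SciPy
import matplotlib.pyplot as plt
import numpy as np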
Read the image and cast it from uint8 to tf.float32:
x = tf.cast(imread('tiger.jpeg'), tf.float32)
x = tf.reshape(x, shape=[-1, x.shape[0], x.shape[1], x.shape[2]])
input_channels = x.shape[3]
Create the filter for depthwise convolution
filters = tf.contrib.eager.Variable(tf.random_normal(shape=[3, 3, input_channels, 4]))
Perform depthwise convolution with channel multiplier 4. Note that the padding has been kept to 'SAME'. It can be changed at will.
x = tf.nn.depthwise_conv2d(input=x, filter=filters, strides=[1, 1, 1, 1], padding='SAME', name='conv_1')
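To see why a channel multiplier of 4 on a 3-channel image gives 12 output channels, here is a naive NumPy sketch of a depthwise convolution. It is only an illustration under stated assumptions: stride 1, zero padding that keeps the spatial size, NHWC layout, and the helper name `depthwise_conv2d_same` is invented for this example. It is not how TensorFlow implements the op.

```python
import numpy as np

def depthwise_conv2d_same(x, filters):
    """Naive depthwise convolution, stride 1, size-preserving zero padding.

    x:       (batch, height, width, channels)
    filters: (kh, kw, channels, multiplier)
    returns: (batch, height, width, channels * multiplier)
    """
    n, h, w, c = x.shape
    kh, kw, fc, m = filters.shape
    if fc != c:
        raise ValueError("filter channels must match input channels")
    # Split the total padding (k - 1) between the two sides of each axis.
    top, left = (kh - 1) // 2, (kw - 1) // 2
    bottom, right = kh - 1 - top, kw - 1 - left
    xp = np.pad(x, ((0, 0), (top, bottom), (left, right), (0, 0)), mode="constant")
    out = np.zeros((n, h, w, c, m), dtype=np.result_type(x, filters))
    for i in range(kh):
        for j in range(kw):
            patch = xp[:, i:i + h, j:j + w, :]       # (n, h, w, c)
            # Each input channel is filtered independently by its own m kernels.
            out += patch[..., None] * filters[i, j]  # broadcast to (n, h, w, c, m)
    # Output channel index is input_channel * multiplier + kernel_index.
    return out.reshape(n, h, w, c * m)

x = np.ones((1, 5, 5, 3))
filters = np.ones((3, 3, 3, 4))
y = depthwise_conv2d_same(x, filters)
print(y.shape)  # (1, 5, 5, 12)
```

Unlike a regular convolution, the channels are never summed together, which is why the channel count is multiplied rather than replaced.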
Perform the max_pooling2d. Since the output size of the pooling layer is (input_size - pool_size + 2 * padding)/stride + 1 and the padding is 'valid', we should get an output of (225 - 2 + 0)/1 + 1 = 224.
x = tf.layers.max_pooling2d(inputs=x, pool_size=2, strides=1, padding='valid', name='maxpool1')
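The output-size arithmetic can be checked with a few lines of plain Python. The function name `pool_output_size` is made up for this sketch. It assumes the usual TensorFlow conventions: 'valid' uses no padding, and 'same' pads so that the output size is ceil(input_size / stride).

```python
import math

def pool_output_size(input_size, pool_size, stride, padding="valid"):
    """Spatial output size of a pooling layer along one dimension (hypothetical helper)."""
    if padding == "valid":
        # No padding: count the window positions that fit entirely inside the input.
        return (input_size - pool_size) // stride + 1
    if padding == "same":
        # Padding is added so that only the stride shrinks the input.
        return math.ceil(input_size / stride)
    raise ValueError("padding must be 'valid' or 'same'")

print(pool_output_size(225, 2, 1, "valid"))  # 224
print(pool_output_size(225, 2, 1, "same"))   # 225
```

For the 225 x 225 image above with pool_size=2 and strides=1, 'valid' padding gives 224 x 224 feature maps.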
Plot the figures to confirm.
fig, ax = plt.subplots(nrows=4, ncols=3)
q = 0
for ii in range(4):
    for jj in range(3):
        # One subplot per output channel (3 input channels x multiplier 4 = 12).
        ax[ii, jj].imshow(np.squeeze(x[:, :, :, q]))
        q += 1
plt.show()