
Why does conv2d in TensorFlow give an output with the same shape as the input?

According to the Deep Learning course at http://cs231n.github.io/convolutional-networks/#conv, if an input x with shape [W, W] (where W = width = height) goes through a convolutional layer with filter shape [F, F] and stride S, the layer returns an output with shape [(W-F)/S + 1, (W-F)/S + 1].

However, when I follow the TensorFlow tutorial at https://www.tensorflow.org/versions/r0.11/tutorials/mnist/pros/index.html, the function tf.nn.conv2d(inputs, filter, stride) seems to behave differently.

No matter how I change the filter size, conv2d always returns a value with the same shape as the input.

In my case, I am using the MNIST dataset, where every image has size [28, 28] (ignoring channel_num = 1).

But after defining the first layer, conv1, I called conv1.get_shape() to inspect its output, and it gives me [28, 28, num_of_filters].

Why is this? I thought the output shape should follow the formula above.


Appendix: Code snippet

# reshape x from 2d to 4d

x_image = tf.reshape(x, [-1, 28, 28, 1]) # [num_samples, height, width, channel_num]

## define the shapes of the weights and bias
w_shape = [5, 5, 1, 32] # patch_h, patch_w, in_channel, output_num (out_channel)
b_shape =          [32] # bias only needs to be consistent with output_num

## init weights of conv1 layers
W_conv1 = weight_variable(w_shape)
b_conv1 = bias_variable(b_shape)

## first layer: x_image -> conv1/relu -> pool1

# Our convolutions use a stride of one
# and are zero padded
# so that the output is the same size as the input
h_conv1 = tf.nn.relu(
    conv2d(x_image, W_conv1) + b_conv1
)

print('conv1.shape=', h_conv1.get_shape())
## conv1.shape= (?, 28, 28, 32)
## I thought conv1.shape should be (?, 24, 24, 32), since (28-5)/1+1 = 24

h_pool1 = max_pool_2x2(h_conv1) # 32 output channels
print('pool1.shape=', h_pool1.get_shape()) ## pool1.shape= (?, 14, 14, 32)
Cuo Show asked Oct 23 '16 17:10



1 Answer

It depends on the padding parameter. 'SAME' zero-pads the input so that the output stays W x W (assuming stride = 1); 'VALID' applies no padding and shrinks the output to (W-F+1) x (W-F+1).
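To make the difference concrete, here is a minimal sketch in plain Python (no TensorFlow required) that computes the spatial output size along one dimension for both padding modes. The helper name conv_output_size is hypothetical and only for illustration; the formulas are the ones TensorFlow documents for 'SAME' (ceil(W/S)) and 'VALID' (floor((W-F)/S) + 1). It assumes Python 3.

```python
import math


def conv_output_size(w, f, s, padding):
    """Output size along one spatial dimension.

    w: input size, f: filter size, s: stride.
    Hypothetical helper for illustration; not part of TensorFlow.
    """
    if padding == "SAME":
        # Zero padding is added, so only the stride shrinks the output.
        return math.ceil(w / s)
    if padding == "VALID":
        # No padding: the filter must fit entirely inside the input.
        return (w - f) // s + 1
    raise ValueError("padding must be 'SAME' or 'VALID'")


# 5x5 filter, stride 1 on a 28x28 MNIST image
print(conv_output_size(28, 5, 1, "SAME"))   # 28
print(conv_output_size(28, 5, 1, "VALID"))  # 24

# 2x2 max pooling with stride 2 follows the same rule
print(conv_output_size(28, 2, 2, "SAME"))   # 14
```

With 'SAME' the result matches the (?, 28, 28, 32) shape reported in the question, and with 'VALID' it matches the 24 the asker expected from the (W-F)/S + 1 formula. The comment in the question's own snippet ("zero padded so that the output is the same size as the input") suggests the conv2d wrapper there uses 'SAME'.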

MMN answered Sep 24 '22 15:09
