Tensorflow Convolution Neural Network with different sized images

Tags:

I am attempting to create a deep CNN that can classify each individual pixel in an image. I am replicating architecture from the image below taken from this paper. In the paper it is mentioned that deconvolutions are used so that any size of input is possible. This can be seen in the image below.

Github Repository

enter image description here

Currently, I have hard coded my model to accept images of size 32x32x7, but I would like to accept any size of input. What changes would I need to make to my code to accept variable sized input?

 x = tf.placeholder(tf.float32, shape=[None, 32*32*7])
 y_ = tf.placeholder(tf.float32, shape=[None, 32*32*7, 3])
 ...
 DeConnv1 = tf.nn.conv3d_transpose(layer1, filter = w, output_shape = [1,32,32,7,1], strides = [1,2,2,2,1], padding = 'SAME')
 ...
 final = tf.reshape(final, [1, 32*32*7])
 W_final = weight_variable([32*32*7,32*32*7,3])
 b_final = bias_variable([32*32*7,3])
 final_conv = tf.tensordot(final, W_final, axes=[[1], [1]]) + b_final

218

asked Jan 12 '18 16:01

Devin Haslam

1 Answers

Dynamic placeholders

Tensorflow allows to have multiple dynamic (a.k.a. None) dimensions in placeholders. The engine won't be able to ensure correctness while the graph is built, hence the client is responsible for feeding the correct input, but it provides a lot of flexibility.

So I'm going from...

x = tf.placeholder(tf.float32, shape=[None, N*M*P])
y_ = tf.placeholder(tf.float32, shape=[None, N*M*P, 3])
...
x_image = tf.reshape(x, [-1, N, M, P, 1])

to...

# Nearly all dimensions are dynamic
x_image = tf.placeholder(tf.float32, shape=[None, None, None, None, 1])
label = tf.placeholder(tf.float32, shape=[None, None, 3])

Since you intend to reshape the input to 5D anyway, so why don't use 5D in x_image right from the start. At this point, the second dimension of label is arbitrary, but we promise tensorflow that it will match with x_image.

Dynamic shapes in deconvolution

Next, the nice thing about tf.nn.conv3d_transpose is that its output shape can be dynamic. So instead of this:

# Hard-coded output shape
DeConnv1 = tf.nn.conv3d_transpose(layer1, w, output_shape=[1,32,32,7,1], ...)

... you can do this:

# Dynamic output shape
DeConnv1 = tf.nn.conv3d_transpose(layer1, w, output_shape=tf.shape(x_image), ...)

This way the transpose convolution can be applied to any image and the result will take the shape of x_image that was actually passed in at runtime.

Note that static shape of x_image is (?, ?, ?, ?, 1).

All-Convolutional network

Final and most important piece of the puzzle is to make the whole network convolutional, and that includes your final dense layer too. Dense layer must define its dimensions statically, which forces the whole neural network fix input image dimensions.

Luckily for us, Springenberg at al describe a way to replace an FC layer with a CONV layer in "Striving for Simplicity: The All Convolutional Net" paper. I'm going to use a convolution with 3 1x1x1 filters (see also this question):

final_conv = conv3d_s1(final, weight_variable([1, 1, 1, 1, 3]))
y = tf.reshape(final_conv, [-1, 3])

If we ensure that final has the same dimensions as DeConnv1 (and others), it'll make y right the shape we want: [-1, N * M * P, 3].

Combining it all together

Your network is pretty large, but all deconvolutions basically follow the same pattern, so I've simplified my proof-of-concept code to just one deconvolution. The goal is just to show what kind of network is able to handle images of arbitrary size. Final remark: image dimensions can vary between batches, but within one batch they have to be the same.

The full code:

sess = tf.InteractiveSession()

def conv3d_dilation(tempX, tempFilter):
  return tf.layers.conv3d(tempX, filters=tempFilter, kernel_size=[3, 3, 1], strides=1, padding='SAME', dilation_rate=2)

def conv3d(tempX, tempW):
  return tf.nn.conv3d(tempX, tempW, strides=[1, 2, 2, 2, 1], padding='SAME')

def conv3d_s1(tempX, tempW):
  return tf.nn.conv3d(tempX, tempW, strides=[1, 1, 1, 1, 1], padding='SAME')

def weight_variable(shape):
  initial = tf.truncated_normal(shape, stddev=0.1)
  return tf.Variable(initial)

def bias_variable(shape):
  initial = tf.constant(0.1, shape=shape)
  return tf.Variable(initial)

def max_pool_3x3(x):
  return tf.nn.max_pool3d(x, ksize=[1, 3, 3, 3, 1], strides=[1, 2, 2, 2, 1], padding='SAME')

x_image = tf.placeholder(tf.float32, shape=[None, None, None, None, 1])
label = tf.placeholder(tf.float32, shape=[None, None, 3])

W_conv1 = weight_variable([3, 3, 1, 1, 32])
h_conv1 = conv3d(x_image, W_conv1)
# second convolution
W_conv2 = weight_variable([3, 3, 4, 32, 64])
h_conv2 = conv3d_s1(h_conv1, W_conv2)
# third convolution path 1
W_conv3_A = weight_variable([1, 1, 1, 64, 64])
h_conv3_A = conv3d_s1(h_conv2, W_conv3_A)
# third convolution path 2
W_conv3_B = weight_variable([1, 1, 1, 64, 64])
h_conv3_B = conv3d_s1(h_conv2, W_conv3_B)
# fourth convolution path 1
W_conv4_A = weight_variable([3, 3, 1, 64, 96])
h_conv4_A = conv3d_s1(h_conv3_A, W_conv4_A)
# fourth convolution path 2
W_conv4_B = weight_variable([1, 7, 1, 64, 64])
h_conv4_B = conv3d_s1(h_conv3_B, W_conv4_B)
# fifth convolution path 2
W_conv5_B = weight_variable([1, 7, 1, 64, 64])
h_conv5_B = conv3d_s1(h_conv4_B, W_conv5_B)
# sixth convolution path 2
W_conv6_B = weight_variable([3, 3, 1, 64, 96])
h_conv6_B = conv3d_s1(h_conv5_B, W_conv6_B)
# concatenation
layer1 = tf.concat([h_conv4_A, h_conv6_B], 4)
w = tf.Variable(tf.constant(1., shape=[2, 2, 4, 1, 192]))
DeConnv1 = tf.nn.conv3d_transpose(layer1, filter=w, output_shape=tf.shape(x_image), strides=[1, 2, 2, 2, 1], padding='SAME')

final = DeConnv1
final_conv = conv3d_s1(final, weight_variable([1, 1, 1, 1, 3]))
y = tf.reshape(final_conv, [-1, 3])
cross_entropy = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=label, logits=y))

print('x_image:', x_image)
print('DeConnv1:', DeConnv1)
print('final_conv:', final_conv)

def try_image(N, M, P, B=1):
  batch_x = np.random.normal(size=[B, N, M, P, 1])
  batch_y = np.ones([B, N * M * P, 3]) / 3.0

  deconv_val, final_conv_val, loss = sess.run([DeConnv1, final_conv, cross_entropy],
                                              feed_dict={x_image: batch_x, label: batch_y})
  print(deconv_val.shape)
  print(final_conv.shape)
  print(loss)
  print()

tf.global_variables_initializer().run()
try_image(32, 32, 7)
try_image(16, 16, 3)
try_image(16, 16, 3, 2)

196

answered Oct 06 '22 00:10

Maxim

Related questions
                            
                                Reading YAML config file in python and using variables
                            
                                Pandas - Group/bins of data per longitude/latitude
                            
                                Logging not captured on behave steps
                            
                                Why is NotImplemented evaluated multiple times with __eq__ operator
                            
                                Python Spectrum Analysis
                            
                                How to remove index list from another list in python? [duplicate]
                            
                                Storing JSON into database in python
                            
                                QuerySet is not JSON Serializable Django
                            
                                Faster numpy-solution instead of itertools.combinations?
                            
                                Is it possible to run SLURM jobs in the background using SRUN instead of SBATCH?
                            
                                Sort pandas DataFrame with MultiIndex according to column value
                            
                                Where do I put a conda-recipe relative to my project?
                            
                                Should binary features be one-hot encoded?
                            
                                Creating a structured array from a list
                            
                                Using Windows GPS Location Service in a Python Script
                            
                                CNN gives biased results
                            
                                How to replace certain values in Tensorflow tensor with the values of the other tensor?
                            
                                ImportError: No module named utils
                            
                                How to calculate pairwise distance matrix on the GPU
                            
                                boto EMR add step and auto terminate

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Tensorflow Convolution Neural Network with different sized images

Tags:

python

tensorflow

deep-learning

conv-neural-network

deconvolution