 

How to add a regression head after the fully connected layer in a convolutional network using TensorFlow?

I am new to deep learning and TensorFlow and have to learn this topic for a project I am currently working on. I am using a convolutional network to detect and find the location of a single object in an image, following the method introduced in the Stanford CS231n class. The lecturer mentioned connecting a regression head after the fully connected layer in the network to find the location of the object. I know there is a DNNRegressor in TensorFlow. Should I use it as the regression head?

Earlier, I modified TensorFlow's tutorial on using a ConvNet to recognize handwritten digits for my case. I am not sure how to add the regression head to that program so that it can also find a bounding box for the object.

I only started with machine learning and deep learning this week, so apologies if this is a really silly question, but I really need to find a solution to my problem. Thank you very much.

DakoDako asked Jul 07 '17

People also ask

Can we implement a fully connected layer using a convolutional layer?

Yes, you can replace a fully connected layer in a convolutional neural network with convolutional layers and even get exactly the same behavior or outputs.
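As a rough illustration, here is a minimal sketch of that equivalence using the tf.keras API (a different, newer API than the slim code later in this thread; the 7x7x64 feature map and the layer sizes are made up for the example): a dense layer on the flattened features and a convolution whose kernel spans the whole feature map compute the same thing once they share weights.

import numpy as np
import tensorflow as tf

# Hypothetical feature map from the last conv/pool block: 7x7 spatial, 64 channels.
features = np.random.rand(1, 7, 7, 64).astype("float32")

dense = tf.keras.layers.Dense(10)
conv = tf.keras.layers.Conv2D(10, kernel_size=(7, 7), padding="valid")

dense_out = dense(tf.reshape(features, (1, -1)))   # builds dense, shape (1, 10)
_ = conv(features)                                 # builds conv so its weights exist

# Copy the dense weights into the conv kernel (same values, conv layout).
w, b = dense.get_weights()                         # w: (7*7*64, 10), b: (10,)
conv.set_weights([w.reshape(7, 7, 64, 10), b])

conv_out = conv(features)                          # shape (1, 1, 1, 10)
print(np.allclose(dense_out.numpy(),
                  tf.reshape(conv_out, (1, -1)).numpy(), atol=1e-4))  # True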

How do you do regression on CNN?

Implementing a CNN for regression prediction is as simple as: removing the fully-connected softmax classifier layer typically used for classification, and replacing it with a fully-connected layer with a single node and a linear activation function.
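For concreteness, a minimal sketch of that idea using the tf.keras API (the input shape and layer sizes are made up for the example): a small CNN whose classifier head has been swapped for a single linear-output node.

import tensorflow as tf

# Small CNN whose head is one linear node, so it predicts a continuous value
# instead of class probabilities.
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(64, 64, 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    # Regression head: one node, linear activation (no softmax).
    tf.keras.layers.Dense(1, activation="linear"),
])
model.compile(optimizer="adam", loss="mse")  # L2-style regression loss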

Why do we need to use an FCN instead of a normal CNN?

Because a fully convolutional network (FCN) preserves the spatial information of the original image, it can analyze the information in a scene (for example, a traffic scene) better than a CNN with fully connected layers.


2 Answers

First of all, in order to train a neural network for an object localization task, you have to have a data set with localized objects. This answers your question about whether you can work with the MNIST data set or not: MNIST contains just a class label for each image, so you need another data set. Justin also talks about popular data sets at around 37:34.

Object localization works by learning to output 4 values per image instead of a class distribution. This four-valued vector is compared to the ground-truth four-valued vector, and the loss function is usually the L1 or L2 norm of their difference. So in code, the regression head is an ordinary regression layer, and its loss can be implemented in TensorFlow with a simple tf.reduce_mean call.
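As a rough illustration, here is a minimal sketch in TF 1.x style; fc is a hypothetical stand-in for the output of your last fully connected layer, and the ground-truth boxes are assumed to be [x, y, w, h] per image.

import tensorflow as tf

# Hypothetical placeholders: FC features and ground-truth boxes.
fc = tf.placeholder(tf.float32, shape=[None, 1024])
bbox_true = tf.placeholder(tf.float32, shape=[None, 4])

# Regression head: a plain linear layer producing 4 values per image.
bbox_pred = tf.layers.dense(fc, units=4, activation=None)

# L2 loss between predicted and ground-truth boxes; use tf.abs for an L1 loss.
loss = tf.reduce_mean(tf.square(bbox_pred - bbox_true))
train_op = tf.train.AdamOptimizer(learning_rate=1e-3).minimize(loss)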

A small yet complete example that performs object localization can be found here. I also recommend taking a look at this question.

Maxim answered Sep 18 '22

I was looking into this problem as well, and I found the following part in the documentation.

Dense (fully connected) layers, which perform classification on the features extracted by the convolutional layers and downsampled by the pooling layers. In a dense layer, every node in the layer is connected to every node in the preceding layer.

Based on this quote, it seems you can only do classification, not regression.

EDIT: After some research, I found a way to use a fully connected layer for regression in TensorFlow.

import tensorflow as tf
import tensorflow.contrib.slim as slim

# Build your convolutional network and keep its output in `net`.

# As the last step, add a fully connected layer with a single node and
# no activation function (i.e. a linear output) as the regression head.
y_prime = slim.fully_connected(net, 1, activation_fn=None)

loss = tf.reduce_mean(tf.square(y_prime - y))  # L2 loss against the target y
lr = tf.placeholder(tf.float32)
opt = tf.train.AdamOptimizer(learning_rate=lr).minimize(loss)

You can add more fully connected layers, with more nodes, before this last one.
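For example, a sketch of the same head with a couple of extra hidden layers stacked before the single-node output (the layer sizes here are made up for the example):

# Extra fully connected layers before the regression output (slim API).
net = slim.fully_connected(net, 256, activation_fn=tf.nn.relu)
net = slim.fully_connected(net, 64, activation_fn=tf.nn.relu)
y_prime = slim.fully_connected(net, 1, activation_fn=None)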

smttsp answered Sep 20 '22