
CNN Image Recognition with Regression Output on Tensorflow

I want to predict an estimated wait time from images using a CNN. I assume this means the network should produce a regression-type output trained with an RMSE loss, which is what I am using right now, but it is not working properly.

Can someone point me to examples that use a CNN on images to output a scalar/regression value (instead of a class label), similar to wait time, so that I can adapt their techniques? I haven't been able to find a suitable example.

All of the CNN examples I have found are for the MNIST data or for distinguishing between cats and dogs, and they output a class, not a scalar such as wait time.

Can someone give me an example, using TensorFlow, of a CNN that produces a scalar or regression output from an image?

Thanks so much! I am honestly super stuck and making no progress, and I have been working on this same problem for over two weeks.

Ic3MaN911 asked Aug 06 '17 03:08


People also ask

Can CNNs be used for regression?

Convolutional neural networks (CNNs, or ConvNets) are essential tools for deep learning, and are especially suited for analyzing image data. For example, you can use CNNs to classify images. To predict continuous data, such as angles and distances, you can include a regression layer at the end of the network.

Can TensorFlow be used for regression?

TensorFlow 2.0 now uses Keras API as its default library for training classification and regression models.

Can ResNet be used for regression?

If by a ResNet architecture you mean a neural network with skip connections then yes, it can be used for any structured regression problem. If you mean the specific type of CNN that is used for image classification then no. That network is built with 2D convolution layers which require their input to be 2D as well.


1 Answer

Check out the Udacity self-driving-car models, which take an input image from a dash cam and predict a steering angle (i.e. a continuous scalar) to stay on the road. They usually use a regression output after one or more fully connected layers on top of the CNN layers.

https://github.com/udacity/self-driving-car/tree/master/steering-models/community-models

Here is a typical model:

https://github.com/udacity/self-driving-car/tree/master/steering-models/community-models/autumn

It uses tf.atan() to produce the final output y; you can also use tf.tanh(), or just a linear output.
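To make the trade-off between those output options concrete, here is a small plain-Python sketch (no TensorFlow needed) of what each one does to a raw pre-activation value `z`. The function names and the `low`/`high`/`scale` parameters are made up for illustration and are not taken from the Udacity models. The point is only that atan and tanh bound the prediction, while a linear output does not.

```python
import math


def linear_head(z):
    # No activation: the prediction can be any real number.
    return z


def tanh_head(z, low, high):
    # tanh squashes z into (-1, 1); rescale that into (low, high).
    # `low` and `high` are hypothetical bounds on the target.
    return low + (high - low) * (math.tanh(z) + 1.0) / 2.0


def atan_head(z, scale=1.0):
    # atan squashes z into (-pi/2, pi/2); `scale` is a hypothetical multiplier.
    return scale * math.atan(z)


if __name__ == "__main__":
    for z in (-5.0, 0.0, 5.0):
        print(z, linear_head(z), tanh_head(z, 0.0, 60.0), atan_head(z))
```

For something like wait time, a bounded output only makes sense if you know a sensible upper limit in advance; otherwise the plain linear output is the simpler choice.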

Use MSE for your loss function.
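Since the question mentions RMSE, here is how the two relate, as a minimal plain-Python sketch with made-up numbers. RMSE is just the square root of MSE, so both are minimized by the same predictions. MSE has the simpler gradient (the RMSE gradient divides by the RMSE itself, which is a problem when the error reaches zero), and RMSE is still handy for reporting because it is in the same units as the target.

```python
import math


def mse(y_true, y_pred):
    # Mean of the squared differences.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    # Square root of MSE, in the same units as the target.
    return math.sqrt(mse(y_true, y_pred))


if __name__ == "__main__":
    # Hypothetical wait times in minutes and some predictions for them.
    wait_true = [10.0, 20.0, 30.0]
    wait_pred = [12.0, 18.0, 33.0]
    print("MSE: ", mse(wait_true, wait_pred))
    print("RMSE:", rmse(wait_true, wait_pred))
```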

Here is another example in Keras:

from tensorflow.keras import layers, models, optimizers

# Small CNN that maps a 32x128 RGB image to one continuous value.
model = models.Sequential()
model.add(layers.Input(shape=(32, 128, 3)))
model.add(layers.Conv2D(16, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D(pool_size=(2, 2)))
model.add(layers.Conv2D(32, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D(pool_size=(2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D(pool_size=(2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(500, activation='relu'))
model.add(layers.Dropout(.5))
model.add(layers.Dense(100, activation='relu'))
model.add(layers.Dropout(.25))
model.add(layers.Dense(20, activation='relu'))
# Single unit with no activation: a linear output for regression.
model.add(layers.Dense(1))
model.compile(optimizer=optimizers.Adam(learning_rate=1e-04), loss='mean_squared_error')

The key difference from the MNIST examples is that instead of funneling down to an N-dim vector of logits fed into softmax with cross-entropy loss, for your regression output you take it down to a 1-dim vector with MSE loss. (You can also have a mix of multiple classification and regression outputs in the final layer, as in YOLO object detection.)
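To spell out that difference, here is a plain-Python sketch of the two kinds of final layer and their losses, with no framework involved. All names and numbers are illustrative only. The classification head turns N logits into probabilities and scores the true class; the regression head emits one number and scores its distance from the target.

```python
import math


def softmax(logits):
    # Subtract the max for numerical stability before exponentiating.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, true_class):
    # Classification: negative log-probability assigned to the true class.
    return -math.log(softmax(logits)[true_class])


def squared_error(prediction, target):
    # Regression: squared distance between one scalar output and the target.
    return (prediction - target) ** 2


if __name__ == "__main__":
    # MNIST-style head: 10 logits, true digit is 3 (made-up values).
    logits = [0.1, 0.2, 0.1, 2.5, 0.0, 0.3, 0.1, 0.2, 0.1, 0.4]
    print("cross-entropy:", cross_entropy(logits, 3))
    # Regression head: one output, e.g. predicted vs. actual wait in minutes.
    print("squared error:", squared_error(12.0, 10.0))
```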

j314erre answered Sep 30 '22 19:09