I'm using TensorFlow to implement a simple multi-layer perceptron for regression. The code is modified from the standard MNIST classifier; I only changed the output cost to MSE (using tf.reduce_mean(tf.square(pred - y))) and some input/output size settings. However, when I train the network, after several epochs the outputs for the whole batch become identical. For example:
target: 48.129, estimated: 42.634
target: 46.590, estimated: 42.634
target: 34.209, estimated: 42.634
target: 69.677, estimated: 42.634
......
I have tried different batch sizes, different initializations, and input normalization using sklearn.preprocessing.scale (my input ranges differ quite a lot), but none of them worked. I have also tried one of the scikit-learn-style examples from TensorFlow (Deep Neural Network Regression with Boston Data), but I got another error at line 40:

'module' object has no attribute 'infer_real_valued_columns_from_input'
Does anyone have a clue where the problem is? Thank you.
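For reference, this is roughly the normalization step I tried (a minimal sketch; it uses sklearn's own Boston loader rather than the TensorFlow one in my code below):

from sklearn import datasets, preprocessing

# Load the Boston housing data and standardize each input feature
# to zero mean and unit variance, since the raw feature ranges differ widely.
boston = datasets.load_boston()
x_scaled = preprocessing.scale(boston.data)  # per-column standardization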
My code is listed below. It may be a little bit long, but it is very straightforward:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf
from tensorflow.contrib import learn
import matplotlib.pyplot as plt

from sklearn.pipeline import Pipeline
from sklearn import datasets, linear_model
from sklearn import cross_validation
import numpy as np

boston = learn.datasets.load_dataset('boston')
x, y = boston.data, boston.target
X_train, X_test, Y_train, Y_test = cross_validation.train_test_split(
    x, y, test_size=0.2, random_state=42)
total_len = X_train.shape[0]

# Parameters
learning_rate = 0.001
training_epochs = 500
batch_size = 10
display_step = 1
dropout_rate = 0.9

# Network Parameters
n_hidden_1 = 32   # 1st layer number of features
n_hidden_2 = 200  # 2nd layer number of features
n_hidden_3 = 200
n_hidden_4 = 256
n_input = X_train.shape[1]
n_classes = 1

# tf Graph input
x = tf.placeholder("float", [None, 13])
y = tf.placeholder("float", [None])

# Create model
def multilayer_perceptron(x, weights, biases):
    # Hidden layer with RELU activation
    layer_1 = tf.add(tf.matmul(x, weights['h1']), biases['b1'])
    layer_1 = tf.nn.relu(layer_1)
    # Hidden layer with RELU activation
    layer_2 = tf.add(tf.matmul(layer_1, weights['h2']), biases['b2'])
    layer_2 = tf.nn.relu(layer_2)
    # Hidden layer with RELU activation
    layer_3 = tf.add(tf.matmul(layer_2, weights['h3']), biases['b3'])
    layer_3 = tf.nn.relu(layer_3)
    # Hidden layer with RELU activation
    layer_4 = tf.add(tf.matmul(layer_3, weights['h4']), biases['b4'])
    layer_4 = tf.nn.relu(layer_4)
    # Output layer with linear activation
    out_layer = tf.matmul(layer_4, weights['out']) + biases['out']
    return out_layer

# Store layers weight & bias
weights = {
    'h1': tf.Variable(tf.random_normal([n_input, n_hidden_1], 0, 0.1)),
    'h2': tf.Variable(tf.random_normal([n_hidden_1, n_hidden_2], 0, 0.1)),
    'h3': tf.Variable(tf.random_normal([n_hidden_2, n_hidden_3], 0, 0.1)),
    'h4': tf.Variable(tf.random_normal([n_hidden_3, n_hidden_4], 0, 0.1)),
    'out': tf.Variable(tf.random_normal([n_hidden_4, n_classes], 0, 0.1))
}
biases = {
    'b1': tf.Variable(tf.random_normal([n_hidden_1], 0, 0.1)),
    'b2': tf.Variable(tf.random_normal([n_hidden_2], 0, 0.1)),
    'b3': tf.Variable(tf.random_normal([n_hidden_3], 0, 0.1)),
    'b4': tf.Variable(tf.random_normal([n_hidden_4], 0, 0.1)),
    'out': tf.Variable(tf.random_normal([n_classes], 0, 0.1))
}

# Construct model
pred = multilayer_perceptron(x, weights, biases)

# Define loss and optimizer
cost = tf.reduce_mean(tf.square(pred - y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

# Launch the graph
with tf.Session() as sess:
    sess.run(tf.initialize_all_variables())

    # Training cycle
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(total_len / batch_size)
        # Loop over all batches
        for i in range(total_batch - 1):
            batch_x = X_train[i * batch_size:(i + 1) * batch_size]
            batch_y = Y_train[i * batch_size:(i + 1) * batch_size]
            # Run optimization op (backprop) and cost op (to get loss value)
            _, c, p = sess.run([optimizer, cost, pred],
                               feed_dict={x: batch_x, y: batch_y})
            # Compute average loss
            avg_cost += c / total_batch

        # sample prediction
        label_value = batch_y
        estimate = p
        err = label_value - estimate
        print("num batch:", total_batch)

        # Display logs per epoch step
        if epoch % display_step == 0:
            print("Epoch:", '%04d' % (epoch + 1), "cost=",
                  "{:.9f}".format(avg_cost))
            print("[*]----------------------------")
            for i in xrange(3):
                print("label value:", label_value[i],
                      "estimated value:", estimate[i])
            print("[*]============================")

    print("Optimization Finished!")

    # Test model
    correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))
    # Calculate accuracy
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    print("Accuracy:", accuracy.eval({x: X_test, y: Y_test}))
Short answer: transpose your pred vector using tf.transpose(pred).
Longer answer: the problem is that pred (the predictions) and y (the labels) are not of the same shape: one is a row vector and the other a column vector. When you apply an element-wise operation to them, broadcasting expands the result into a matrix, which is not what you want.

The solution is to transpose the prediction vector using tf.transpose() to get a proper vector and thus a proper loss function. In fact, if you set the batch size to 1 in your example, you'll see that it works even without the fix, because transposing a 1x1 vector is a no-op.
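To make the shape issue concrete, here is a minimal sketch in the same TF 1.x style as your code (the tf.reshape alternative is my addition; either form makes the shapes agree):

import tensorflow as tf

pred = tf.placeholder("float", [None, 1])  # network output: a column vector
y = tf.placeholder("float", [None])        # labels: a flat vector

bad_diff = pred - y                 # broadcasts to [batch, batch] -- a matrix
good_diff = tf.transpose(pred) - y  # [1, batch]: one error per sample
# Equivalent alternative: tf.reshape(pred, [-1]) - y, which has shape [batch].

with tf.Session() as sess:
    p = [[1.0], [2.0], [3.0]]
    t = [1.5, 2.5, 3.5]
    print(sess.run(tf.shape(bad_diff), {pred: p, y: t}))   # [3 3]
    print(sess.run(tf.shape(good_diff), {pred: p, y: t}))  # [1 3]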
I applied this fix to your example code and observed the following behaviour. Before the fix:
Epoch: 0245 cost= 84.743440580
[*]----------------------------
label value: 23 estimated value: [ 27.47437096]
label value: 50 estimated value: [ 24.71126747]
label value: 22 estimated value: [ 23.87785912]
And after the fix at the same point in time:
Epoch: 0245 cost= 4.181439120
[*]----------------------------
label value: 23 estimated value: [ 21.64333534]
label value: 50 estimated value: [ 48.76105118]
label value: 22 estimated value: [ 24.27996063]
You'll see that the cost is much lower and that the network actually learned the value 50 properly. Of course, you'll have to do some fine-tuning of the learning rate and the like to improve your results further.
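For reference, the fix amounts to changing a single line in your code (this is just the change described above, written out):

# Fix: make pred and y the same shape before the element-wise subtraction,
# so the squared error is computed per sample instead of per pair.
cost = tf.reduce_mean(tf.square(tf.transpose(pred) - y))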