I am developing a TensorFlow network based on the MNIST for beginners template. Basically, I am trying to implement a simple logistic regression in which 10 continuous variables predict a binary outcome, so my inputs are 10 values between 0 and 1, and my target variable (Y_train and Y_test in the code) is a 1 or 0.
My main problem is that there is no change in accuracy no matter how many training steps I run -- it stays at 0.276667 whether I run 100 or 31240 steps. Additionally, when I switch from the softmax to a plain matmul to generate my Y values, I get 0.0 accuracy, which suggests there may be something wrong with my x*W + b calculation. The inputs themselves read in just fine.
What I'm wondering is (a) whether I'm calculating the Y values incorrectly because of an error in my code, and (b) if that's not the case, whether I need to implement one-hot vectors -- even though my output already takes the form of 0 or 1. If the latter is the case, where do I include one_hot=True when generating the target-values vector? Thanks!
import numpy as np
import tensorflow as tf

# Training data: first 10 columns are the inputs, column 11 is the 0/1 target.
train_data = np.genfromtxt("TRAINDATA2.txt", delimiter=" ")
train_input = train_data[:, :10]
train_input = train_input.reshape(31240, 10)
X_train = tf.placeholder(tf.float32, [31240, 10])

train_target = train_data[:, 10]
train_target = train_target.reshape(31240, 1)
Y_train = tf.placeholder(tf.float32, [31240, 1])

# Test data in the same layout.
test_data = np.genfromtxt("TESTDATA2.txt", delimiter=" ")
test_input = test_data[:, :10]
test_input = test_input.reshape(7800, 10)
X_test = tf.placeholder(tf.float32, [7800, 10])

test_target = test_data[:, 10]
test_target = test_target.reshape(7800, 1)
Y_test = tf.placeholder(tf.float32, [7800, 1])

# Model parameters and predictions.
W = tf.Variable(tf.zeros([10, 1]))
b = tf.Variable(tf.zeros([1]))
Y_obt = tf.nn.softmax(tf.matmul(X_train, W) + b)
Y_obt_test = tf.nn.softmax(tf.matmul(X_test, W) + b)

# Loss and training step.
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=Y_obt,
                                                        labels=Y_train)
train_step = tf.train.GradientDescentOptimizer(0.05).minimize(cross_entropy)

sess = tf.InteractiveSession()
tf.global_variables_initializer().run()

for _ in range(31240):
    sess.run(train_step, feed_dict={X_train: train_input,
                                    Y_train: train_target})

# Evaluate on the test set.
correct_prediction = tf.equal(tf.round(Y_obt_test), Y_test)
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
print(sess.run(accuracy, feed_dict={X_test: test_input,
                                    Y_test: test_target}))
Since you map your inputs to a target with only one element, you should not use softmax cross entropy: the softmax operation transforms its input into a probability distribution whose entries sum to 1. Because your output has only a single element, it will simply be 1 every time, since that is the only way to turn a single value into a probability distribution.
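You can check this with a quick standalone snippet (not part of your code, just an illustration): softmax over a single-element row is always exactly 1, no matter what the logit is.

import tensorflow as tf

with tf.Session() as sess:
    # Each row holds a single logit, so each row's softmax is [1.0].
    print(sess.run(tf.nn.softmax([[-3.0], [0.5], [10.0]])))  # -> [[1.], [1.], [1.]]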
You should instead use tf.nn.sigmoid_cross_entropy_with_logits() (which is used for binary classification), remove the softmax from Y_obt, and replace the softmax in Y_obt_test with tf.sigmoid().
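A minimal sketch of that change, reusing the placeholders and variables from your code (with a helper name logits for the unscaled outputs), could look roughly like this:

# Keep the raw logits; sigmoid_cross_entropy_with_logits applies the sigmoid internally.
logits = tf.matmul(X_train, W) + b
cross_entropy = tf.nn.sigmoid_cross_entropy_with_logits(labels=Y_train,
                                                        logits=logits)
train_step = tf.train.GradientDescentOptimizer(0.05).minimize(cross_entropy)

# At test time, apply the sigmoid explicitly and threshold at 0.5 via rounding.
Y_obt_test = tf.sigmoid(tf.matmul(X_test, W) + b)
correct_prediction = tf.equal(tf.round(Y_obt_test), Y_test)
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))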
Another way is to one-hot encode your targets and use a network with a two-element output. In this case you should use tf.nn.softmax_cross_entropy_with_logits(), but still remove the tf.nn.softmax() from Y_obt, since softmax cross entropy expects unscaled logits (https://www.tensorflow.org/api_docs/python/tf/nn/softmax_cross_entropy_with_logits). For Y_obt_test, of course, you keep the softmax in this case.
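A rough sketch of that variant (assuming you convert the 0/1 column with tf.one_hot and give W and b two output units) might be:

# Two output units, one per class.
W = tf.Variable(tf.zeros([10, 2]))
b = tf.Variable(tf.zeros([2]))
logits = tf.matmul(X_train, W) + b

# Turn the 0/1 column into two-element one-hot targets.
labels = tf.one_hot(tf.cast(tf.squeeze(Y_train, axis=1), tf.int32), depth=2)

# Softmax cross entropy expects the unscaled logits.
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=logits,
                                                        labels=labels)
train_step = tf.train.GradientDescentOptimizer(0.05).minimize(cross_entropy)

# Keep the softmax for the test predictions and compare predicted class indices.
Y_obt_test = tf.nn.softmax(tf.matmul(X_test, W) + b)
predicted_class = tf.argmax(Y_obt_test, axis=1)
correct_prediction = tf.equal(predicted_class,
                              tf.cast(tf.squeeze(Y_test, axis=1), tf.int64))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))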
Another thing: it might also help to take the mean of the cross entropies, i.e. cross_entropy = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(...)).
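For the sigmoid variant sketched above, for instance:

# Reduce the per-example losses to a single scalar before minimizing.
cross_entropy = tf.reduce_mean(
    tf.nn.sigmoid_cross_entropy_with_logits(labels=Y_train, logits=logits))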