
TensorFlow model gets loss 0

import tensorflow as tf
import numpy as np

def weight(shape):
    return tf.Variable(tf.truncated_normal(shape, stddev=0.1))

def bias(shape):
    return tf.Variable(tf.constant(0.1, shape=shape))

def output(input, w, b):
    return tf.matmul(input, w) + b

x_columns = 33
y_columns = 1
layer1_num = 7
layer2_num = 7
epoch_num = 10
train_num = 1000
batch_size = 100
display_size = 1

x = tf.placeholder(tf.float32, [None, x_columns])
y = tf.placeholder(tf.float32, [None, y_columns])

# Two hidden ReLU layers followed by a linear output layer
layer1 = tf.nn.relu(output(x, weight([x_columns, layer1_num]), bias([layer1_num])))
layer2 = tf.nn.relu(output(layer1, weight([layer1_num, layer2_num]), bias([layer2_num])))
prediction = output(layer2, weight([layer2_num, y_columns]), bias([y_columns]))

loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=prediction))
train_step = tf.train.AdamOptimizer().minimize(loss)

sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
for epoch in range(epoch_num):
    avg_loss = 0.
    for i in range(train_num):
        # x_train / y_train are loaded elsewhere (not shown)
        index = np.random.choice(len(x_train), batch_size)
        x_train_batch = x_train[index]
        y_train_batch = y_train[index]
        _, c = sess.run([train_step, loss],
                        feed_dict={x: x_train_batch, y: y_train_batch})
        avg_loss += c / train_num
    if epoch % display_size == 0:
        print("Epoch:{0},Loss:{1}".format(epoch + 1, avg_loss))
print("Training Finished")

My model prints:

Epoch:2,Loss:0.0
Epoch:3,Loss:0.0
Epoch:4,Loss:0.0
Epoch:5,Loss:0.0
Epoch:6,Loss:0.0
Epoch:7,Loss:0.0
Epoch:8,Loss:0.0
Epoch:9,Loss:0.0
Epoch:10,Loss:0.0
Training Finished

How can I deal with this problem?

asked May 04 '17 by yoshi


1 Answer

softmax_cross_entropy_with_logits expects labels in one-hot form, i.e. with shape [batch_size, num_classes]. Here you have y_columns = 1, which means there is only one class. That single class is necessarily always both the predicted one and the 'ground truth' (from your network's point of view), so the output is 'correct' no matter what the weights are. Hence loss = 0.
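
You can check this directly: with a single logit per example, softmax always assigns probability 1 to that lone class, so the cross entropy is exactly zero whatever the logit's value. A minimal sketch with made-up toy values:

import tensorflow as tf

# Shape [batch_size, 1]: softmax over a single class always yields
# probability 1.0, so the cross entropy is 0 for any logit value.
logits = tf.constant([[5.0], [-3.0], [0.7]])   # arbitrary toy values
labels = tf.constant([[1.0], [1.0], [1.0]])    # the only possible one-hot label
xent = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)
with tf.Session() as sess:
    print(sess.run(xent))  # -> [0. 0. 0.]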

My guess is that you do have several classes and y_train contains the ID of the label for each example. In that case prediction should have shape [batch_size, num_classes], and instead of softmax_cross_entropy_with_logits you should use tf.nn.sparse_softmax_cross_entropy_with_logits, which takes integer class IDs rather than one-hot vectors.
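
A minimal sketch of that change, reusing the question's output/weight/bias helpers; num_classes is an assumption you would replace with your real number of classes:

num_classes = 10  # assumption: set this to your actual number of classes
y = tf.placeholder(tf.int64, [None])  # integer class IDs, not a float column
prediction = output(layer2, weight([layer2_num, num_classes]), bias([num_classes]))
loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=prediction))
train_step = tf.train.AdamOptimizer().minimize(loss)

y_train would then be fed as a 1-D array of integer IDs (shape [batch_size]) rather than as a [batch_size, 1] float column.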

answered Sep 28 '22 by gdelab