
Why does TensorFlow return [[nan nan]] instead of probabilities from a CSV file?

Here is the code I am using. I'm trying to get a 1, a 0, or ideally a probability as the result on a real test set. When I just split up the training set and run the model on part of it, I get a ~93% accuracy rate, but when I train the program and run it on the actual test set (the one without the 1's and 0's filling column 1) it returns nothing but nan's.

import tensorflow as tf
import numpy as np
from numpy import genfromtxt
import sklearn

# Convert to one hot
def convertOneHot(data):
    y=np.array([int(i[0]) for i in data])
    y_onehot=[0]*len(y)
    for i,j in enumerate(y):
        y_onehot[i]=[0]*(y.max() + 1)
        y_onehot[i][j]=1
    return (y,y_onehot)
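# (Added sketch, not part of the original question) the same one-hot
# conversion can be written with NumPy fancy indexing; the name
# convert_one_hot_np is hypothetical, for illustration only:
def convert_one_hot_np(data):
    y = data[:, 0].astype(int)                # labels from column 0
    onehot = np.zeros((len(y), y.max() + 1))  # one row per sample
    onehot[np.arange(len(y)), y] = 1          # a single 1 at each label index
    return y, onehot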


data = genfromtxt('cs-training.csv',delimiter=',')  # Training data
test_data = genfromtxt('cs-test-actual.csv',delimiter=',')  # Actual test data

#This part is to get rid of the nan's at the start of the actual test data
g = 0
for i in test_data:
    i[0] = 1
    test_data[g] = i
    g += 1
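# (Added note) the loop above is equivalent to the single slice
# assignment test_data[:, 0] = 1, and it only repairs column 0; a quick
# sanity check for NaNs left anywhere else in the features would be:
# assert not np.isnan(test_data[:, 1:]).any(), "NaNs remain in the features"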

x_train=np.array([ i[1::] for i in data])
y_train,y_train_onehot = convertOneHot(data)

x_test=np.array([ i[1::] for i in test_data])
y_test,y_test_onehot = convertOneHot(test_data)
A=data.shape[1]-1 # Number of features, Note first is y
B=len(y_train_onehot[0])
tf_in = tf.placeholder("float", [None, A]) # Features
tf_weight = tf.Variable(tf.zeros([A,B]))
tf_bias = tf.Variable(tf.zeros([B]))
tf_softmax = tf.nn.softmax(tf.matmul(tf_in,tf_weight) + tf_bias)
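# (Added comment) each row of tf_softmax is exp(logits) normalized to sum
# to 1, i.e. a logits row [z0, z1] becomes
# [exp(z0), exp(z1)] / (exp(z0) + exp(z1)); any nan or inf in the logits
# makes the whole output row nan.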

# Training via backpropagation
tf_softmax_correct = tf.placeholder("float", [None,B])
tf_cross_entropy = -tf.reduce_sum(tf_softmax_correct*tf.log(tf_softmax))

# Train using tf.train.GradientDescentOptimizer
tf_train_step = tf.train.GradientDescentOptimizer(0.01).minimize(tf_cross_entropy)
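# (Added note) with unscaled features (ages in the tens, raw counts) and a
# cross entropy summed -- not averaged -- over the whole training set, a
# 0.01 step can overshoot and blow the weights up; standardizing the
# columns of x_train/x_test is a common precaution.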

# Add accuracy checking nodes
tf_correct_prediction = tf.equal(tf.argmax(tf_softmax,1), tf.argmax(tf_softmax_correct,1))
tf_accuracy = tf.reduce_mean(tf.cast(tf_correct_prediction, "float"))
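# (Added comment) tf.argmax picks the highest-probability class per row;
# tf.equal yields a bool vector, which is cast to float so that the mean
# is the fraction of correct predictions.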

saver = tf.train.Saver([tf_weight,tf_bias])

# Initialize and run
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)

print("...")
# Run the training
for i in range(1):
    sess.run(tf_train_step, feed_dict={tf_in: x_train, tf_softmax_correct: y_train_onehot})
    #print y_train_onehot
    saver.save(sess, 'trained_csv_model')

    ans = sess.run(tf_softmax, feed_dict={tf_in: x_test})
    print ans

    # Print accuracy
    #result = sess.run(tf_accuracy, feed_dict={tf_in: x_test, tf_softmax_correct: y_test_onehot})
    #print result

When I print ans I get the following.

[[ nan  nan]
 [ nan  nan]
 [ nan  nan]
 ..., 
 [ nan  nan]
 [ nan  nan]
 [ nan  nan]]

I don't know what I'm doing wrong here. All I want is for ans to yield a 1, a 0, or better yet an array of probabilities where every element of the array has length 2.

I don't expect that many people will be able to answer this question, but please try at the very least. I've been stuck here waiting for a stroke of genius that hasn't come in 2 days now, so I figured I would ask. Thank you!

The test_data comes out looking like this:

[[  1.00000000e+00   8.85519080e-01   4.30000000e+01 ...,   0.00000000e+00
    0.00000000e+00   0.00000000e+00]
 [  1.00000000e+00   4.63295269e-01   5.70000000e+01 ...,   4.00000000e+00
    0.00000000e+00   2.00000000e+00]
 [  1.00000000e+00   4.32750360e-02   5.90000000e+01 ...,   1.00000000e+00
    0.00000000e+00   2.00000000e+00]
 ...,
 [  1.00000000e+00   8.15963730e-02   7.00000000e+01 ...,   0.00000000e+00
    0.00000000e+00              nan]
 [  1.00000000e+00   3.35456547e-01   5.60000000e+01 ...,   2.00000000e+00
    1.00000000e+00   3.00000000e+00]
 [  1.00000000e+00   4.41841663e-01   2.90000000e+01 ...,   0.00000000e+00
    0.00000000e+00   0.00000000e+00]]

The only reason the first entry in each row is equal to 1 is that I overwrote the nan's that filled that position in order to avoid errors. Note that everything after the first column is a feature; the first column is what I'm trying to predict.
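One quick check worth running here (this snippet is my addition, not from the original post): notice the nan still visible in the last feature column of the printout above. Any NaN left in the features propagates through the matmul and turns the whole softmax row into nan, so it helps to count them first.

import numpy as np

# Count NaNs per feature column; any nonzero count is enough to
# produce nan rows in the softmax output.
print(np.isnan(x_test).sum(axis=0))

# One simple (hypothetical) repair: replace each NaN with its column mean.
col_mean = np.nanmean(x_test, axis=0)
rows, cols = np.where(np.isnan(x_test))
x_test[rows, cols] = col_mean[cols]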

EDIT:

I changed the code to the following:

import tensorflow as tf
import numpy as np
from numpy import genfromtxt
import sklearn
from sklearn.cross_validation import train_test_split
from tensorflow import Print

# Convert to one hot
def convertOneHot(data):
    y=np.array([int(i[0]) for i in data])
    y_onehot=[0]*len(y)
    for i,j in enumerate(y):
        y_onehot[i]=[0]*(y.max() + 1)
        y_onehot[i][j]=1
    return (y,y_onehot)


#buildDataFromIris()


data = genfromtxt('cs-training.csv',delimiter=',')  # Training data
test_data = genfromtxt('cs-test-actual.csv',delimiter=',')  # Test data

#for i in test_data[0]:
#    print i
#print test_data

g = 0
for i in test_data:
    i[0] = 1.
    test_data[g] = i
    g += 1

#print 1, test_data

x_train=np.array([ i[1::] for i in data])
y_train,y_train_onehot = convertOneHot(data)
#print len(x_train), len(y_train), len(y_train_onehot)

x_test=np.array([ i[1::] for i in test_data])
y_test,y_test_onehot = convertOneHot(test_data)
#for u in y_test_onehot[0]:
#    print u
#print y_test_onehot
#print len(x_test), len(y_test), len(y_test_onehot)
#print x_test[0]

#print '1'

#  A number of features, 4 in this example
#  B = 3 species of Iris (setosa, virginica and versicolor)
A=data.shape[1]-1 # Number of features, Note first is y
#print A
B=len(y_train_onehot[0])
#print B
#print y_train_onehot
tf_in = tf.placeholder("float", [None, A]) # Features
tf_weight = tf.Variable(tf.zeros([A,B]))
tf_bias = tf.Variable(tf.zeros([B]))
tf_softmax = tf.nn.softmax(tf.matmul(tf_in,tf_weight) + tf_bias)

tf_bias = tf.Print(tf_bias, [tf_bias], "Bias: ")
tf_weight = tf.Print(tf_weight, [tf_weight], "Weight: ")
tf_in = tf.Print(tf_in, [tf_in], "TF_in: ")
matmul_result = tf.matmul(tf_in, tf_weight)
matmul_result = tf.Print(matmul_result, [matmul_result], "Matmul: ")
tf_softmax = tf.nn.softmax(matmul_result + tf_bias)
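# (Added note) tf.Print is an identity op with a printing side effect:
# its message only appears when the *returned* tensor is evaluated
# inside a sess.run call, and it goes to the console/stderr the process
# was started from, not to the Python print output below.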
print tf_bias
print tf_weight
print tf_in
print matmul_result

# Training via backpropagation
tf_softmax_correct = tf.placeholder("float", [None,B])
tf_cross_entropy = -tf.reduce_sum(tf_softmax_correct*tf.log(tf_softmax))

print tf_softmax_correct

# Train using tf.train.GradientDescentOptimizer
tf_train_step = tf.train.GradientDescentOptimizer(0.01).minimize(tf_cross_entropy)

# Add accuracy checking nodes
tf_correct_prediction = tf.equal(tf.argmax(tf_softmax,1), tf.argmax(tf_softmax_correct,1))
tf_accuracy = tf.reduce_mean(tf.cast(tf_correct_prediction, "float"))

print tf_correct_prediction
print tf_accuracy

#saver = tf.train.Saver([tf_weight,tf_bias])

# Initialize and run
init = tf.initialize_all_variables()
sess = tf.Session()
sess.run(init)

print("...")
prediction = []
# Run the training
#probabilities = []
#print y_train_onehot
#print '-----------------------------------------'
for i in range(1):
    sess.run(tf_train_step, feed_dict={tf_in: x_train, tf_softmax_correct: y_train_onehot})
    #print y_train_onehot
    #saver.save(sess, 'trained_csv_model')

    ans = sess.run(tf_softmax, feed_dict={tf_in: x_test})
    print ans

After the printout I see that one of the tensors is Boolean. I don't know if that is the issue, but take a look at the following and see if there is any way you can help.

Tensor("Print_16:0", shape=TensorShape([Dimension(2)]), dtype=float32)
Tensor("Print_17:0", shape=TensorShape([Dimension(10), Dimension(2)]), dtype=float32)
Tensor("Print_18:0", shape=TensorShape([Dimension(None), Dimension(10)]), dtype=float32)
Tensor("Print_19:0", shape=TensorShape([Dimension(None), Dimension(2)]), dtype=float32)
Tensor("Placeholder_9:0", shape=TensorShape([Dimension(None), Dimension(2)]), dtype=float32)
Tensor("Equal_4:0", shape=TensorShape([Dimension(None)]), dtype=bool)
Tensor("Mean_4:0", shape=TensorShape([]), dtype=float32)
...
[[ nan  nan]
 [ nan  nan]
 [ nan  nan]
 ..., 
 [ nan  nan]
 [ nan  nan]
 [ nan  nan]]
Asked Nov 25 '15 by Ravaal


1 Answer

tf_cross_entropy = -tf.reduce_sum(tf_softmax_correct*tf.log(tf_softmax))

This was also my problem on a project I was testing. Specifically, it ended up being 0*log(0), which produces nan.
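You can reproduce that arithmetic directly in NumPy (my illustration, not part of the original answer):

import numpy as np

with np.errstate(divide='ignore', invalid='ignore'):
    print(np.log(0.0))        # -inf
    print(0.0 * np.log(0.0))  # nan -- 0 * -inf is undefined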

If you replace this with:

tf_cross_entropy = -tf.reduce_sum(tf_softmax_correct*tf.log(tf_softmax + 1e-50))

it should avoid the problem.
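An equivalent guard (again just a sketch of the same idea) clips the probabilities before taking the log instead of shifting them:

tf_cross_entropy = -tf.reduce_sum(
    tf_softmax_correct * tf.log(tf.clip_by_value(tf_softmax, 1e-10, 1.0)))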

I've also used reduce_mean rather than reduce_sum. If you double the batch size and use reduce_sum, the cost (and the magnitude of the gradient) doubles with it; with reduce_mean the cost stays comparable when you vary the batch size, which also makes the values easier to read when using tf.Print (which prints to the console TensorFlow was started from).

Specifically this is what I'm using now when debugging:

cross_entropy = -tf.reduce_sum(y*tf.log(model + 1e-50))  ## avoid nan due to 0*log(0)
cross_entropy = tf.Print(cross_entropy, [cross_entropy], "cost")  # print to the console tensorflow was started from
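Put together in the question's variable names, a sketch of both changes (the 1e-50 shift plus averaging over the batch) would look like this; reduction_indices is the pre-1.0 name for what later became axis:

# Shift away 0*log(0) and average over the batch so the cost does not
# scale with batch size:
tf_cross_entropy = -tf.reduce_mean(
    tf.reduce_sum(tf_softmax_correct * tf.log(tf_softmax + 1e-50),
                  reduction_indices=1))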

Answered Oct 02 '22 by neuron