I'm following a tutorial and can walk through the code, which trains a neural network and evaluates its accuracy.
But I don't know how to use the trained model on a new single input (string) to predicts its label.
Can you advise how this might be done?
Tutorial:
https://medium.freecodecamp.org/big-picture-machine-learning-classifying-text-with-neural-networks-and-tensorflow-d94036ac2274
Session Code:
# Launch the graph
with tf.Session() as sess:
sess.run(init)
# Training cycle
for epoch in range(training_epochs):
avg_cost = 0.
total_batch = int(len(newsgroups_train.data)/batch_size)
# Loop over all batches
for i in range(total_batch):
batch_x,batch_y = get_batch(newsgroups_train,i,batch_size)
# Run optimization op (backprop) and cost op (to get loss value)
c,_ = sess.run([loss,optimizer], feed_dict={input_tensor: batch_x,output_tensor:batch_y})
# Compute average loss
avg_cost += c / total_batch
# Display logs per epoch step
if epoch % display_step == 0:
print("Epoch:", '%04d' % (epoch+1), "loss=", \
"{:.9f}".format(avg_cost))
print("Optimization Finished!")
# Test model
correct_prediction = tf.equal(tf.argmax(prediction, 1), tf.argmax(output_tensor, 1))
# Calculate accuracy
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
total_test_data = len(newsgroups_test.target)
batch_x_test,batch_y_test = get_batch(newsgroups_test,0,total_test_data)
print("Accuracy:", accuracy.eval({input_tensor: batch_x_test, output_tensor: batch_y_test}))
I have some experience with Python but basically no experience in Tensorflow.
An MLP is a typical example of a feedforward artificial neural network. In this figure, the ith activation unit in the lth layer is denoted as ai(l). The number of layers and the number of neurons are referred to as hyperparameters of a neural network, and these need tuning.
Each layer is represented as y = f(WxT + b). Where f is the activation function (covered below), W is the set of parameter, or weights, in the layer, x is the input vector, which can also be the output of the previous layer, and b is the bias vector.
To be able to use our textual data with an RNN, we need to transform it into numeric values. We then will create a sequence of characters as our X data and use the following character as our Y value. And lastly, we will transform our data into an array of booleans.
Tensorflow uses declarative style of programing. You need to declare what you want it to do, and only afterwards invoke it's run
or eval
functions.
1) if you want to do some interactive tinkering with your model, you need to have Session
handler open. Replace first lines with:
# Launch the graph
sess = tf.Session()
with sess.as_default():
.......
Original code closes the session, and you can not continue to use the trained model anymore.
Do not forget to call sess.close()
when you do not need it to release resources allocated to TF.
2) Now you have to convert the text you want to classify into numerical tensor representation. In original code it is done with get_batch()
. Follow the same pattern.
3) Declare the result. Your model is associated with variable prediction
.
4) Invoke TF. So the final code looks like:
texts = ['''By '8 grey level images' you mean 8 items of 1bit images?
It does work(!), but it doesn't work if you have more than 1bit
in your screen and if the screen intensity is non-linear.''',
'''Wanted: Shareware graphics display program for DOS.
Distribution: usa\nOrganization: University of Notre Dame, Notre Dame
Lines: 16 I need a graphics display program that can take as a parameter the name of
the file to be displayed, then just display that image and then quit.
All of the other graphics display programs come up with a menu first or some other silliness.
This program is going to be run from within another program. '''
]
# convert texts to tensors
batch = []
for text in texts:
vector = np.zeros(total_words,dtype=float)
for word in text.split(' '):
if word in word2index:
vector[word2index[word.lower()]] += 1
batch.append(vector)
x_in = np.array(batch)
# declare new Graph node variable
category = tf.argmax(prediction,1) # choose by maximum score
# run TF
with sess.as_default():
print("scores:", prediction.eval({input_tensor: x_in}))
print('class:', category.eval({input_tensor: x_in}))
Out[]:
scores: [[-785.557 -781.1719 105.238686]
[ 554.584 -532.36383 263.20908 ]]
class: [2 0]
First we need to convert the text to array:
def text_to_vector(text):
layer = np.zeros(total_words,dtype=float)
for word in text.split(' '):
layer[word2index[word.lower()]] += 1
return layer
# Convert text to vector so we can send it to our model
vector_txt = text_to_vector(text)
# Wrap vector like we do in get_batches()
input_array = np.array([vector_txt])
We can save and load models for reuse. We first create a Saver object and then save the session (after the model is trained):
saver = tf.train.Saver()
... train the model ...
save_path = saver.save(sess, "/tmp/model.ckpt")
In the example model the last "step" in the model architecture (i.e. the last thing done inside the multilayer_perceptron
method) is:
'out': tf.Variable(tf.random_normal([n_classes]))
So to get a prediction we get the index of the maximum value of this array (the predicted class):
saver = tf.train.Saver()
with tf.Session() as sess:
saver.restore(sess, "/tmp/model.ckpt")
print("Model restored.")
classification = sess.run(tf.argmax(prediction, 1), feed_dict={input_tensor: input_array})
print("Predicted category:", classification)
You can check the whole code here: https://github.com/dmesquita/understanding_tensorflow_nn
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With