Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Predict label of text with multi-layered perceptron model in Tensorflow

I'm following a tutorial and can walk through the code, which trains a neural network and evaluates its accuracy.

But I don't know how to use the trained model on a new single input (string) to predicts its label.

Can you advise how this might be done?

Tutorial:

https://medium.freecodecamp.org/big-picture-machine-learning-classifying-text-with-neural-networks-and-tensorflow-d94036ac2274

Session Code:

# Launch the graph
with tf.Session() as sess:
    sess.run(init)

    # Training cycle
    for epoch in range(training_epochs):
        avg_cost = 0.
        total_batch = int(len(newsgroups_train.data)/batch_size)
        # Loop over all batches
        for i in range(total_batch):
            batch_x,batch_y = get_batch(newsgroups_train,i,batch_size)
            # Run optimization op (backprop) and cost op (to get loss value)
            c,_ = sess.run([loss,optimizer], feed_dict={input_tensor: batch_x,output_tensor:batch_y})
            # Compute average loss
            avg_cost += c / total_batch
        # Display logs per epoch step
        if epoch % display_step == 0:
            print("Epoch:", '%04d' % (epoch+1), "loss=", \
                "{:.9f}".format(avg_cost))
    print("Optimization Finished!")

    # Test model
    correct_prediction = tf.equal(tf.argmax(prediction, 1), tf.argmax(output_tensor, 1))
    # Calculate accuracy
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    total_test_data = len(newsgroups_test.target)
    batch_x_test,batch_y_test = get_batch(newsgroups_test,0,total_test_data)
    print("Accuracy:", accuracy.eval({input_tensor: batch_x_test, output_tensor: batch_y_test}))

I have some experience with Python but basically no experience in Tensorflow.

like image 971
tim_xyz Avatar asked May 25 '18 18:05

tim_xyz


People also ask

What is Multilayer Perceptron example?

An MLP is a typical example of a feedforward artificial neural network. In this figure, the ith activation unit in the lth layer is denoted as ai(l). The number of layers and the number of neurons are referred to as hyperparameters of a neural network, and these need tuning.

How do you calculate Multilayer Perceptron?

Each layer is represented as y = f(WxT + b). Where f is the activation function (covered below), W is the set of parameter, or weights, in the layer, x is the input vector, which can also be the output of the previous layer, and b is the bias vector.

How is the RNN used to generate a new text string?

To be able to use our textual data with an RNN, we need to transform it into numeric values. We then will create a sequence of characters as our X data and use the following character as our Y value. And lastly, we will transform our data into an array of booleans.


2 Answers

Tensorflow uses declarative style of programing. You need to declare what you want it to do, and only afterwards invoke it's run or eval functions.

1) if you want to do some interactive tinkering with your model, you need to have Session handler open. Replace first lines with:

# Launch the graph
sess = tf.Session()
with sess.as_default():
    .......

Original code closes the session, and you can not continue to use the trained model anymore. Do not forget to call sess.close() when you do not need it to release resources allocated to TF.

2) Now you have to convert the text you want to classify into numerical tensor representation. In original code it is done with get_batch(). Follow the same pattern.

3) Declare the result. Your model is associated with variable prediction.

4) Invoke TF. So the final code looks like:

texts = ['''By '8 grey level images' you mean 8 items of 1bit images?
It does work(!), but it doesn't work if you have more than 1bit
in your screen and if the screen intensity is non-linear.''',

'''Wanted: Shareware graphics display program for DOS.
Distribution: usa\nOrganization: University of Notre Dame, Notre Dame
Lines: 16 I need a graphics display program that can take as a parameter the name of
the file to be displayed, then just display that image and then quit.
All of the other graphics display programs come up with a menu first or some other silliness.
This program is going to be run from within another program.  '''       
        ]
# convert texts to tensors
batch = []
for text in texts:
    vector = np.zeros(total_words,dtype=float)
    for word in text.split(' '):
        if word in word2index:
            vector[word2index[word.lower()]] += 1
    batch.append(vector)

x_in = np.array(batch)

# declare new Graph node variable
category = tf.argmax(prediction,1) # choose by maximum score

# run TF
with sess.as_default():
    print("scores:", prediction.eval({input_tensor: x_in}))
    print('class:', category.eval({input_tensor: x_in}))


Out[]:
scores: [[-785.557    -781.1719    105.238686]
         [ 554.584    -532.36383   263.20908 ]]
class: [2 0] 
like image 130
igrinis Avatar answered Nov 14 '22 17:11

igrinis


First we need to convert the text to array:

def text_to_vector(text):
    layer = np.zeros(total_words,dtype=float)
    for word in text.split(' '):
        layer[word2index[word.lower()]] += 1

    return layer

# Convert text to vector so we can send it to our model
vector_txt = text_to_vector(text)
# Wrap vector like we do in get_batches()
input_array = np.array([vector_txt])

We can save and load models for reuse. We first create a Saver object and then save the session (after the model is trained):

saver = tf.train.Saver()
... train the model ...
save_path = saver.save(sess, "/tmp/model.ckpt")

In the example model the last "step" in the model architecture (i.e. the last thing done inside the multilayer_perceptron method) is:

'out': tf.Variable(tf.random_normal([n_classes]))

So to get a prediction we get the index of the maximum value of this array (the predicted class):

saver = tf.train.Saver()

with tf.Session() as sess:
    saver.restore(sess, "/tmp/model.ckpt")
    print("Model restored.")

    classification = sess.run(tf.argmax(prediction, 1), feed_dict={input_tensor: input_array})
    print("Predicted category:", classification)

You can check the whole code here: https://github.com/dmesquita/understanding_tensorflow_nn

like image 45
dehq Avatar answered Nov 14 '22 18:11

dehq