
TensorFlow: Why is my code running slower and slower?

I am new to TensorFlow. The following code runs successfully, without any error. In the first 10 lines of output, the computation is fast, and the output (produced by the print in the last line) flies by line after line. However, as the iterations go on, the computation becomes slower and slower, and finally becomes intolerable. So I wonder whether there are any modifications that can speed this up.

Here is a brief description of the code: it applies a single-hidden-layer neural network to the dataset. It aims to find the best values for rate[0] and rate[1], parameters that affect the loss function. During each step of training, one tuple is fed to the model, and the accuracy on that tuple is immediately evaluated (this kind of data arrives as a stream in the real world).

import tensorflow as tf
import numpy as np

n_hidden=50
n_input=37
n_output=2
data_raw=np.genfromtxt(r'data.csv',delimiter=",",dtype=None)
data_info=np.genfromtxt(r'data2.csv',delimiter=",",dtype=None)

def pre_process(row):
    # One-hot encode the categorical fields of a raw tuple and
    # concatenate them into a single 37-dimensional feature vector.
    ans = []
    temp = [0 for i in range(24)]
    temp[int(row[0])] = 1
    ans.extend(temp)
    temp = [0 for i in range(7)]
    temp[int(row[1]) - 1] = 1
    ans.extend(temp)
    temp = [0 for i in range(3)]
    temp[int(row[3])] = 1
    ans.extend(temp)
    temp = [0 for i in range(2)]
    temp[int(row[4])] = 1
    ans.extend(temp)
    ans.extend([int(row[5])])
    return np.array(ans)

x=tf.placeholder(tf.float32, shape=[1,n_input])
y_=tf.placeholder(tf.float32,shape=[n_output])
y_r=tf.placeholder(tf.float32,shape=[n_output])
W1=tf.Variable(tf.random_uniform([n_input, n_hidden]))
b1=tf.Variable(tf.zeros([n_hidden]))
W2=tf.Variable(tf.zeros([n_hidden,n_output]))
b2=tf.Variable(tf.zeros([n_output]))

logits_1 = tf.matmul(x, W1) + b1
relu_layer= tf.nn.relu(logits_1)
logits_2 = tf.matmul(relu_layer, W2) + b2

correct_prediction = tf.equal(tf.argmax(logits_2,1), tf.argmax(y_,0))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

rate=[0,0]
for i in range(-100,200,10):
    rate[0]=i
    for j in range(-100,i,10):
        rate[1]=j
        loss=tf.reduce_sum(tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=logits_2)*[rate[0],rate[1]])
#       loss2=tf.reduce_sum(tf.nn.softmax_cross_entropy_with_logits(labels=y_r, logits=logits_2)*[rate[2],rate[3]])
#       loss=loss1+loss2
        train_step = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
        data_line=1

        accur=0
        local_local=0
        remote_remote=0
        local_remote=0
        remote_local=0
        total=0
        with tf.Session() as sess:
            sess.run(tf.global_variables_initializer())
            for i in range(200):
#               print(int(data_raw[data_line][0]),data_info[i][0])
                if i>100:
                    total+=1
                if int(data_raw[data_line][0])==data_info[i][0]:
                    sess.run(train_step,feed_dict={x:pre_process(data_info[i]).reshape(1,-1),y_:[1,0],y_r:[0,1]})
#                   print(sess.run(logits_2,{x:pre_process(data_info[i]).reshape(1,-1), y_: [1,0]}))
                    data_line+=1
                    if data_line==len(data_raw):
                        break
                    if i>100:
                        acc=accuracy.eval(feed_dict={x: pre_process(data_info[i]).reshape(1,-1), y_: [1,0], y_r:[0,1]})
                        local_local+=acc
                        local_remote+=1-acc
                        accur+=acc
                else:
                    sess.run(train_step,feed_dict={x:pre_process(data_info[i]).reshape(1,-1),y_:[0,1], y_r:[1,0]})
#                   print(sess.run(logits_2,{x: pre_process(data_info[i]).reshape(1,-1), y_: [0,1]}))
                    if i>100:
                        acc=accuracy.eval(feed_dict={x: pre_process(data_info[i]).reshape(1,-1), y_: [0,1], y_r:[1,0]})
                        remote_remote+=acc
                        remote_local+=1-acc
                        accur+=acc

        print("correctness: (%.3d,%.3d): \t%.2f   %.2f   %.2f   %.2f   %.2f" % (rate[0],rate[1],accur/total,local_local/total,local_remote/total,remote_local/total,remote_remote/total))
asked Nov 28 '17 by Jackie Wang


1 Answer

Although GPhilo's answer explains why the code runs slower and slower, in reality that solution will recreate the computation graph again and again, which is not good.

The following two lines of code (which GPhilo also mentioned) keep adding operations to your graph on every iteration:

loss=tf.reduce_sum(tf.nn.softmax_cross_entropy_with_logits( \
                    labels=y_, logits=logits_2)*[rate[0],rate[1]])
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(loss)
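
If you want to verify this, a quick diagnostic sketch (using the TF 1.x graph API the rest of the code already relies on) is to count the operations in the default graph inside the loop; a count that keeps growing means nodes are being added on every iteration:

# Diagnostic sketch: print the op count once per iteration.
# A steadily increasing number confirms the graph is growing.
num_ops = len(tf.get_default_graph().get_operations())
print("ops in graph:", num_ops)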

As far as I can see, you have two values, rate[0] and rate[1], that need to be supplied to your graph. Why not supply these two values through placeholders and define your graph only once? Once you start running a Session, you shouldn't add more operations to the graph. Also, you shouldn't create a new Session for every iteration.

Check this modified code (only the important parts are shown):

# Clear any previously created graph (if any) from memory.
tf.reset_default_graph()
x=tf.placeholder(tf.float32, shape=[1,n_input])
y_=tf.placeholder(tf.float32,shape=[n_output])
y_r=tf.placeholder(tf.float32,shape=[n_output])

# Add these two placeholders (assuming each is a single float value)
rate0 = tf.placeholder(tf.float32, shape = []) 
rate1 = tf.placeholder(tf.float32, shape = [])

W1=tf.Variable(tf.random_uniform([n_input, n_hidden]))
....
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

# Move this code outside the loop (note rate[0]/rate[1] replaced by the placeholders)
loss=tf.reduce_sum(tf.nn.softmax_cross_entropy_with_logits(labels=y_, \
            logits=logits_2) * [rate0, rate1])
train_step = tf.train.GradientDescentOptimizer(0.01).minimize(loss)

# Instantiate session only once.
with tf.Session() as sess:
     sess.run(tf.global_variables_initializer())

     # Move the subsequent looping code inside.
     rate=[0,0]
     for i in range(-100,200,10):
        rate[0]=i

After this modification, whenever your Session runs train_step, you need to supply values for these two extra placeholders through the feed_dict.

Example:

sess.run(train_step,feed_dict={x:pre_process(data_info[i]).reshape(1,-1),
         y_:[1,0],y_r:[0,1], rate0: rate[0], rate1: rate[1]})
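
(Note that accuracy.eval does not need the two rate placeholders: TensorFlow only requires feeds for placeholders the fetched subgraph actually depends on, and accuracy depends only on x and y_.)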

This way, you don't create a new graph for every iteration, and in fact this code will be faster than GPhilo's solution.
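
As an extra safeguard (my suggestion; the fix above works without it), you can also freeze the graph once it is fully built. TensorFlow will then raise a RuntimeError if anything accidentally adds new operations during training:

# Instantiate session only once, then lock the graph.
with tf.Session() as sess:
     sess.run(tf.global_variables_initializer())
     # Make the graph read-only: accidental op creation now fails
     # loudly instead of silently slowing the loop down.
     sess.graph.finalize()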

answered Sep 19 '22 by Prasad