How to calculate the FLOPs from tfprof in TensorFlow?

Tags: tensorflow

How can I get the number of FLOPs from tfprof? I have the following code:

import tensorflow as tf

def calculate_flops():
    # Print to stdout an analysis of the number of floating point
    # operations in the model, broken down by individual operations.
    param_stats = tf.contrib.tfprof.model_analyzer.print_model_analysis(
        tf.get_default_graph(),
        tfprof_options=tf.contrib.tfprof.model_analyzer.
        TRAINABLE_VARS_PARAMS_STAT_OPTIONS)
    print(param_stats)

But the result says flops = 0. How can I calculate the number of FLOPs? Can I have an example?

Asked Nov 20 '17 by Bilal

1 Answer

First of all, as of now, tfprof.model_analyzer.print_model_analysis is deprecated; according to the official documentation, tf.profiler.profile should be used instead.

Given that we know the number of FLOP, we can get the FLOPS (FLOP per second) of a forward pass by measuring the run time of a forward pass and computing FLOP / run_time (a short sketch of this follows the example below).

Let's take an easy example.

import tensorflow as tf

g = tf.Graph()
sess = tf.Session(graph=g)
with g.as_default():
    A = tf.Variable(initial_value=tf.random_normal([25, 16]))
    B = tf.Variable(initial_value=tf.random_normal([16, 9]))
    C = tf.matmul(A, B, name='output')
    sess.run(tf.global_variables_initializer())
    flops = tf.profiler.profile(
        g, options=tf.profiler.ProfileOptionBuilder.float_operation())
    print('FLOP = ', flops.total_float_ops)

outputs 8288. But why do we get 8288 instead of the expected result 7200 = 2*25*16*9 [a]? The answer lies in the way the tensors A and B are initialised. Initialising with a Gaussian distribution costs some FLOP. Changing the definitions of A and B to

    A = tf.Variable(initial_value=tf.zeros([25, 16]))
    B = tf.Variable(initial_value=tf.zeros([16, 9]))

gives the expected output 7200.
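
As an aside, here is what the FLOPS computation mentioned earlier could look like. This is only a sketch: it reuses sess, C and flops from the example above and times a single un-warmed run with Python's time module, so the resulting number is indicative at best.

import time

start = time.time()
sess.run(C)                      # one forward pass
run_time = time.time() - start
print('FLOPS = ', flops.total_float_ops / run_time)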

Usually, a network's variables are initialised with Gaussian distributions, among other schemes. Most of the time, we are not interested in the initialisation FLOP, as it is performed once during initialisation and happens neither during training nor during inference. So, how could one get the exact number of FLOP disregarding the initialisation FLOP?

Freeze the graph into a .pb file.

The following snippet illustrates this:

import tensorflow as tf
from tensorflow.python.framework import graph_util

def load_pb(pb):
    """Load a frozen graph from a .pb file into a fresh tf.Graph."""
    with tf.gfile.GFile(pb, "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())
    with tf.Graph().as_default() as graph:
        tf.import_graph_def(graph_def, name='')
        return graph

# ***** (1) Create Graph *****
g = tf.Graph()
sess = tf.Session(graph=g)
with g.as_default():
    A = tf.Variable(initial_value=tf.random_normal([25, 16]))
    B = tf.Variable(initial_value=tf.random_normal([16, 9]))
    C = tf.matmul(A, B, name='output')
    sess.run(tf.global_variables_initializer())
    flops = tf.profiler.profile(
        g, options=tf.profiler.ProfileOptionBuilder.float_operation())
    print('FLOP before freezing', flops.total_float_ops)
# *****************************        

# ***** (2) freeze graph *****
output_graph_def = graph_util.convert_variables_to_constants(sess, g.as_graph_def(), ['output'])

with tf.gfile.GFile('graph.pb', "wb") as f:
    f.write(output_graph_def.SerializeToString())
# *****************************


# ***** (3) Load frozen graph *****
g2 = load_pb('./graph.pb')
with g2.as_default():
    flops = tf.profiler.profile(
        g2, options=tf.profiler.ProfileOptionBuilder.float_operation())
    print('FLOP after freezing', flops.total_float_ops)

outputs

FLOP before freezing 8288
FLOP after freezing 7200
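
For what it's worth, on TensorFlow 2.x the same profiler lives under tf.compat.v1, and a graph for it can be obtained from a concrete function. The following is only a sketch under that assumption (matmul_fn is a made-up name, and the shapes mirror the example above):

import tensorflow as tf

@tf.function
def matmul_fn(a, b):
    return tf.matmul(a, b, name='output')

# Trace a concrete function to obtain a graph the profiler can inspect.
concrete = matmul_fn.get_concrete_function(
    tf.TensorSpec([25, 16], tf.float32), tf.TensorSpec([16, 9], tf.float32))
flops = tf.compat.v1.profiler.profile(
    concrete.graph,
    options=tf.compat.v1.profiler.ProfileOptionBuilder.float_operation())
print('FLOP = ', flops.total_float_ops)  # no variable initialisation here, so 7200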

[a] Usually, the FLOP count of a matrix multiplication is mq(2p - 1) for the product AB where A is [m, p] and B is [p, q] (here 25*9*(2*16 - 1) = 6975), but TensorFlow returns 2mpq (here 2*25*16*9 = 7200) for some reason. An issue has been opened to understand why.

Answered Nov 15 '22 by BiBi