 

Profiling TensorFlow using tfprof

I am trying to profile the computation and memory usage of TensorFlow, and tfprof seems to be the right tool for my purpose. However, I was not able to get the FLOPS of all operators.

Here is what I did, following the tfprof tutorial, using the CIFAR-10 tutorial in the TensorFlow models repository (tensorflow/models/image/cifar10/cifar10_train.py):

run_metadata = tf.RunMetadata()

_, loss_value = sess.run([train_op, loss],
        options=tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE),
        run_metadata=run_metadata)

op_log = tfprof_log_pb2.OpLog()

# TODO: add op information

tf.contrib.tfprof.tfprof_logger.write_op_log(
        tf.get_default_graph(),
        log_dir="/tmp/log_dir",
        op_log=op_log,
        run_meta=run_metadata)

tf.contrib.tfprof.model_analyzer.print_model_analysis(
        tf.get_default_graph(),
        run_metadata=run_metadata,
        op_log=op_log,
        tfprof_options=tf.contrib.tfprof.model_analyzer.FLOAT_OPS_OPTIONS)

And the result is

Parsing GraphDef...
Parsing RunMetadata...
Parsing OpLog...
Preparing Views...

=========================Options=============================
-max_depth                  10000
-min_bytes                  0
-min_micros                 0
-min_params                 0
-min_float_ops              1
-device_regexes             .*
-order_by                   float_ops
-account_type_regexes       .*
-start_name_regexes         .*
-trim_name_regexes
-show_name_regexes          .*
-hide_name_regexes
-account_displayed_op_only  true
-select                     float_ops
-viz                        false
-dump_to_file

==================Model Analysis Report======================
_TFProfRoot (0/5.23b flops)
  conv2/Conv2D (3.77b/3.77b flops)
  conv1/Conv2D (707.79m/707.79m flops)
  gradients/local3/MatMul_grad/MatMul (226.49m/226.49m flops)
  gradients/local3/MatMul_grad/MatMul_1 (226.49m/226.49m flops)
  local3/MatMul (226.49m/226.49m flops)
  gradients/local4/MatMul_grad/MatMul (18.87m/18.87m flops)
  gradients/local4/MatMul_grad/MatMul_1 (18.87m/18.87m flops)
  local4/MatMul (18.87m/18.87m flops)
  conv1/BiasAdd (4.72m/4.72m flops)
  conv2/BiasAdd (1.18m/1.18m flops)
  gradients/softmax_linear/MatMul_grad/MatMul (491.52k/491.52k flops)
  gradients/softmax_linear/MatMul_grad/MatMul_1 (491.52k/491.52k flops)
  softmax_linear/MatMul (491.52k/491.52k flops)

======================End of Report==========================

However, the result does not contain all of the ops, such as max pooling, ReLU, and the gradients of the conv layers. The flops statistics of those ops are probably simply not defined (via RegisterStatistics('flops')). Therefore, to provide the runtime information, as in the tfprof tutorial, I tried to create an OpLog (see the code above).

However, I am not sure how to add the op information (how can I get the entry names of the ops?). Is there any way to add ALL the ops the graph contains?
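For filling the OpLog by hand, my reading of tfprof_log.proto (contrib-era TF; field names `log_entries`, `name`, `float_ops` are assumptions worth verifying) is that each entry is keyed by the op's full graph name, which you can enumerate with `tf.get_default_graph().get_operations()`. The flops numbers below are placeholders, not measured values:

```python
# Sketch: fill an OpLog with manually estimated float_ops, keyed by the
# op's full name in the graph. The numbers are placeholders, not measured.
manual_flops = {
    "local3/Relu": 384 * 384,                  # placeholder estimate
    "pool1/MaxPool": 128 * 24 * 24 * 64 * 9,   # placeholder estimate
}

def fill_op_log(op_log, flops_by_name):
    for name, flops in sorted(flops_by_name.items()):
        entry = op_log.log_entries.add()  # repeated OpLogEntry field
        entry.name = name
        entry.float_ops = int(flops)
    return op_log

try:
    from tensorflow.tools.tfprof import tfprof_log_pb2
    op_log = fill_op_log(tfprof_log_pb2.OpLog(), manual_flops)
except ImportError:
    pass  # TF not installed; fill_op_log works with any OpLog-like proto
```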

Or is there any other tool besides tfprof? Perhaps a profiling tool from NVIDIA?

asked Feb 17 '17 by enc

1 Answer

You are right that the other ops don't have flops because they don't have RegisterStatistics('flops') defined. You are welcome to contribute.

I'm not sure if NVIDIA has tools for it.

answered Sep 22 '22 by Peter