
Understanding tensorflow profiling results

This example shows how to profile TensorFlow programs. I used this tool to profile my program, a simple LSTM, and the results are shown below:

/gpu:0/stream:all Compute(pid 5)

MatMul_AllCompute

/job:localhost/replica:0/task:0/gpu:0 Compute(pid 3)

MatMul_GpuCompute

My questions:

a) What is the meaning of each row?

b) In particular, what is the difference between /gpu:0/stream:all Compute(pid 5) and /job:localhost/replica:0/task:0/gpu:0 Compute(pid 3)?

c) Why are their execution times different, namely 0.072 ms and 0.094 ms?

pgplus1628 asked Apr 12 '17 14:04




1 Answer

Here's an update from one of the engineers:

The '/gpu:0/stream:*' timelines are hardware traces of CUDA kernel execution times.

The '/gpu:0' lines show the TF software device enqueueing the ops on the CUDA stream (this usually takes almost zero time).

Pete Warden answered Oct 26 '22 15:10
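
To compare the two kinds of rows numerically rather than by eye, one option is to read the trace file directly. The sketch below is not from the answer above. It assumes the profile was saved in the Chrome trace-event JSON format, where "M" events named process_name map a pid to a row label and "X" events carry a duration in microseconds. The sample events are hypothetical: the durations are borrowed from the question and assigned to the rows in the order the question lists them, which is an assumption. The helper name op_durations_by_row is made up for illustration.

```python
import json
from collections import defaultdict

# Hypothetical trace in Chrome trace-event format. Row labels come from the
# question; the duration-to-row assignment (72 us, 94 us) is an assumption.
SAMPLE_TRACE = json.dumps({
    "traceEvents": [
        {"ph": "M", "name": "process_name", "pid": 3,
         "args": {"name": "/job:localhost/replica:0/task:0/gpu:0 Compute"}},
        {"ph": "M", "name": "process_name", "pid": 5,
         "args": {"name": "/gpu:0/stream:all Compute"}},
        {"ph": "X", "name": "MatMul", "pid": 3, "tid": 0, "ts": 100, "dur": 94},
        {"ph": "X", "name": "MatMul", "pid": 5, "tid": 0, "ts": 110, "dur": 72},
    ]
})


def op_durations_by_row(trace_json):
    """Return {(row label, op name): total duration in ms} for a trace."""
    events = json.loads(trace_json)["traceEvents"]

    # Metadata ("M") events named process_name label each pid, i.e. each row.
    row_names = {
        e["pid"]: e["args"]["name"]
        for e in events
        if e.get("ph") == "M" and e.get("name") == "process_name"
    }

    # Complete ("X") events hold one op execution; "dur" is in microseconds.
    totals = defaultdict(float)
    for e in events:
        if e.get("ph") == "X":
            row = row_names.get(e["pid"], "pid %d" % e["pid"])
            totals[(row, e["name"])] += e["dur"] / 1000.0
    return dict(totals)


if __name__ == "__main__":
    for (row, op), ms in sorted(op_durations_by_row(SAMPLE_TRACE).items()):
        print(f"{row:50s} {op:10s} {ms:.3f} ms")
```

With a real trace, pass the file contents instead of SAMPLE_TRACE. Grouping by (row, op) keeps the stream rows and the device rows separate, so the same op can be compared across both.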