
Tensorflow tf.data AUTOTUNE

I was reading the Data Loading section of the TF performance guide. For prefetch, it says:

The tf.data API provides a software pipelining mechanism through the tf.data.Dataset.prefetch transformation, which can be used to decouple the time when data is produced from the time when data is consumed. In particular, the transformation uses a background thread and an internal buffer to prefetch elements from the input dataset ahead of the time they are requested. The number of elements to prefetch should be equal to (or possibly greater than) the number of batches consumed by a single training step. You could either manually tune this value, or set it to tf.data.experimental.AUTOTUNE which will prompt the tf.data runtime to tune the value dynamically at runtime.
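To make the quoted mechanism concrete, here is a minimal plain-Python sketch of a background thread filling a bounded buffer ahead of the consumer. It is only an illustration of the idea, not tf.data's implementation; the `prefetch` helper and its `buffer_size` parameter are hypothetical names chosen for this example.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the input


def prefetch(iterable, buffer_size):
    """Yield items from `iterable`, produced ahead of time by a background thread."""
    buffer = queue.Queue(maxsize=buffer_size)

    def producer():
        for item in iterable:
            buffer.put(item)  # blocks while the buffer is full
        buffer.put(_DONE)

    threading.Thread(target=producer, daemon=True).start()

    while True:
        item = buffer.get()  # blocks only if the producer has fallen behind
        if item is _DONE:
            return
        yield item


if __name__ == "__main__":
    for batch in prefetch(range(5), buffer_size=2):
        print(batch)
```

In this sketch `buffer_size` plays the role of the value the guide says you can tune by hand or leave to AUTOTUNE: a larger buffer lets the producer run further ahead, at the cost of more memory.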

What is AUTOTUNE doing internally? Which algorithms or heuristics are being applied?

Additionally, in practice, what kind of manual tuning is done?

dgumo asked Jun 15 '19 18:06



1 Answer

tf.data builds a performance model of the input pipeline and runs an optimization algorithm to find a good allocation of its CPU budget across all parameters specified as AUTOTUNE. While the input pipeline is running, tf.data tracks the time spent in each operation, so that these times can be fed into the optimization algorithm.
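As a rough illustration of the idea (measure per-operation time, model the pipeline, then search for a good split of a CPU budget), here is a toy sketch in plain Python. It is not tf.data's actual model or optimizer. The assumptions are mine: a stage's cost scales as time divided by threads, the slowest stage limits the pipeline, and a simple greedy search is used. The stage names and timings are made up.

```python
def allocate_parallelism(stage_times, cpu_budget):
    """Greedily split `cpu_budget` threads across pipeline stages.

    stage_times: dict of stage name -> measured time per element with one thread.
    Returns a dict of stage name -> thread count.
    """
    if cpu_budget < len(stage_times):
        raise ValueError("need at least one thread per stage")
    threads = {name: 1 for name in stage_times}
    for _ in range(cpu_budget - len(stage_times)):
        # Toy performance model (an assumption): a stage's time per element is
        # time / threads, and the slowest stage limits the whole pipeline.
        bottleneck = max(stage_times, key=lambda n: stage_times[n] / threads[n])
        threads[bottleneck] += 1  # give the next thread to the current bottleneck
    return threads


# Hypothetical measurements, in milliseconds per element.
measured = {"read": 2, "decode": 8, "augment": 4}
print(allocate_parallelism(measured, cpu_budget=7))
```

The slowest stage ("decode" here) receives most of the extra threads, which is the kind of outcome you would otherwise reach by tuning each parallelism value by hand.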

The OptimizationOptions object gives some control over how autotune will behave.

AAudibert answered Nov 23 '22 10:11