Is Tensorflow Dataset API slower than Queues?

Tags: performance, tensorflow, tensorflow-datasets

I replaced the CIFAR-10 preprocessing pipeline in my project with a Dataset API approach, and it resulted in a performance decrease of about 10-20%.

Preprocessing is rather standard:
- read the image from disk
- apply a random crop and flip
- shuffle and batch
- feed to the model

Overall, I see that batch processing is now 15% faster, but every once in a while (more precisely, whenever I reinitialize the dataset or expect reshuffling) a batch is blocked for a long time (30 sec), which adds up to slower epoch-by-epoch processing.

This behaviour seems to have something to do with internal hashing. If I reduce N in ds.shuffle(buffer_size=N), the delays are shorter but proportionally more frequent. Removing shuffle altogether results in delays as if buffer_size were set to the dataset size.

Can somebody explain the internal logic of the Dataset API when it comes to reading/caching? Is there any reason at all to expect the Dataset API to work faster than manually created Queues?

I am using TF 1.3.

asked Nov 21 '17 00:11

y.selivonchyk

1 Answer

If you implement the same pipeline using the tf.data.Dataset API and using queues, the performance of the Dataset version should be better than the queue-based version.

However, there are a few best practices to observe in order to get the best performance. We have collected these in a performance guide for tf.data. Here are the main issues:

- Prefetching is important: the queue-based pipelines prefetch by default and the Dataset pipelines do not. Adding dataset.prefetch(1) to the end of your pipeline will give you most of the benefit of prefetching, but you might need to tune this further.
- The shuffle operator has a delay at the beginning, while it fills its buffer. The queue-based pipelines shuffle a concatenation of all epochs, which means that the buffer is only filled once. In a Dataset pipeline, this would be equivalent to dataset.repeat(NUM_EPOCHS).shuffle(N). By contrast, you can also write dataset.shuffle(N).repeat(NUM_EPOCHS), but this needs to restart the shuffling in each epoch. The latter approach is slightly preferable (and truer to the definition of SGD, for example), but the difference might not be noticeable if your dataset is large.
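
To make the prefetching point more concrete, here is a minimal plain-Python sketch of the general idea: a background thread produces items ahead of the consumer, so input preparation can overlap with the training step. This is only an illustration of the concept, not how tf.data implements prefetch; the `prefetch` helper and the `_DONE` sentinel are hypothetical names introduced for this example.

```python
import queue
import threading

_DONE = object()  # hypothetical sentinel marking the end of the input


def prefetch(iterable, buffer_size=1):
    """Yield items from `iterable` while a background thread produces ahead.

    Toy model only: errors raised by the producer are not propagated.
    """
    q = queue.Queue(maxsize=buffer_size)

    def producer():
        for item in iterable:
            q.put(item)  # blocks once `buffer_size` items are waiting
        q.put(_DONE)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = q.get()  # blocks only if the producer has fallen behind
        if item is _DONE:
            return
        yield item


# Stand-in for "batches": squares of 0..4, produced one step ahead.
batches = list(prefetch((i * i for i in range(5)), buffer_size=1))
```

With `buffer_size=1` the producer stays at most one item ahead, which mirrors the spirit of `dataset.prefetch(1)`; a larger buffer trades memory for more slack when the producer's speed varies.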

We are adding a fused version of shuffle-and-repeat that doesn't incur the delay, and a nightly build of TensorFlow will include the custom tf.contrib.data.shuffle_and_repeat() transformation that is equivalent to dataset.shuffle(N).repeat(NUM_EPOCHS) but doesn't suffer the delay at the start of each epoch.
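
The buffer-fill delay and the two orderings of shuffle and repeat discussed above can be modelled with a toy shuffle buffer in plain Python. This is an illustrative sketch under simplifying assumptions, not TensorFlow's actual implementation; `shuffle`, `repeat`, and the `stats` counter are hypothetical helpers, and the dataset is just the integers 0..9.

```python
import random


def repeat(make_epoch, num_epochs):
    """Yield the elements of make_epoch() num_epochs times in a row."""
    for _ in range(num_epochs):
        yield from make_epoch()


def shuffle(source, buffer_size, rng, stats):
    """Toy streaming shuffle: fill a buffer, then emit random elements from it."""
    buffer = []
    stats["fills"] += 1  # each call starts with one buffer fill (the delay)
    for item in source:
        if len(buffer) < buffer_size:
            buffer.append(item)  # still filling: nothing is emitted yet
            continue
        i = rng.randrange(buffer_size)
        out, buffer[i] = buffer[i], item  # emit one element, refill its slot
        yield out
    rng.shuffle(buffer)  # input exhausted: drain what is left
    yield from buffer


DATA = list(range(10))
EPOCHS, N = 3, 4
rng = random.Random(0)

# Analogue of dataset.repeat(EPOCHS).shuffle(N): one fill, epochs may mix.
stats_a = {"fills": 0}
order_a = list(shuffle(repeat(lambda: iter(DATA), EPOCHS), N, rng, stats_a))

# Analogue of dataset.shuffle(N).repeat(EPOCHS): one fill per epoch.
stats_b = {"fills": 0}
order_b = list(repeat(lambda: shuffle(iter(DATA), N, rng, stats_b), EPOCHS))

print(stats_a["fills"], stats_b["fills"])  # 1 3
```

In this model the second ordering pays the fill cost once per epoch but guarantees that every block of `len(DATA)` outputs contains each element exactly once, while the first ordering fills the buffer only once and lets elements from neighbouring epochs interleave near the boundaries.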

Having said this, if you have a pipeline that is significantly slower when using tf.data than the queues, please file a GitHub issue with the details, and we'll take a look!

answered Oct 19 '22 04:10

mrry

