I am using the new Dataset API to train a simple feed-forward DL model, and I am interested in maximizing training speed. Since my network isn't huge, I see low GPU utilization, as expected. That is fine. But what I don't understand is why CPU usage is also far from 100%. I am using a machine with multiple CPU cores and a GPU. Currently I get up to 140 steps/sec with batch_size = 128. If I cache the dataset I can get up to 210 steps/sec (after the initial scan). So I expect that with sufficient prefetching, I should be able to reach the same speed without caching. However, with various prefetch and prefetch_to_device parameters, I cannot get more than 140 steps/sec. I also set num_parallel_calls to the number of CPU cores, which improves throughput by about 20%.
Ideally I'd like the prefetching thread to run on a CPU core disjoint from the rest of the input pipeline, so that whatever benefit it provides is strictly additive. But from CPU usage profiling I suspect that prefetching and input processing occur on every core.
Is there a way to have more control over CPU allocation? I have tried prefetch(1), prefetch(500), and several other values (placed right after batch() or at the end of the dataset construction), as well as in combination with prefetch_to_device(gpu_device, buffer_size=None, 1, 500, etc.). So far prefetch(500) without prefetch_to_device works the best.
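Roughly, the variants I have been trying look like the sketch below (filenames, parse_fn, and NUM_CPU_CORES are placeholders, not my exact code):

```python
import tensorflow as tf

# Placeholders for my actual input files and parsing logic.
filenames = ["train-00000.tfrecord"]  # hypothetical file list
NUM_CPU_CORES = 8                     # set to the number of CPU cores

def parse_fn(example_proto):
    # Hypothetical parsing of a serialized example into (features, label).
    parsed = tf.parse_single_example(
        example_proto,
        {"x": tf.FixedLenFeature([32], tf.float32),
         "y": tf.FixedLenFeature([], tf.int64)})
    return parsed["x"], parsed["y"]

dataset = tf.data.TFRecordDataset(filenames)
dataset = dataset.map(parse_fn, num_parallel_calls=NUM_CPU_CORES)
dataset = dataset.batch(128)

# Variant A: host-side prefetching only (prefetch(500) works best for me so far).
dataset = dataset.prefetch(500)

# Variant B: additionally stage batches onto the GPU (applied as the last step).
# dataset = dataset.apply(tf.contrib.data.prefetch_to_device("/gpu:0"))
```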
Why doesn't prefetch try to use all the CPU power on my machine? What are other possible bottlenecks in training speed?
Many thanks!
It has been observed that TensorFlow's performance depends significantly on the CPU when training on a small dataset, and that the GPU becomes more important when training on a large dataset.
The Dataset.prefetch(buffer_size) transformation adds pipeline parallelism and (bounded) buffering to your input pipeline. Therefore, increasing the buffer_size may increase the fraction of time when the input to the Dataset.prefetch() is running (because the buffer is more likely to have free space), but it does not increase the speed at which the input runs (and hence does not increase CPU usage).
Typically, to increase the speed of the pipeline and increase CPU usage, you would add data parallelism by adding num_parallel_calls=N to any Dataset.map() transformations, and you might also consider using tf.contrib.data.parallel_interleave() to process many input sources concurrently and avoid blocking on I/O.
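For example, a pipeline combining these might look roughly like the following sketch; the file pattern, parse_fn, and the cycle_length/num_parallel_calls values are placeholders you would tune for your own data:

```python
import tensorflow as tf

# Hypothetical file pattern; substitute your own training files.
files = tf.data.Dataset.list_files("/path/to/train-*.tfrecord")

def parse_fn(example_proto):
    # Hypothetical feature spec; replace with your own parsing logic.
    return tf.parse_single_example(
        example_proto, {"x": tf.FixedLenFeature([32], tf.float32)})

# Read several input files concurrently instead of one at a time,
# so the pipeline does not block on the I/O of a single file.
dataset = files.apply(tf.contrib.data.parallel_interleave(
    lambda filename: tf.data.TFRecordDataset(filename),
    cycle_length=8))

# Parse records on multiple CPU cores in parallel.
dataset = dataset.map(parse_fn, num_parallel_calls=8)

dataset = dataset.batch(128)

# A small prefetch buffer is usually enough once the stages above run in parallel.
dataset = dataset.prefetch(1)
```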
The tf.data Performance Guide has more details about how to improve the performance of input pipelines, including these suggestions.