I just installed TensorFlow with GPU support and am using Keras for my CNN. During training my GPU is only used at about 5%, but 5 out of 6 GB of VRAM is in use. Sometimes it glitches, prints 0.000000e+00 in the console, and the GPU goes to 100%, but after a few seconds the training slows back down to 5%. My GPU is a Zotac GTX 1060 mini and I am using a Ryzen 5 1600X.
Epoch 1/25
 121/3860 [..............................] - ETA: 31:42 - loss: 3.0575 - acc: 0.0877 - val_loss: 0.0000e+00 - val_acc: 0.0000e+00
Epoch 2/25
 121/3860 [..............................] - ETA: 29:48 - loss: 3.0005 - acc: 0.0994 - val_loss: 0.0000e+00 - val_acc: 0.0000e+00
Epoch 3/25
36/3860 [..............................] - ETA: 24:47 - loss: 2.9863 - acc: 0.1024
TensorFlow code and tf.keras models will transparently run on a single GPU with no code changes required. Note: use tf.config.list_physical_devices('GPU') to confirm that TensorFlow is actually using the GPU.
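For example, a quick sanity check (TensorFlow 2.x API):

import tensorflow as tf

# An empty list here means TensorFlow cannot see the GPU and will fall back to the CPU.
print(tf.config.list_physical_devices('GPU'))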
Keras is installed automatically when you install TensorFlow, so there is no need to install it separately. Keras is a high-level deep learning API for building and training all kinds of neural networks; it uses TensorFlow as a backend to perform the heavy computations.
If you experience this kind of staggering of GPU kernels in your program's trace view, the recommended action is to set the TensorFlow environment variable TF_GPU_THREAD_MODE to gpu_private. This tells the host to keep the threads that launch GPU kernels private to each GPU.
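A minimal sketch of setting this from Python, assuming it runs before TensorFlow is imported (the thread count is just an illustrative value):

import os

# Must be set before TensorFlow initializes the GPU runtime.
os.environ['TF_GPU_THREAD_MODE'] = 'gpu_private'
os.environ['TF_GPU_THREAD_COUNT'] = '2'  # optional: dedicated threads per GPU

import tensorflow as tf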
You can use TensorBoard's GPU kernel stats to visualize which GPU kernels are Tensor Core-eligible and which kernels are actually using Tensor Cores. Enabling fp16 (mixed precision) is one way to make your program's General Matrix Multiply (GEMM) kernels (matmul ops) use Tensor Cores.
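A minimal sketch for turning on mixed precision in tf.keras (TensorFlow 2.4+). Note that the GTX 1060 is a Pascal card without Tensor Cores, so on that GPU fp16 mostly saves memory rather than time:

import tensorflow as tf

# Compute in float16 where safe, keep variables in float32.
tf.keras.mixed_precision.set_global_policy('mixed_float16')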
Usually, we want the bottleneck to be on the GPU (hence 100% utilization). If that's not happening, some other part of your code is taking a long time during each batch. It's hard to say what it is (especially because you didn't post any code), but there are a few things you can try:
1. Input data
Make sure the input data for your network is always available. Reading images from disk takes a long time, so use multiple workers and the multiprocessing interface:
model.fit(..., use_multiprocessing=True, workers=8)
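A rough sketch of a Sequence-based loader that can feed such a fit call; the model, the path and label lists, and the 64x64 target size are placeholders, not taken from the question:

import numpy as np
from PIL import Image
from tensorflow import keras

class ImageSequence(keras.utils.Sequence):
    # Loads one batch of images from disk per call, so multiple workers can prefetch.
    def __init__(self, paths, labels, batch_size=32):
        self.paths, self.labels, self.batch_size = paths, labels, batch_size

    def __len__(self):
        return int(np.ceil(len(self.paths) / self.batch_size))

    def __getitem__(self, idx):
        batch = slice(idx * self.batch_size, (idx + 1) * self.batch_size)
        imgs = [np.asarray(Image.open(p).resize((64, 64)), dtype=np.float32) / 255.0
                for p in self.paths[batch]]
        return np.stack(imgs), np.asarray(self.labels[batch])

model.fit(ImageSequence(train_paths, train_labels),
          use_multiprocessing=True, workers=8)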
2. Force the model into the GPU
This is unlikely to be the problem, because /gpu:0 is the default device, but it's worth making sure you are executing the model on the intended device:
with tf.device('/gpu:0'):
    x = Input(...)
    y = Conv2D(...)(x)
    model = Model(x, y)
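For instance, a runnable version of the same idea (the input shape and layer parameters below are arbitrary placeholders, not taken from the question):

import tensorflow as tf
from tensorflow.keras.layers import Input, Conv2D
from tensorflow.keras.models import Model

with tf.device('/gpu:0'):
    x = Input(shape=(64, 64, 3))                   # placeholder input shape
    y = Conv2D(32, (3, 3), activation='relu')(x)   # placeholder layer
    model = Model(x, y)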
3. Check the model's size
If your batch size is large and soft placement is allowed, parts of your network that didn't fit in the GPU's memory may be placed on the CPU. This slows the process down considerably.
If soft placement is on, try disabling it and check whether a memory error is thrown:
# make sure soft-placement is off (TF 1.x / standalone Keras API)
import tensorflow as tf
from keras import backend as K

tf_config = tf.ConfigProto(allow_soft_placement=False)
tf_config.gpu_options.allow_growth = True
s = tf.Session(config=tf_config)
K.set_session(s)

with tf.device(...):
    ...
    model.fit(...)
If that's the case, try reducing the batch size until the model fits and gives you good GPU usage, then turn soft placement on again.
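If you are on TensorFlow 2.x, where ConfigProto and Session no longer exist, roughly the equivalent check is:

import tensorflow as tf

# Fail loudly instead of silently spilling ops onto the CPU.
tf.config.set_soft_device_placement(False)

# Optionally let the GPU allocate memory as needed instead of grabbing it all up front.
for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)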