I used this script to train a model and run predictions on a machine with GPUs installed and enabled, but it seems that only the CPU is being used during the prediction stage. The device placement log I'm seeing during the .predict() part is the following:
2020-09-01 06:08:19.085400: I tensorflow/core/common_runtime/eager/execute.cc:573] Executing op RangeDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-09-01 06:08:19.085617: I tensorflow/core/common_runtime/eager/execute.cc:573] Executing op RepeatDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-09-01 06:08:19.089558: I tensorflow/core/common_runtime/eager/execute.cc:573] Executing op MapDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-09-01 06:08:19.090003: I tensorflow/core/common_runtime/eager/execute.cc:573] Executing op PrefetchDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-09-01 06:08:19.097064: I tensorflow/core/common_runtime/eager/execute.cc:573] Executing op FlatMapDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-09-01 06:08:19.097647: I tensorflow/core/common_runtime/eager/execute.cc:573] Executing op TensorDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-09-01 06:08:19.097802: I tensorflow/core/common_runtime/eager/execute.cc:573] Executing op RepeatDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-09-01 06:08:19.097957: I tensorflow/core/common_runtime/eager/execute.cc:573] Executing op ZipDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-09-01 06:08:19.101284: I tensorflow/core/common_runtime/eager/execute.cc:573] Executing op ParallelMapDataset in device /job:localhost/replica:0/task:0/device:CPU:0
2020-09-01 06:08:19.101865: I tensorflow/core/common_runtime/eager/execute.cc:573] Executing op ModelDataset in device /job:localhost/replica:0/task:0/device:CPU:0
even though when I run:
print(tf.config.experimental.list_physical_devices('GPU'))
I receive:
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:1', device_type='GPU'), PhysicalDevice(name='/physical_device:GPU:2', device_type='GPU')]
The code I used can be found here. The full output logs can be seen here.
More context:
Python: 3.7.7
TensorFlow: 2.1.0
GPU: Nvidia Tesla V100-PCIE-16GB
CPU: Intel Xeon Gold 5218 CPU @ 2.30GHz
RAM: 394851272 KB
OS: Linux
Keras models will transparently run on a single GPU with no code changes required.
If you use TensorFlow, it handles compute resources (CPU, GPU) for you. When you load a model and call predict, TensorFlow uses those compute resources to make the predictions.
Since you already have a GPU, I assume that tf.test.is_gpu_available() returns True. You can use this piece of code to force TensorFlow to use a specific device:
with tf.device('/gpu:0'):
    # GPU stuff: ops created or executed in this block are placed on GPU:0
    ...
This also works if you want to force it to use the CPU instead for some part of the code:
with tf.device('/cpu:0'):
    # CPU stuff: ops created or executed in this block are placed on the CPU
    ...
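For example, applying the same idea to the prediction step itself would look roughly like the sketch below. This is only an illustration: model and test_images are placeholder names for your own trained model and input data, not names taken from your script.

import tensorflow as tf

# Pin the prediction to the first GPU; ops that predict executes here go to GPU:0
with tf.device('/gpu:0'):
    predictions = model.predict(test_images)  # model / test_images assumed to exist already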
As an addition that might be helpful while using tf.device(), you can use this function to list all the devices you have:
from tensorflow.python.client import device_lib

def get_available_devices():
    # Returns device names such as '/device:CPU:0' and '/device:GPU:0'
    local_device_protos = device_lib.list_local_devices()
    return [x.name for x in local_device_protos]

get_available_devices()
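If you would rather stay on the public API instead of the private device_lib module, the same list can be obtained with the sketch below (the experimental namespace matches the TF 2.1.0 you are running):

import tensorflow as tf

# Logical devices include the CPU and every visible GPU, e.g. '/device:GPU:0'
print([d.name for d in tf.config.experimental.list_logical_devices()])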
Though for the use case you mentioned, I cannot guarantee faster inference with a GPU.
Sounds like you need to use a distribution strategy per the docs. Your code would then become something like the following:
import tensorflow as tf
from tensorflow import keras

tf.debugging.set_log_device_placement(True)

strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    # Build and compile the model inside the strategy scope so its variables
    # are mirrored across the available GPUs.
    model = keras.Sequential(
        [
            keras.layers.Flatten(input_shape=(28, 28)),
            keras.layers.Dense(128, activation='relu'),
            keras.layers.Dense(10)
        ]
    )
    model.compile(
        optimizer='adam',
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=['accuracy']
    )

model.fit(train_images, train_labels, epochs=10)
test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=2)

probability_model = tf.keras.Sequential(
    [model, tf.keras.layers.Softmax()]
)
probability_model.predict(test_images)
Per the documentation, "the best practice for using multiple GPUs is to use tf.distribute.Strategy."
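If you only need inference pinned to a single GPU rather than mirrored across all three, a lighter option is tf.distribute.OneDeviceStrategy. This is a minimal sketch under the assumption that model is your already-trained Keras model and test_images is your input array:

import tensorflow as tf

strategy = tf.distribute.OneDeviceStrategy(device='/gpu:0')
with strategy.scope():
    # Wrap the trained model so the added softmax layer (and its ops) live on GPU:0
    probability_model = tf.keras.Sequential([model, tf.keras.layers.Softmax()])

predictions = probability_model.predict(test_images)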
Your predict function is using the GPU. I re-ran the timing with your code on an NVIDIA GTX 1080 and inference takes about 100 ms.
Either reboot the system or check whether the GPU is actually being utilised (for example with nvidia-smi).
Here is the line from your log showing that inference is run on the GPU:
2020-09-01 06:19:15.885778: I tensorflow/core/common_runtime/eager/execute.cc:573] Executing op __inference_distributed_function_58022 in device /job:localhost/replica:0/task:0/device:GPU:0
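If you want to confirm this yourself on a new run, a minimal sketch (again with model and test_images standing in for your own objects) is to turn on device placement logging right before calling predict:

import tensorflow as tf

# Print the device each op executes on; GPU-placed ops show .../device:GPU:0
tf.debugging.set_log_device_placement(True)

predictions = model.predict(test_images)  # model / test_images assumed already defined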
Are you using the correct tensorflow package? It could help to uninstall tensorflow and install tensorflow-gpu instead.
For documentation see: https://www.tensorflow.org/install/gpu