I am currently running a simple script to train on the MNIST dataset.
Running the training on my CPU via TensorFlow gives me 49us/sample
and a 3s epoch, using the following code:-
# CPU
import tensorflow as tf
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = tf.keras.utils.normalize(x_train, axis=1)
x_test = tf.keras.utils.normalize(x_test, axis=1)
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(10, activation=tf.nn.softmax))
model.compile(optimizer='adam', loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=3)
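As a quick sanity check on those numbers (plain arithmetic, nothing framework-specific; the 60,000 figure is MNIST's standard training-split size):

```python
# Sanity-check the reported CPU throughput: MNIST's training split has
# 60,000 samples, so 49us/sample implies roughly a 3-second epoch.
samples_per_epoch = 60_000
us_per_sample_cpu = 49
epoch_seconds_cpu = samples_per_epoch * us_per_sample_cpu / 1e6
print(f"CPU: ~{epoch_seconds_cpu:.2f}s per epoch")  # ~2.94s, matching the ~3s epoch
```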
When I run the same training on my AMD Radeon Pro 580 (the opencl_amd_radeon_pro_580_compute_engine device)
via my PlaidML setup, I get 249us/sample
with a 15s epoch, using the following code:-
# GPU
import plaidml.keras
plaidml.keras.install_backend()
import keras
from keras.datasets import mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = keras.utils.normalize(x_train, axis=1)
x_test = keras.utils.normalize(x_test, axis=1)
model = keras.models.Sequential()
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(128, activation='relu'))
model.add(keras.layers.Dense(128, activation='relu'))
model.add(keras.layers.Dense(10, activation='softmax'))
model.compile(optimizer='adam', loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=3)
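The GPU numbers are internally consistent too, and the per-sample ratio works out to roughly 5x (same assumption as before: 60,000 training samples):

```python
# Sanity-check the reported GPU throughput against the CPU run.
samples_per_epoch = 60_000
us_per_sample_cpu = 49
us_per_sample_gpu = 249
epoch_seconds_gpu = samples_per_epoch * us_per_sample_gpu / 1e6
cpu_speedup = us_per_sample_gpu / us_per_sample_cpu
print(f"GPU: ~{epoch_seconds_gpu:.2f}s per epoch")      # ~14.94s, matching the ~15s epoch
print(f"CPU is ~{cpu_speedup:.1f}x faster per sample")  # ~5.1x
```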
I can see my CPU firing up for the CPU test and my GPU maxing out for the GPU test, but I am very confused as to why the CPU is outperforming the GPU by a factor of 5.
Is this the expected result?
Am I doing something wrong in my code?
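One point worth noting (my observation, not part of the original question): this network is tiny, and for very small models per-batch overhead (kernel launches, host-device transfers) can easily dominate on a GPU. A rough parameter count, assuming the standard 28x28 MNIST input:

```python
# Parameter count for Flatten -> Dense(128) -> Dense(128) -> Dense(10)
# on 28x28 inputs: each Dense layer has n_in * n_out weights + n_out biases.
layers = [(28 * 28, 128), (128, 128), (128, 10)]
params = sum(n_in * n_out + n_out for n_in, n_out in layers)
print(params)  # 118282 parameters -- small enough that per-batch overhead can dominate
```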
It seems I've found the right solution at least for macOS/Keras/AMD GPU setup.
TL;DR:
Use the metal device instead of OpenCL. Here are the details:
Run plaidml-setup
and pick metal (this is important!):
...
Multiple devices detected (You can override by setting PLAIDML_DEVICE_IDS).
Please choose a default device:
1 : llvm_cpu.0
2 : metal_intel(r)_uhd_graphics_630.0
3 : metal_amd_radeon_pro_560x.0
Default device? (1,2,3)[1]:3
...
Make sure you saved changes:
Save settings to /Users/alexanderegorov/.plaidml? (y,n)[y]:y
Success!
Now run MNIST example, you should see something like:
INFO:plaidml:Opening device "metal_amd_radeon_pro_560x.0"
This is it. I made a comparison using plaidbench keras mobilenet:
metal_amd_radeon_pro_560x.0 FASTEST!
opencl_amd_amd_radeon_pro_560x_compute_engine.0
llvm_cpu.0
I think there are two aspects to the observed situation: