I built the GPU version of the Docker image https://github.com/floydhub/dl-docker with Keras 2.0.0 and TensorFlow 0.12.1. I then ran the MNIST tutorial https://github.com/fchollet/keras/blob/master/examples/mnist_cnn.py but realized that Keras is not using the GPU. Below is the output I get:
root@b79b8a57fb1f:~/sharedfolder# python test.py
Using TensorFlow backend.
Downloading data from https://s3.amazonaws.com/img-datasets/mnist.npz
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
Train on 60000 samples, validate on 10000 samples
Epoch 1/12
2017-09-06 16:26:54.866833: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-06 16:26:54.866855: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-06 16:26:54.866863: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-09-06 16:26:54.866870: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-09-06 16:26:54.866876: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
Can anyone let me know if there are settings that need to be made before Keras uses the GPU? I am very new to all of this, so do let me know if I need to provide more information.
I have installed the pre-requisites as mentioned on the page
I am able to launch the docker image
docker run -it -p 8888:8888 -p 6006:6006 -v /sharedfolder:/root/sharedfolder floydhub/dl-docker:cpu bash
I am able to run the last step
cv@cv-P15SM:~$ cat /proc/driver/nvidia/version
NVRM version: NVIDIA UNIX x86_64 Kernel Module 375.66 Mon May 1 15:29:16 PDT 2017
GCC version: gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.4)
I am able to run the step here
# Test nvidia-smi
cv@cv-P15SM:~$ nvidia-docker run --rm nvidia/cuda nvidia-smi
Thu Sep 7 00:33:06 2017
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.66                 Driver Version: 375.66                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 780M    Off  | 0000:01:00.0     N/A |                  N/A |
| N/A   55C    P0    N/A /  N/A |    310MiB /  4036MiB |     N/A      Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0                  Not Supported                                         |
+-----------------------------------------------------------------------------+
I am also able to run the nvidia-docker command to launch a gpu supported image.
What I have tried
I have tried the following suggestions:
I appended the suggested lines to my bashrc and have verified that the bashrc file is updated.
echo 'export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/cuda-8.0/lib64:/usr/local/cuda-8.0/extras/CUPTI/lib64' >> ~/.bashrc
echo 'export CUDA_HOME=/usr/local/cuda-8.0' >> ~/.bashrc
I added the following lines to my Python file:
import os
os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID" # see issue #152
os.environ["CUDA_VISIBLE_DEVICES"]="0"
Both steps, done separately or together, unfortunately did not solve the issue. Keras is still running with the CPU version of TensorFlow as its backend. However, I might have found the cause: I checked my TensorFlow version with the following commands and found two installations.
This is the CPU version
root@08b5fff06800:~# pip show tensorflow
Name: tensorflow
Version: 1.3.0
Summary: TensorFlow helps the tensors flow
Home-page: http://tensorflow.org/
Author: Google Inc.
Author-email: [email protected]
License: Apache 2.0
Location: /usr/local/lib/python2.7/dist-packages
Requires: tensorflow-tensorboard, six, protobuf, mock, numpy, backports.weakref, wheel
And this is the GPU version
root@08b5fff06800:~# pip show tensorflow-gpu
Name: tensorflow-gpu
Version: 0.12.1
Summary: TensorFlow helps the tensors flow
Home-page: http://tensorflow.org/
Author: Google Inc.
Author-email: [email protected]
License: Apache 2.0
Location: /usr/local/lib/python2.7/dist-packages
Requires: mock, numpy, protobuf, wheel, six
Interestingly, the output shows that Keras is using TensorFlow 1.3.0, which is the CPU version, and not 0.12.1, the GPU version:
import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras import backend as K
import tensorflow as tf
print('Tensorflow: ', tf.__version__)
Output
root@08b5fff06800:~/sharedfolder# python test.py
Using TensorFlow backend.
Tensorflow: 1.3.0
I guess now I need to figure out how to have keras use the gpu version of tensorflow.
If your system has an NVIDIA® GPU and you have the GPU version of TensorFlow installed then your Keras code will automatically run on the GPU.
TensorFlow code, and tf.keras models will transparently run on a single GPU with no code changes required. Note: Use tf.config.list_physical_devices('GPU') to confirm that TensorFlow is using the GPU. The simplest way to run on multiple GPUs, on one or many machines, is using Distribution Strategies.
allow_soft_placement allows operations to run on the CPU if any of the following criteria is met: there is no GPU implementation for the operation, or there are no GPU devices known or registered.
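As a rough illustration of the setting described above, the sketch below tries to turn on soft placement. It assumes a TensorFlow 2.x install where tf.config.set_soft_device_placement is available; on the older releases discussed in this question, the equivalent would be passing allow_soft_placement=True in a tf.ConfigProto to the session. The helper name enable_soft_placement is made up for this example.

```python
def enable_soft_placement():
    """Try to enable soft device placement; return True if it was applied.

    Hypothetical helper, for illustration only.
    """
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment.
        return False

    # tf.config does not exist on very old releases such as 0.12.
    config = getattr(tf, "config", None)
    setter = getattr(config, "set_soft_device_placement", None)
    if setter is None:
        # Older TensorFlow: configure the session with
        # tf.ConfigProto(allow_soft_placement=True) instead.
        return False

    setter(True)
    return True


if __name__ == "__main__":
    print("soft placement enabled:", enable_soft_placement())
```

With soft placement on, an operation that has no GPU kernel falls back to the CPU instead of raising an error.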
TensorFlow refers to the CPU on your local machine as /device:CPU:0 and to the first GPU as /GPU:0; additional GPUs are numbered sequentially. By default, if a GPU is available, TensorFlow will use it for every operation that has a GPU implementation.
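A quick way to check whether TensorFlow actually sees a GPU is to list its devices. The sketch below is only one possible approach: it uses tf.config.list_physical_devices where that exists (TensorFlow 2.1 and later) and otherwise falls back to device_lib.list_local_devices from older releases. The function name visible_gpus is an assumption made for this example.

```python
def visible_gpus():
    """Return the names of the GPUs TensorFlow can see (empty list if none).

    Hypothetical helper, for illustration only.
    """
    try:
        import tensorflow as tf
    except ImportError:
        # No TensorFlow at all, so no GPUs are visible to it.
        return []

    config = getattr(tf, "config", None)
    lister = getattr(config, "list_physical_devices", None)
    if lister is not None:
        # TensorFlow 2.1+ API.
        return [device.name for device in lister("GPU")]

    # Fallback for older releases that lack tf.config.list_physical_devices.
    from tensorflow.python.client import device_lib
    return [d.name for d in device_lib.list_local_devices()
            if d.device_type == "GPU"]


if __name__ == "__main__":
    gpus = visible_gpus()
    print("GPUs visible to TensorFlow:", gpus if gpus else "none")
```

An empty list on a machine that does have an NVIDIA GPU usually means the CPU-only build is the one being imported, or the CUDA libraries cannot be found.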
It is never a good idea to have both the tensorflow and tensorflow-gpu packages installed side by side (the one single time it happened to me accidentally, Keras was using the CPU version).
I guess now I need to figure out how to have keras use the gpu version of tensorflow.
You should simply remove both packages from your system, and then re-install tensorflow-gpu
[UPDATED after comment]:
pip uninstall tensorflow tensorflow-gpu
pip install tensorflow-gpu
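After reinstalling, it can help to confirm that only one TensorFlow distribution is left. The following is a minimal sketch, assuming Python 3.8 or newer (importlib.metadata is not available on the Python 2.7 shown in the question); the function names are hypothetical and chosen for this example.

```python
from importlib import metadata  # standard library, Python 3.8+


def tensorflow_distributions(installed):
    """Return the sorted TensorFlow distribution names found in `installed`."""
    wanted = {"tensorflow", "tensorflow-gpu"}
    # Normalize case and underscores so "tensorflow_gpu" also matches.
    normalized = {name.lower().replace("_", "-") for name in installed}
    return sorted(wanted & normalized)


def installed_distribution_names():
    """Collect the names of all distributions in the current environment."""
    names = []
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        if name:
            names.append(name)
    return names


if __name__ == "__main__":
    found = tensorflow_distributions(installed_distribution_names())
    if len(found) > 1:
        print("Both builds are installed; the CPU one may win:", found)
    else:
        print("TensorFlow distributions found:", found)
```

If both names are reported, uninstalling both and reinstalling only tensorflow-gpu, as above, should leave a single entry.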
Moreover, it is puzzling why you seem to use the floydhub/dl-docker:cpu container, while according to the instructions you should be using the floydhub/dl-docker:gpu one...
I had a similar issue: Keras didn't use my GPU. I had tensorflow-gpu installed into conda according to the instructions, but after installing Keras, the GPU was simply not listed as an available device. I realized that installing Keras adds the tensorflow package! So I had both the tensorflow and tensorflow-gpu packages. I found that there is a keras-gpu package available. After completely uninstalling keras, tensorflow, and tensorflow-gpu, and then installing tensorflow-gpu and keras-gpu, the problem was solved.
In the future, you can use virtual environments to keep the CPU and GPU builds of TensorFlow separate, for example (on Linux or macOS, use source activate or conda activate in place of activate):
conda create --name tensorflow python=3.5
activate tensorflow
pip install tensorflow
AND
conda create --name tensorflow-gpu python=3.5
activate tensorflow-gpu
pip install tensorflow-gpu