Setup:
Everything works fine, but I run out of video memory on large models when I increase the batch size to speed up training. I figure moving to a 4 GPU system would in theory either improve total memory available or allow smaller batches to build faster, but observing the the nvidia stats, I can see only one GPU is used by default:
+------------------------------------------------------+
| NVIDIA-SMI 361.42 Driver Version: 361.42 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 GRID K520 Off | 0000:00:03.0 Off | N/A |
| N/A 44C P0 45W / 125W | 3954MiB / 4095MiB | 94% Default |
+-------------------------------+----------------------+----------------------+
| 1 GRID K520 Off | 0000:00:04.0 Off | N/A |
| N/A 28C P8 17W / 125W | 11MiB / 4095MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 2 GRID K520 Off | 0000:00:05.0 Off | N/A |
| N/A 32C P8 17W / 125W | 11MiB / 4095MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
| 3 GRID K520 Off | 0000:00:06.0 Off | N/A |
| N/A 29C P8 17W / 125W | 11MiB / 4095MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
| 0 9862 C python34 3941MiB |
I know with raw Theano you can use manually multiple GPU's explicitly. Does Keras support use of multiple GPU's? If so, does it abstract it or do you need to map the GPU's to devices as in Theano and explicitly marshall computations to specific GPU's?
Using this API, you must: Instantiate a MirroredStrategy. During this process you have the option of configuring specific devices or using the default, which uses all available GPUs. With your strategy object, open a scope and create any Keras objects and variables needed.
Once multiple GPUs are added to your systems, you need to build parallelism into your deep learning processes. There are two main methods to add parallelism—models and data. Model parallelism is a method you can use when your parameters are too large for your memory constraints.
If you have more than one GPU, the GPU with the lowest ID will be selected by default. However, TensorFlow does not place operations into multiple GPUs automatically. To override the device placement to use multiple GPUs, we manually specify the device that a computation node should run on.
Can You Run Keras Models on GPU? GPUs are commonly used for deep learning, to accelerate training and inference for computationally intensive models. Keras is a Python-based, deep learning API that runs on top of the TensorFlow machine learning platform, and fully supports GPUs.
Multi-GPU training is experimental ("The code is rather new and is still considered experimental at this point. It has been tested and seems to perform correctly in all cases observed, but make sure to double-check your results before publishing a paper or anything of the sort.") and hasn't been integrated into Keras yet. However, you can use multiple GPUs with Keras with the Tensorflow backend: https://blog.keras.io/keras-as-a-simplified-interface-to-tensorflow-tutorial.html#multi-gpu-and-distributed-training.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With