In TensorFlow 1.x there are options like use_unified_memory and per_process_gpu_memory_fraction, which can potentially trigger the use of CUDA unified memory (UVM). But how can this be done in TensorFlow 2.0?
https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/protobuf/config.proto
// If true, uses CUDA unified memory for memory allocations. If
// per_process_gpu_memory_fraction option is greater than 1.0, then unified
// memory is used regardless of the value for this field. See comments for
// per_process_gpu_memory_fraction field for more details and requirements
// of the unified memory. This option is useful to oversubscribe memory if
// multiple processes are sharing a single GPU while individually using less
// than 1.0 per process memory fraction.
bool use_unified_memory = 2;
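For reference, in the TF 1.x Python API this proto field is reachable as config.gpu_options.use_unified_memory; a minimal sketch of setting it directly (though, as the answer below notes, this flag alone does not appear to take effect):

import tensorflow as tf  # TF 1.x

config = tf.ConfigProto()
config.gpu_options.use_unified_memory = True  # the proto field quoted above
sess = tf.Session(config=config)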
To limit TensorFlow to a specific set of GPUs, use the tf.config.set_visible_devices method. In some cases it is desirable for the process to only allocate a subset of the available memory, or to only grow the memory usage as it is needed by the process.
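A minimal TF 2.x sketch of both knobs (it assumes at least one GPU is present; the index is illustrative):

import tensorflow as tf

gpus = tf.config.list_physical_devices('GPU')
if gpus:
    # Make only the first GPU visible to this process
    tf.config.set_visible_devices(gpus[0], 'GPU')
    # Allocate GPU memory on demand instead of reserving it all up front
    tf.config.experimental.set_memory_growth(gpus[0], True)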
If a TensorFlow operation has both CPU and GPU implementations, TensorFlow will prefer to place the operation on a GPU device. If you have more than one GPU, the GPU with the lowest ID is selected by default. However, TensorFlow does not automatically spread operations across multiple GPUs.
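To place ops on a particular GPU yourself, wrap them in a device scope; a small sketch (the /GPU:1 name assumes a second GPU exists on your machine):

import tensorflow as tf

# Pin these ops to the second GPU instead of the default lowest-ID one
with tf.device('/GPU:1'):
    a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
    b = tf.matmul(a, a)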
This type of memory is what integrated graphics (e.g., the Intel HD series) typically use. It is not on your NVIDIA GPU, and CUDA can't use it. TensorFlow can't use it when running on the GPU because CUDA can't use it, and not when running on the CPU either, because it's reserved for graphics, even if CUDA could somehow use it.
On multi-GPU systems with pre-Pascal GPUs, if some of the GPUs have peer-to-peer access disabled, the memory will be allocated so that it is initially resident on the CPU. Strictly speaking, you can restrict visibility of an allocation to a specific CUDA stream by using cudaStreamAttachMemAsync().
When code running on a CPU or GPU accesses data allocated this way (often called CUDA managed data), the CUDA system software and/or the hardware takes care of migrating memory pages to the memory of the accessing processor.
from tensorflow.compat.v1 import ConfigProto
from tensorflow.compat.v1 import InteractiveSession

config = ConfigProto()
# Setting the fraction above 1.0 is what triggers CUDA unified memory,
# letting the process oversubscribe GPU memory (here, up to 2x).
config.gpu_options.per_process_gpu_memory_fraction = 2
config.gpu_options.allow_growth = True
session = InteractiveSession(config=config)
If someone's looking to enable UVM in 1.x, just set per_process_gpu_memory_fraction over 1 (to any number you want); use_unified_memory by itself doesn't do anything.
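For reference, a pure TF 1.x sketch (the 2.0 is arbitrary; it caps the allocator at roughly twice the GPU's physical memory):

import tensorflow as tf  # TF 1.x

config = tf.ConfigProto()
# Any fraction > 1.0 switches the allocator to CUDA unified memory
config.gpu_options.per_process_gpu_memory_fraction = 2.0
with tf.Session(config=config) as sess:
    pass  # build and run the graph here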
Also, to work around another potential TensorFlow bug, you may want to move your model definition to after you establish the session, like this:

config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 2  # > 1 enables UVM
with tf.Session(config=config) as s:
    model = ...  # define the model here, after the session exists
    s.run(model)