Error using Tensorflow with GPU

Tags:

gpgpu

I've tried a bunch of different Tensorflow examples, which works fine on the CPU but generates the same error when I'm trying to run them on the GPU. One little example is this:

import tensorflow as tf

# Creates a graph.
a = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[2, 3], name='a')
b = tf.constant([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], shape=[3, 2], name='b')
c = tf.matmul(a, b)
# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))
# Runs the op.
print sess.run(c)

The error is always the same, CUDA_ERROR_OUT_OF_MEMORY:

I tensorflow/stream_executor/dso_loader.cc:101] successfully opened CUDA library libcublas.so.7.0 locally
I tensorflow/stream_executor/dso_loader.cc:101] successfully opened CUDA library libcudnn.so.6.5 locally
I tensorflow/stream_executor/dso_loader.cc:101] successfully opened CUDA library libcufft.so.7.0 locally
I tensorflow/stream_executor/dso_loader.cc:101] successfully opened CUDA library libcuda.so locally
I tensorflow/stream_executor/dso_loader.cc:101] successfully opened CUDA library libcurand.so.7.0 locally
I tensorflow/core/common_runtime/local_device.cc:40] Local device intra op parallelism threads: 24
I tensorflow/core/common_runtime/gpu/gpu_init.cc:103] Found device 0 with properties: 
name: Tesla K80
major: 3 minor: 7 memoryClockRate (GHz) 0.8235
pciBusID 0000:0a:00.0
Total memory: 11.25GiB
Free memory: 105.73MiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:103] Found device 1 with properties: 
name: Tesla K80
major: 3 minor: 7 memoryClockRate (GHz) 0.8235
pciBusID 0000:0b:00.0
Total memory: 11.25GiB
Free memory: 133.48MiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:127] DMA: 0 1 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:137] 0:   Y Y 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:137] 1:   Y Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K80, pci bus id: 0000:0a:00.0)
I tensorflow/core/common_runtime/gpu/gpu_device.cc:702] Creating TensorFlow device (/gpu:1) -> (device: 1, name: Tesla K80, pci bus id: 0000:0b:00.0)
I tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:42] Allocating 105.48MiB bytes.
E tensorflow/stream_executor/cuda/cuda_driver.cc:932] failed to allocate 105.48M (110608384 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY
F tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:47] Check failed: gpu_mem != nullptr  Could not allocate GPU device memory for device 0. Tried to allocate 105.48MiB
Aborted (core dumped)

I guess that the problem has to do with my configuration rather than the memory usage of this tiny example. Does anyone have any idea?

Edit:

I've found out that the problem may be as simple as someone else running a job on the same GPU, which would explain the little amount of free memory. In that case: sorry for taking up your time...

931

asked Dec 29 '15 15:12

2 Answers

There appear to be two issues here:

By default, TensorFlow allocates a large fraction (95%) of the available GPU memory (on each GPU device) when you create a tf.Session. It uses a heuristic that reserves 200MB of GPU memory for "system" uses, but doesn't set this aside if the amount of free memory is smaller than that.
It looks like you have very little free GPU memory on either of your GPU devices (105.73MiB and 133.48MiB). This means that TensorFlow will attempt to allocate memory that should probably be reserved for the system, and hence the allocation fails.

Is it possible that you have another TensorFlow process (or some other GPU-hungry code) running while you attempt to run this program? For example, a Python interpreter with an open session—even if it is not using the GPU—will attempt to allocate almost the entire GPU memory.

Currently, the only way to restrict the amount of GPU memory that TensorFlow uses is the following configuration option (from this question):

# Assume that you have 12GB of GPU memory and want to allocate ~4GB:
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.333)

sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))

109

answered Oct 26 '22 13:10

mrry

This can happen because your TensorFlow session is not able to get sufficient amount of memory in the GPU. Maybe you have a low amount of free memory for other processes like TensorFlow or there is another TensorFlow session running in your system . so you have to configure the amount of memory the TensorFlow session will use

if you are using TensorFlow 1.x

gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.333)

sess = tf.Session(config=tf.ConfigProto(gpu_options=gpu_options))

as Tensorflow 2.x has undergone major changes from 1.x.if you want to use TensorFlow 1.x versions method/function there is a compatibility module kept in TensorFlow 2.x. So TensorFlow 2.x user can use this piece of code

gpu_options = tf.compat.v1.GPUOptions(per_process_gpu_memory_fraction=0.333)

sess = tf.compat.v1.Session(config=tf.compat.v1.ConfigProto(gpu_options=gpu_options))

answered Oct 26 '22 12:10

ujjal das

Related questions
                            
                                When to use volatile with shared CUDA Memory
                            
                                Difference between kernels construct and parallel construct
                            
                                Doing readback from Direct3D textures and surfaces
                            
                                Is it worth offloading FFT computation to an embedded GPU?
                            
                                Why does my OpenCL kernel fail on the nVidia driver, but not Intel (possible driver bug)?
                            
                                In OpenCL, what does mem_fence() do, as opposed to barrier()?
                            
                                Processing camera feed data on GPU (metal) and CPU (OpenCV) on iPhone
                            
                                Financial applications on GPGPU
                            
                                How to calculate the speedup of a GPU program?
                            
                                In OpenCL, what is the difference between platform, context, and device?
                            
                                How are 2D / 3D CUDA blocks divided into warps?
                            
                                Is there memory protection on GPUs
                            
                                Basic GPU application, integer calculations
                            
                                Why does CUDA code run so much faster in NVIDIA Visual Profiler?
                            
                                Any Lisp extensions for CUDA?
                            
                                Double precision floating point in CUDA
                            
                                The variation of cache misses in GPU
                            
                                Continuous Integration Service for GPU package?
                            
                                How many threads (or work-item) can run at the same time?
                            
                                Numpy, BLAS and CUBLAS

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Error using Tensorflow with GPU

Tags:

tensorflow

gpgpu

user5654767

People also ask

2 Answers

mrry

ujjal das

Recent Activity

Donate For Us