I am developing a Python application that uses TensorFlow together with another model that also runs on the GPU. My PC has three GPUs (3x NVIDIA GTX 1080), and since every model tries to grab all available GPUs I end up with an OUT_OF_MEMORY_ERROR. I have found that you can assign a specific GPU to a Python script with
os.environ['CUDA_VISIBLE_DEVICES'] = '1'
Here I attach a snippet of my FCN class
class FCN:
    def __init__(self):
        os.environ['CUDA_VISIBLE_DEVICES'] = '1'
        self.keep_probability = tf.placeholder(tf.float32, name="keep_probabilty")
        self.image = tf.placeholder(tf.float32, shape=[None, IMAGE_SIZE, IMAGE_SIZE, 3], name="input_image")
        self.annotation = tf.placeholder(tf.int32, shape=[None, IMAGE_SIZE, IMAGE_SIZE, 1], name="annotation")

        self.pred_annotation, logits = inference(self.image, self.keep_probability)
        tf.summary.image("input_image", self.image, max_outputs=2)
        tf.summary.image("ground_truth", tf.cast(self.annotation, tf.uint8), max_outputs=2)
        tf.summary.image("pred_annotation", tf.cast(self.pred_annotation, tf.uint8), max_outputs=2)
        self.loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
            logits=logits,
            labels=tf.squeeze(self.annotation, squeeze_dims=[3]),
            name="entropy"))
        tf.summary.scalar("entropy", self.loss)
        ...
Inside the same file, FCN.py, I have a small main which uses the class, and when TensorFlow prints its output I can see that only 1 GPU is used, as I expect.
if __name__ == "__main__":
    fcn = FCN()
    fcn.train_model()
    images_dir = '/home/super/datasets/MeterDataset/full-dataset-gas-images/'
    for img_file in os.listdir(images_dir):
        fcn.segment(os.path.join(images_dir, img_file))
Output:
2018-01-09 11:31:57.351029: I tensorflow/core/common_runtime/gpu/gpu_device.cc:955] Found device 0 with properties:
name: GeForce GTX 1080
major: 6 minor: 1 memoryClockRate (GHz) 1.7335
pciBusID 0000:09:00.0
Total memory: 7.92GiB
Free memory: 7.60GiB
2018-01-09 11:31:57.351047: I tensorflow/core/common_runtime/gpu/gpu_device.cc:976] DMA: 0
2018-01-09 11:31:57.351051: I tensorflow/core/common_runtime/gpu/gpu_device.cc:986] 0: Y
2018-01-09 11:31:57.351057: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:09:00.0)
The problem arises when I try to instantiate the FCN object from another script.
def main(args):
    start_time = datetime.now()
    font = cv2.FONT_HERSHEY_SIMPLEX
    results_file = "../results.txt"
    if os.path.exists(results_file):
        os.remove(results_file)
    results_file = open(results_file, "a")
    fcn = FCN()
Here the creation of the object always uses all 3 GPUs instead of using only the one assigned in the __init__() method.
Here is the undesired output:
2018-01-09 11:41:02.537548: I tensorflow/core/common_runtime/gpu/gpu_device.cc:976] DMA: 0 1 2
2018-01-09 11:41:02.537555: I tensorflow/core/common_runtime/gpu/gpu_device.cc:986] 0: Y Y Y
2018-01-09 11:41:02.537558: I tensorflow/core/common_runtime/gpu/gpu_device.cc:986] 1: Y Y Y
2018-01-09 11:41:02.537561: I tensorflow/core/common_runtime/gpu/gpu_device.cc:986] 2: Y Y Y
2018-01-09 11:41:02.537567: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX 1080, pci bus id: 0000:0b:00.0)
2018-01-09 11:41:02.537571: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:1) -> (device: 1, name: GeForce GTX 1080, pci bus id: 0000:09:00.0)
2018-01-09 11:41:02.537574: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1045] Creating TensorFlow device (/gpu:2) -> (device: 2, name: GeForce GTX 1080, pci bus id: 0000:05:00.0)
Here's what you can do:
Run your script with the CUDA_VISIBLE_DEVICES environment variable already set:
CUDA_VISIBLE_DEVICES=1 python another_script.py
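If you would rather keep everything in Python, a minimal sketch of the same idea (my own illustration, assuming nothing has initialized CUDA yet in the process) is to set the variable at the very top of the calling script, before any session is created:

# another_script.py -- hypothetical sketch: restrict the visible GPUs as early
# as possible, before any library creates a CUDA context or a Session.
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '1'   # only GPU 1 is exposed to this process

import tensorflow as tf   # imported after the variable is set
from FCN import FCN       # the class from the question

fcn = FCN()               # any Session created from here on sees a single GPU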
Provide an explicit configuration to the Session constructor:
config = tf.ConfigProto(device_count={'GPU': 1})
sess = tf.Session(config=config)
... to force TensorFlow to use only one GPU, no matter how many are available. You can also set a fine-grained list of devices via visible_device_list (see config.proto for the details).
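As a rough sketch of that second option (using the TF 1.x GPUOptions API; the index string refers to the devices the process can see):

# Restrict the Session to GPU 1 only. The indices refer to the GPUs
# visible to the process, i.e. after any CUDA_VISIBLE_DEVICES filtering.
gpu_options = tf.GPUOptions(visible_device_list='1')
config = tf.ConfigProto(gpu_options=gpu_options)
sess = tf.Session(config=config)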