
Why does TensorFlow always use GPU 0?

I hit a problem when running TensorFlow inference on a multi-GPU setup.

Environment: Python 3.6.4; TensorFlow 1.8.0; CentOS 7.3; 2 × NVIDIA Tesla P4

Here is the nvidia-smi output when the system is idle:

Tue Aug 28 10:47:42 2018
| NVIDIA-SMI 384.81                 Driver Version: 384.81                    |
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|   0  Tesla P4            Off  | 00000000:00:0C.0 Off |                    0 |
| N/A   38C    P0    22W /  75W |      0MiB /  7606MiB |      0%      Default |
|   1  Tesla P4            Off  | 00000000:00:0D.0 Off |                    0 |
| N/A   39C    P0    23W /  75W |      0MiB /  7606MiB |      0%      Default |

| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|  No running processes found                                                 |

The key statements related to my issue:

import os
import tensorflow as tf

os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"

def get_sess_and_tensor(ckpt_path):
    assert os.path.exists(ckpt_path), "file: {} not exist.".format(ckpt_path)
    graph = tf.Graph()
    with graph.as_default():
        od_graph_def = tf.GraphDef()
        with tf.gfile.GFile(ckpt_path, "rb") as fid1:
            # Parse the serialized frozen graph before importing it.
            od_graph_def.ParseFromString(fid1.read())
            tf.import_graph_def(od_graph_def, name="")
        sess = tf.Session(graph=graph)
    with tf.device('/gpu:1'):
        tensor = graph.get_tensor_by_name("image_tensor:0")
        boxes = graph.get_tensor_by_name("detection_boxes:0")
        scores = graph.get_tensor_by_name("detection_scores:0")
        classes = graph.get_tensor_by_name('detection_classes:0')

    return sess, tensor, boxes, scores, classes

So the problem is: when I set the visible devices to '0,1', even if I set tf.device to GPU 1, nvidia-smi shows that only GPU 0 is used during inference (GPU 0's GPU-Util is high, almost 100%, whereas GPU 1's is 0). Why doesn't it use GPU 1?

I want to use the two GPUs in parallel, but even with the following code, it still uses only GPU 0:

with tf.device('/gpu:0'):
    tensor = graph.get_tensor_by_name("image_tensor:0")
    boxes = graph.get_tensor_by_name("detection_boxes:0")
with tf.device('/gpu:1'):
    scores = graph.get_tensor_by_name("detection_scores:0")
    classes = graph.get_tensor_by_name('detection_classes:0')

Any suggestions are greatly appreciated.



Wesley asked Aug 28 '18 05:08



1 Answer

The device names might be different depending on your setup.

Execute:

from tensorflow.python.client import device_lib
print(device_lib.list_local_devices())

And try using the device name for your second GPU exactly as listed there.
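To make the suggestion concrete, here is a minimal sketch, assuming the TensorFlow 1.x API used in the question. `gpu_device_names` and `load_graph_on_device` are hypothetical helper names, not part of TensorFlow. The sketch also relies on one point worth stating: as far as I understand, `tf.device` only affects ops that are created while the scope is active, and `graph.get_tensor_by_name` merely looks up tensors that already exist, so the scope is placed around `tf.import_graph_def` instead.

```python
def gpu_device_names(devices):
    """Return the names of all GPU entries in a device listing.

    `devices` is expected to look like the result of
    device_lib.list_local_devices(): objects with `name` and `device_type`.
    """
    return [d.name for d in devices if d.device_type == "GPU"]


def load_graph_on_device(pb_path, device_name):
    """Import a frozen graph with its ops requested on `device_name` (TF 1.x)."""
    # Imported lazily so that gpu_device_names() can be used without TensorFlow.
    import tensorflow as tf

    graph = tf.Graph()
    with graph.as_default():
        graph_def = tf.GraphDef()
        with tf.gfile.GFile(pb_path, "rb") as fid:
            graph_def.ParseFromString(fid.read())
        # The device scope must be active while the ops are created,
        # i.e. around import_graph_def, not around get_tensor_by_name.
        with tf.device(device_name):
            tf.import_graph_def(graph_def, name="")
    # Soft placement lets ops without a GPU kernel fall back to the CPU.
    config = tf.ConfigProto(allow_soft_placement=True)
    return tf.Session(graph=graph, config=config)
```

A possible use would be `names = gpu_device_names(device_lib.list_local_devices())` followed by `sess = load_graph_on_device(ckpt_path, names[1])`, after which the tensors can be fetched with `sess.graph.get_tensor_by_name(...)` as in the question.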

Amila answered Oct 04 '22 01:10
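For the "two GPUs in parallel" part of the question, a different approach that is often used for inference is one worker process per GPU, each started with `CUDA_VISIBLE_DEVICES` restricted to a single device. This is a swapped-in technique, not something the answer above proposes. The sketch below uses only the standard library; `worker_env` and `launch_workers` are hypothetical helper names, and the command to run (for example an inference script) is assumed to be supplied by the caller.

```python
import os
import subprocess


def worker_env(gpu_id, base_env=None):
    """Copy the environment and expose only one physical GPU to the child."""
    env = dict(os.environ if base_env is None else base_env)
    # Must be set before the child process initializes CUDA.
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return env


def launch_workers(command, gpu_ids):
    """Start one copy of `command` per GPU id and return the Popen handles."""
    return [subprocess.Popen(command, env=worker_env(g)) for g in gpu_ids]
```

For example, `launch_workers([sys.executable, "infer.py"], [0, 1])` would start two workers, where `infer.py` is a hypothetical script containing the inference code. Inside each worker the single visible GPU is numbered 0, so no `tf.device` call is needed there.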
