 

How to run classify_image on multiple GPUs?

I want to run vectorization on images using multiple GPUs (for now my script uses only one GPU). I have a list of images, a graph, and a session. The script's output is a saved vector. My machine has 3 NVIDIA GPUs. Environment: Ubuntu, Python 3.7, TensorFlow 2.0 (with GPU support). Here is my code example (session initialization):

import os

import numpy as np
import tensorflow as tf
from PIL import Image

def load_graph(frozen_graph_filename):
    # We load the protobuf file from disk and parse it to retrieve the
    # unserialized graph_def
    with tf.io.gfile.GFile(frozen_graph_filename, "rb") as f:
        graph_def = tf.compat.v1.GraphDef()
        graph_def.ParseFromString(f.read())
    # Then we import the graph_def into a new Graph and return it
    with tf.Graph().as_default() as graph:
        # The name var would prefix every op/node in the graph;
        # since we load everything into a new graph, this is not needed
        tf.import_graph_def(graph_def, name="")
    return graph

GRAPH = load_graph(os.path.join(settings.IMAGENET_PATH['PATH'], 'classify_image_graph_def.pb'))
config = tf.compat.v1.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.9
config.gpu_options.allow_growth = True
SESSION = tf.compat.v1.Session(graph=GRAPH, config=config)

After that, I run vectorization as:

sess = SESSION
for image_index, image in enumerate(image_list):
    with Image.open(image) as f:
        image_data = f.convert('RGB')
        feature_tensor = POOL_TENSOR
        feature_set = sess.run(feature_tensor, {'DecodeJpeg:0': image_data})
        feature_vector = np.squeeze(feature_set)
        outfile_name = os.path.basename(image) + ".vc"
        this_is_path = settings.VECTORS_DIR_PATH['PATH']
        out_path = os.path.join(this_is_path, outfile_name)
        np.savetxt(out_path, feature_vector, delimiter=',')

This working example processes 100 vectors in 29 seconds on the first GPU. So I tried the distributed training method from the TensorFlow docs to run on multiple GPUs:

mirrored_strategy = tf.distribute.MirroredStrategy()
with mirrored_strategy.scope():
    sess = SESSION
    # and here all the code from previous example after session:
    for image_index, image in enumerate(image_list):
        with Image.open(image) as f:
            image_data = f.convert('RGB')
            feature_tensor = POOL_TENSOR
            feature_set = sess.run(feature_tensor, {'DecodeJpeg:0': image_data})
            feature_vector = np.squeeze(feature_set)
            outfile_name = os.path.basename(image) + ".vc"
            this_is_path = settings.VECTORS_DIR_PATH['PATH']
            out_path = os.path.join(this_is_path, outfile_name)
            np.savetxt(out_path, feature_vector, delimiter=',')

After checking the logs, I can conclude that TensorFlow has access to all three GPUs. However, this changes nothing: when running, TensorFlow still uses only the first GPU (100 vectors in 29 seconds). As another method, I manually assigned each item to a concrete GPU instance:

sess = SESSION
for image_index, image in enumerate(image_list):
    # Round-robin assignment: image 0 -> /gpu:0, image 1 -> /gpu:1,
    # image 2 -> /gpu:2, image 3 -> /gpu:0, and so on.
    device_name = '/gpu:%d' % (image_index % 3)
    with tf.device(device_name):
        with Image.open(image) as f:
            image_data = f.convert('RGB')
            feature_tensor = POOL_TENSOR
            feature_set = sess.run(feature_tensor, {'DecodeJpeg:0': image_data})
            feature_vector = np.squeeze(feature_set)
            outfile_name = os.path.basename(image) + ".vc"
            this_is_path = settings.VECTORS_DIR_PATH['PATH']
            out_path = os.path.join(this_is_path, outfile_name)
            np.savetxt(out_path, feature_vector, delimiter=',')

Monitoring this method, I observe that every GPU is used, but there is no speedup because TensorFlow just swaps from one GPU device to another: on the first item GPU:0 is working while GPU:1 and GPU:2 are waiting, on the second item GPU:1 is working while GPU:0 and GPU:2 are waiting. I also tried another TensorFlow strategy from the tf docs - without any change. I also tried defining tf.Session() inside the for loop (sketched below) - without success. And I found this - but cannot make it work for my code.
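For reference, the Session-inside-the-loop attempt looked roughly like this (a reconstruction of my attempt, reusing the GRAPH, config and POOL_TENSOR defined above):

for image_index, image in enumerate(image_list):
    with tf.device('/gpu:%d' % (image_index % 3)):
        # A fresh session per image: this only adds session start-up
        # overhead and still does not spread the work across GPUs.
        sess = tf.compat.v1.Session(graph=GRAPH, config=config)
        with Image.open(image) as f:
            image_data = f.convert('RGB')
            feature_set = sess.run(POOL_TENSOR, {'DecodeJpeg:0': image_data})
        sess.close()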

My questions are:

1) Is there a way to modify tf.distribute.MirroredStrategy() so that TensorFlow uses all three GPUs?

2) If the answer to (1) is no, how can I run vectorization using the power of all the GPUs (maybe there is an async way of doing this, or something similar)?

Asked Dec 09 '19 by Dmitriy Kisil


1 Answer

The reason your MirroredStrategy (in the third code snippet) is not using all GPUs is that the model input is fed manually (through the TF1-style feature_tensor tensor), so TensorFlow does not know how to distribute the data across your GPUs automatically; you may take a look at the docs here.
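For contrast, here is a minimal sketch of the intended TF2-style usage, where the model and a tf.data input pipeline are both built under the strategy so TensorFlow can split each batch across the GPU replicas (the Keras model and dummy data below are illustrative, not your frozen graph):

import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    # Build the model inside the scope so its variables are mirrored
    # across all visible GPUs.
    model = tf.keras.applications.InceptionV3(weights=None)
    model.compile(optimizer='adam', loss='categorical_crossentropy')

# Dummy dataset just to show the pipeline shape; each batch of 8 is
# split automatically across the replicas during fit().
images = tf.random.uniform((32, 299, 299, 3))
labels = tf.random.uniform((32, 1000))
dataset = tf.data.Dataset.from_tensor_slices((images, labels)).batch(8)

model.fit(dataset, epochs=1)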

The fourth snippet (the last one) also fails because tf.device is not used correctly there: you should first construct the model graph under the device scope and only then run the graph in a session, rather than mixing the two. For example, you could try moving feature_set = sess.run(feature_tensor, {'DecodeJpeg:0': image_data}) outside the for loop. The guide here may illustrate this a bit better.
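If the goal is simply to keep all three GPUs busy on this embarrassingly parallel workload, one alternative is to import the frozen graph once per GPU and drive each session from its own thread. This is only a sketch under your setup: frozen_graph_path, image_list and out_dir are assumed names, and 'pool_3:0' is my assumption about what your POOL_TENSOR refers to.

import os
import threading

import numpy as np
import tensorflow as tf
from PIL import Image

NUM_GPUS = 3

# Parse the frozen graph once; frozen_graph_path is assumed to point at
# classify_image_graph_def.pb.
with tf.io.gfile.GFile(frozen_graph_path, "rb") as f:
    graph_def = tf.compat.v1.GraphDef()
    graph_def.ParseFromString(f.read())

def make_session(gpu_index):
    # Import the graph with every op pinned to one GPU, so each session
    # runs independently of the others.
    graph = tf.Graph()
    with graph.as_default(), tf.device('/gpu:%d' % gpu_index):
        tf.import_graph_def(graph_def, name="")
    config = tf.compat.v1.ConfigProto(allow_soft_placement=True)
    config.gpu_options.allow_growth = True
    return tf.compat.v1.Session(graph=graph, config=config)

def vectorize(sess, images):
    # 'pool_3:0' is the 2048-d feature tensor of the Inception graph;
    # adjust if your POOL_TENSOR is a different node.
    pool_tensor = sess.graph.get_tensor_by_name('pool_3:0')
    for image in images:
        with Image.open(image) as f:
            image_data = f.convert('RGB')
        feature_set = sess.run(pool_tensor, {'DecodeJpeg:0': image_data})
        feature_vector = np.squeeze(feature_set)
        out_path = os.path.join(out_dir, os.path.basename(image) + ".vc")
        np.savetxt(out_path, feature_vector, delimiter=',')

sessions = [make_session(i) for i in range(NUM_GPUS)]
threads = [threading.Thread(target=vectorize,
                            args=(s, image_list[i::NUM_GPUS]))
           for i, s in enumerate(sessions)]
for t in threads:
    t.start()
for t in threads:
    t.join()

Because Session.run releases the Python GIL while the graph executes on the GPU, plain threads are enough to overlap the three devices; separate processes would work as well.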

Answered Oct 10 '22 by Arron Cao