I want to run vectorization on images using multiple GPUs (for now my script uses only one GPU). I have a list of images, a graph, and a session. The script's output is a saved vector. My machine has 3 NVIDIA GPUs. Environment: Ubuntu, Python 3.7, TensorFlow 2.0 (with GPU support). Here is my code example (session initialization):
import os
import numpy as np
import tensorflow as tf
from PIL import Image

def load_graph(frozen_graph_filename):
    # We load the protobuf file from disk and parse it to retrieve the
    # unserialized graph_def
    with tf.io.gfile.GFile(frozen_graph_filename, "rb") as f:
        graph_def = tf.compat.v1.GraphDef()
        graph_def.ParseFromString(f.read())
    # Then we import the graph_def into a new Graph and return it
    with tf.Graph().as_default() as graph:
        # The name var will prefix every op/node in your graph.
        # Since we load everything into a new graph, this is not needed
        tf.import_graph_def(graph_def, name="")
    return graph
GRAPH = load_graph(os.path.join(settings.IMAGENET_PATH['PATH'], 'classify_image_graph_def.pb'))
config = tf.compat.v1.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.9
config.gpu_options.allow_growth = True
SESSION = tf.compat.v1.Session(graph=GRAPH, config=config)
After that, I run the vectorization as:
sess = SESSION
for image_index, image in enumerate(image_list):
    with Image.open(image) as f:
        image_data = f.convert('RGB')
        feature_tensor = POOL_TENSOR
        feature_set = sess.run(feature_tensor, {'DecodeJpeg:0': image_data})
        feature_vector = np.squeeze(feature_set)
        outfile_name = os.path.basename(image) + ".vc"
        this_is_path = settings.VECTORS_DIR_PATH['PATH']
        out_path = os.path.join(this_is_path, outfile_name)
        np.savetxt(out_path, feature_vector, delimiter=',')
This working example produces 100 vectors in 29 seconds on the first GPU. So, I tried the distributed training method from the TensorFlow docs to run on multiple GPUs:
mirrored_strategy = tf.distribute.MirroredStrategy()
with mirrored_strategy.scope():
    sess = SESSION
    # and here all the code from the previous example after the session:
    for image_index, image in enumerate(image_list):
        with Image.open(image) as f:
            image_data = f.convert('RGB')
            feature_tensor = POOL_TENSOR
            feature_set = sess.run(feature_tensor, {'DecodeJpeg:0': image_data})
            feature_vector = np.squeeze(feature_set)
            outfile_name = os.path.basename(image) + ".vc"
            this_is_path = settings.VECTORS_DIR_PATH['PATH']
            out_path = os.path.join(this_is_path, outfile_name)
            np.savetxt(out_path, feature_vector, delimiter=',')
After checking the logs, I can conclude that TensorFlow has access to all three GPUs. However, this changes nothing: when running, TensorFlow still uses only the first GPU (100 vectors in 29 seconds). Another method I tried was to manually assign each item to a concrete GPU instance:
sess = SESSION
for image_index, image in enumerate(image_list):
    if image_index % 2 == 0:
        device_name = '/gpu:1'
    elif image_index % 3 == 0:
        device_name = '/gpu:2'
    else:
        device_name = '/gpu:0'
    with tf.device(device_name):
        with Image.open(image) as f:
            image_data = f.convert('RGB')
            feature_tensor = POOL_TENSOR
            feature_set = sess.run(feature_tensor, {'DecodeJpeg:0': image_data})
            feature_vector = np.squeeze(feature_set)
            outfile_name = os.path.basename(image) + ".vc"
            this_is_path = settings.VECTORS_DIR_PATH['PATH']
            out_path = os.path.join(this_is_path, outfile_name)
            np.savetxt(out_path, feature_vector, delimiter=',')
Monitoring this method, I observe that every GPU gets used, but there is no performance speedup because TensorFlow just swaps from one GPU device to another: on the first item GPU:0 is used while GPU:1 and GPU:2 are just waiting, on the second item GPU:1 is working while GPU:0 and GPU:2 are waiting.
I also tried another TensorFlow strategy from the tf docs, with no change in behaviour. I also tried defining tf.Session() inside the for loop, without success. And I found this, but cannot make it work for my code.
My questions are:
1) Is there a way to modify tf.distribute.MirroredStrategy() to make TensorFlow use all three GPUs?
2) If the answer to (1) is no, how can I run the vectorization using the power of all GPUs (maybe there is an async way of doing this, or something similar)?
The reason why your mirrored_strategy (from the third code snippet) is not using all GPUs is that your model input is fed manually (through the TF1-style feature_tensor tensor), so TensorFlow does not know how to distribute the data evenly across your GPUs automatically; you may take a look at the docs here.
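Until then, you have to shard the work yourself. Here is a minimal sketch of one common workaround, not code from the question: start one worker process per GPU and pin each process to a single device via CUDA_VISIBLE_DEVICES, so each worker runs the unmodified single-GPU loop on its own third of the image list. It reuses load_graph, settings, image_list, and the top-level imports from above, and assumes the pooling tensor can be fetched by name ('pool_3:0' is the usual pooling output of the Inception classify_image_graph_def.pb graph):
from multiprocessing import Process

NUM_GPUS = 3

def vectorize_shard(gpu_id, images):
    # Pin this process to a single GPU. This must happen before TensorFlow
    # initializes CUDA in this process, i.e. the parent must not have
    # created a Session before forking.
    os.environ['CUDA_VISIBLE_DEVICES'] = str(gpu_id)
    graph = load_graph(os.path.join(settings.IMAGENET_PATH['PATH'],
                                    'classify_image_graph_def.pb'))
    # Assumption: POOL_TENSOR was obtained by name; 'pool_3:0' is the
    # pooling output of classify_image_graph_def.pb.
    feature_tensor = graph.get_tensor_by_name('pool_3:0')
    with tf.compat.v1.Session(graph=graph) as sess:
        for image in images:
            with Image.open(image) as f:
                image_data = f.convert('RGB')
                feature_vector = np.squeeze(
                    sess.run(feature_tensor, {'DecodeJpeg:0': image_data}))
                out_path = os.path.join(settings.VECTORS_DIR_PATH['PATH'],
                                        os.path.basename(image) + '.vc')
                np.savetxt(out_path, feature_vector, delimiter=',')

if __name__ == '__main__':
    # Round-robin shards: worker i processes images i, i + 3, i + 6, ...
    workers = [Process(target=vectorize_shard, args=(i, image_list[i::NUM_GPUS]))
               for i in range(NUM_GPUS)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
Because each worker is a separate process with its own CUDA context, the three sessions run truly in parallel.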
The fourth snippet (the last one) also fails because it is used incorrectly: with tf.device() only affects ops while the graph is being constructed, so wrapping sess.run() calls on an already-built graph has no useful effect. You can try to first construct your model graph (with its device placement) and then run that graph in a session, rather than putting the two together; for example, move the feature_set = sess.run(feature_tensor, {'DecodeJpeg:0': image_data}) call outside the per-image for loop and feed the images in batches. The guide here may illustrate this a bit better.
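To make the "construct first, then run" advice concrete for a frozen TF1 graph, you can bake the device placement into the graph itself: import the same graph_def once per GPU under a tf.device scope with a distinct name prefix, then feed each replica from its own thread through a single session (Session.run is safe to call from multiple threads). This is a sketch of one possible pattern, not the only fix, under the same assumptions as above ('pool_3:0' / 'DecodeJpeg:0' tensor names, globals from the question):
import threading

NUM_GPUS = 3
PB_PATH = os.path.join(settings.IMAGENET_PATH['PATH'],
                       'classify_image_graph_def.pb')

graph = tf.Graph()
with graph.as_default():
    graph_def = tf.compat.v1.GraphDef()
    with tf.io.gfile.GFile(PB_PATH, 'rb') as f:
        graph_def.ParseFromString(f.read())
    # One replica per GPU; the name prefix keeps the three copies apart.
    for i in range(NUM_GPUS):
        with tf.device('/gpu:%d' % i):
            tf.import_graph_def(graph_def, name='gpu%d' % i)

config = tf.compat.v1.ConfigProto(allow_soft_placement=True)
sess = tf.compat.v1.Session(graph=graph, config=config)

def run_replica(i, images):
    # Fetch this replica's copies of the input and output tensors.
    feature_tensor = graph.get_tensor_by_name('gpu%d/pool_3:0' % i)
    input_name = 'gpu%d/DecodeJpeg:0' % i
    for image in images:
        with Image.open(image) as f:
            image_data = f.convert('RGB')
            feature_vector = np.squeeze(
                sess.run(feature_tensor, {input_name: image_data}))
            out_path = os.path.join(settings.VECTORS_DIR_PATH['PATH'],
                                    os.path.basename(image) + '.vc')
            np.savetxt(out_path, feature_vector, delimiter=',')

threads = [threading.Thread(target=run_replica, args=(i, image_list[i::NUM_GPUS]))
           for i in range(NUM_GPUS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
allow_soft_placement is needed because ops like DecodeJpeg have no GPU kernel and must fall back to the CPU.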