How do I keep track of the time the CPU is used vs the GPUs for deep learning?

I'm interested in knowing how much time of my script runtime is spent on the CPU vs the GPU - is there a way to track this?

I'm looking for a generic answer, but if that's too abstract, one for this toy example (from Keras's multi_gpu_model examples) would be great.

import tensorflow as tf
from keras.applications import Xception
from keras.utils import multi_gpu_model
import numpy as np
num_samples = 1000
height = 224
width = 224
num_classes = 1000
# Instantiate the base model (or "template" model).
# We recommend doing this under a CPU device scope,
# so that the model's weights are hosted on CPU memory.
# Otherwise they may end up hosted on a GPU, which would
# complicate weight sharing.
with tf.device('/cpu:0'):
    model = Xception(weights=None,
                     input_shape=(height, width, 3),
                     classes=num_classes)
# Replicates the model on 8 GPUs.
# This assumes that your machine has 8 available GPUs.
parallel_model = multi_gpu_model(model, gpus=8)
parallel_model.compile(loss='categorical_crossentropy',
                       optimizer='rmsprop')
# Generate dummy data.
x = np.random.random((num_samples, height, width, 3))
y = np.random.random((num_samples, num_classes))
# This `fit` call will be distributed on 8 GPUs.
# Since the batch size is 256, each GPU will process 32 samples.
parallel_model.fit(x, y, epochs=20, batch_size=256)
# Save model via the template model (which shares the same weights):
model.save('my_model.h5')
dward4 asked Apr 16 '18 13:04



1 Answer

All you need to do is add TensorFlow's Chrome-trace timeline profiling, which records both CPU and GPU activity, to your Keras model.

Here is an example from the TensorFlow issue tracker:

https://github.com/tensorflow/tensorflow/issues/9868#issuecomment-306188267

This is a more complicated example from the Keras issue tracker:

https://github.com/keras-team/keras/issues/6606#issuecomment-380196635

Finally, this is what the output of this profiling looks like:

https://towardsdatascience.com/howto-profile-tensorflow-1a49fb18073d
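
Once a trace file has been written, a rough CPU vs GPU split can be read off by summing event durations per device. Below is a minimal sketch of that, using only the standard library. It assumes the trace follows the Chrome Trace Event Format (a "traceEvents" list with "M" metadata events naming each process and "X" complete events carrying "ts" and "dur" in microseconds). The helper names, the file name timeline.json, and the sample data are all made up for illustration; the process labels in a real trace will differ.

```python
import json
from collections import defaultdict


def load_trace(path):
    """Load a Chrome-trace JSON file (hypothetical helper)."""
    with open(path) as f:
        return json.load(f)


def busy_time_per_process(trace):
    """Return {process label: busy microseconds} for a Chrome-trace dict.

    Overlapping events within one process are merged first, so ops that
    run in parallel on the same device are not counted twice.
    """
    events = trace["traceEvents"]

    # "M" metadata events named "process_name" map a pid to a readable label.
    names = {
        ev["pid"]: ev["args"]["name"]
        for ev in events
        if ev.get("ph") == "M" and ev.get("name") == "process_name"
    }

    # "X" (complete) events carry a start timestamp and a duration.
    intervals = defaultdict(list)
    for ev in events:
        if ev.get("ph") == "X":
            intervals[ev["pid"]].append((ev["ts"], ev["ts"] + ev["dur"]))

    busy = {}
    for pid, spans in intervals.items():
        spans.sort()
        total = 0
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start > cur_end:
                # Gap found: close the current merged interval.
                total += cur_end - cur_start
                cur_start, cur_end = start, end
            else:
                # Overlap: extend the current merged interval.
                cur_end = max(cur_end, end)
        total += cur_end - cur_start
        busy[names.get(pid, "pid %d" % pid)] = total
    return busy


# Synthetic stand-in for a real trace; labels and timings are invented.
SAMPLE_TRACE = {
    "traceEvents": [
        {"ph": "M", "name": "process_name", "pid": 1,
         "args": {"name": "/device:CPU:0 Compute"}},
        {"ph": "M", "name": "process_name", "pid": 2,
         "args": {"name": "/device:GPU:0 Compute"}},
        {"ph": "X", "pid": 1, "tid": 0, "name": "Cast", "ts": 0, "dur": 100},
        {"ph": "X", "pid": 1, "tid": 1, "name": "Slice", "ts": 50, "dur": 100},
        {"ph": "X", "pid": 1, "tid": 0, "name": "Concat", "ts": 300, "dur": 50},
        {"ph": "X", "pid": 2, "tid": 0, "name": "Conv2D", "ts": 100, "dur": 400},
        {"ph": "X", "pid": 2, "tid": 0, "name": "MatMul", "ts": 600, "dur": 100},
    ]
}

for label, micros in sorted(busy_time_per_process(SAMPLE_TRACE).items()):
    print("%-24s %7.3f ms" % (label, micros / 1000.0))
```

For a real run, replace SAMPLE_TRACE with load_trace("timeline.json"), where the path is whatever file your profiling code wrote. Note that this measures time in which at least one op was running on each device, not wall-clock time, so the per-device totals can add up to more than the script's runtime when the CPU and GPUs work concurrently.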

denfromufa answered Nov 04 '22 23:11