I want to do a grid search over hyperparameters for neural nets. I have two GPUs, and I would like to run one model on the first GPU and another model with different parameters on the second. A first attempt, which doesn't work, looks like this:
import numpy as np
import tensorflow as tf
from keras.models import Sequential
from keras.layers import LSTM, Dense

with tf.device('/gpu:0'):
    model_1 = Sequential()
    model_1.add(embedding)  # the embedding layer is defined earlier in the code
    model_1.add(LSTM(50))
    model_1.add(Dense(5, activation='softmax'))
    model_1.compile(loss='categorical_crossentropy', optimizer='adam')
    model_1.fit(np.array(train_x), np.array(train_y), epochs=15, batch_size=15)

with tf.device('/gpu:1'):
    model_2 = Sequential()
    model_2.add(embedding)
    model_2.add(LSTM(100))
    model_2.add(Dense(5, activation='softmax'))
    model_2.compile(loss='categorical_crossentropy', optimizer='adam')
    model_2.fit(np.array(train_x), np.array(train_y), epochs=15, batch_size=15)
Edit: I ran my code again and did not get an error. However, the two models run sequentially rather than in parallel. Is it possible to use multithreading here? That is my next attempt, sketched below.
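Here is a rough sketch of the multithreaded attempt I have in mind, with one Python thread per GPU. This is only a sketch: it reuses the embedding layer, train_x, and train_y from above, and whether the two fits actually overlap depends on the TensorFlow version. Separate processes pinned to GPUs via CUDA_VISIBLE_DEVICES are often a more reliable way to run independent trainings in parallel.

import threading
import numpy as np
import tensorflow as tf
from keras.models import Sequential
from keras.layers import LSTM, Dense

def train_on_device(device, units, results, key):
    # Pin every op created in this thread to one GPU.
    with tf.device(device):
        model = Sequential()
        # Note: sharing one layer instance across models ties their weights;
        # building a fresh Embedding per thread may be safer.
        model.add(embedding)
        model.add(LSTM(units))
        model.add(Dense(5, activation='softmax'))
        model.compile(loss='categorical_crossentropy', optimizer='adam')
        model.fit(np.array(train_x), np.array(train_y), epochs=15, batch_size=15)
    results[key] = model

results = {}
threads = [
    threading.Thread(target=train_on_device, args=('/gpu:0', 50, results, 'lstm_50')),
    threading.Thread(target=train_on_device, args=('/gpu:1', 100, results, 'lstm_100')),
]
for t in threads:
    t.start()
for t in threads:
    t.join()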
There is a lot of discussion online about using multiple GPUs with Keras, but when it comes to running multiple models simultaneously, it is limited to running multiple models on a single GPU. The discussion of multiple GPUs also focuses on data parallelism and device parallelism, and I don't believe I want either of those, since I am not trying to split a single model across multiple GPUs. Is it possible to run two separate models simultaneously in Keras with two GPUs?
There are two ways to run a single model on multiple GPUs: data parallelism and device parallelism. In most cases, what you need is data parallelism, which consists of replicating the target model once on each device and using each replica to process a different fraction of the input data.
Keras multi-GPU training is not automatic. To use multiple GPUs with Keras, you can use the multi_gpu_model method, which copies your model across GPUs.
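For reference, a minimal sketch of multi_gpu_model, assuming the embedding layer and training arrays from the question. Note that this API has been deprecated in recent TensorFlow releases in favor of tf.distribute.Strategy, and that it parallelizes one model across GPUs rather than running two different models:

import numpy as np
import tensorflow as tf
from keras.models import Sequential
from keras.layers import LSTM, Dense
from keras.utils import multi_gpu_model

with tf.device('/cpu:0'):  # keep the template weights on the CPU
    model = Sequential()
    model.add(embedding)
    model.add(LSTM(50))
    model.add(Dense(5, activation='softmax'))

parallel_model = multi_gpu_model(model, gpus=2)  # one replica per GPU
parallel_model.compile(loss='categorical_crossentropy', optimizer='adam')
parallel_model.fit(np.array(train_x), np.array(train_y), epochs=15, batch_size=15)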
tf.distribute.Strategy is a TensorFlow API for distributing training across multiple GPUs, multiple machines, or TPUs. Using this API, you can distribute your existing models and training code with minimal code changes.
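A minimal sketch with tf.distribute.MirroredStrategy (TF 2.x), again assuming a tf.keras Embedding layer and the training arrays from the question; this spreads one model's batches across both GPUs rather than training two distinct models:

import numpy as np
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()  # finds all local GPUs
print('Number of replicas:', strategy.num_replicas_in_sync)

with strategy.scope():  # variables created here are mirrored on each GPU
    model = tf.keras.Sequential([
        embedding,  # assumes a tf.keras Embedding layer defined earlier
        tf.keras.layers.LSTM(50),
        tf.keras.layers.Dense(5, activation='softmax'),
    ])
    model.compile(loss='categorical_crossentropy', optimizer='adam')

model.fit(np.array(train_x), np.array(train_y), epochs=15, batch_size=15)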
PyTorch is an open-source scientific computing framework based on Python. You can use it to train machine learning models using tensor computations and GPUs, and it supports distributed training through the torch.distributed backend.
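For comparison, a minimal PyTorch sketch using torch.nn.DataParallel, with made-up layer sizes purely for illustration:

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 5))
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)  # splits each batch across available GPUs
model = model.to('cuda')

x = torch.randn(32, 128, device='cuda')  # a dummy batch of 32 samples
out = model(x)                           # forward pass spread over the GPUs
print(out.shape)                         # torch.Size([32, 5])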
A solution to this problem can be found here. However, the softmax activation function currently runs only on the CPU, so it is necessary to direct the CPU to perform the Dense layer:
with tf.device('/cpu:0'):
Switching between the CPU and the GPU does not seem to cause a noticeable slowdown. With LSTMs, though, it may be best to run the entire model on the CPU.
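A sketch of that workaround, assuming the embedding layer from the question. This placement pattern matches older graph-mode Keras, where ops are created as layers are added; in TF 2.x eager execution the placement may behave differently:

import tensorflow as tf
from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
with tf.device('/gpu:0'):
    model.add(embedding)
    model.add(LSTM(50))
with tf.device('/cpu:0'):
    model.add(Dense(5, activation='softmax'))  # softmax runs on the CPU
model.compile(loss='categorical_crossentropy', optimizer='adam')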