model.fit_generator() fails with use_multiprocessing=True

Question

In the code example below, I can train the model only when NOT using multiprocessing.

My generator is straight from the tensorflow.keras.utils.Sequence description https://www.tensorflow.org/api_docs/python/tf/keras/utils/Sequence

Any idea how to fix the generator to allow multiprocessing?

Running on Windows 10, TensorFlow 1.13.1, Python 3.6.8.

import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras import layers
from tensorflow.keras.utils import Sequence


# Generator
class DataGenerator(Sequence):

    def __init__(self, dim, batch_size, n_channels):
        self.dim = dim
        self.batch_size = batch_size
        self.n_channels = n_channels

    def __len__(self):
        # Number of batches per epoch
        return 100

    def __getitem__(self, idx):
        # Random data stands in for a real batch; idx is unused
        X = np.random.randn(self.batch_size, self.dim, self.n_channels)
        Y = np.random.randn(self.batch_size, self.dim, 1)

        return X, Y


dim = 32
batch_size = 64
n_channels = 3

# Generators
training_generator = DataGenerator(dim, batch_size, n_channels)
validation_generator = DataGenerator(dim, batch_size, n_channels)


# Model
model = Sequential()
model.add(layers.GRU(128, return_sequences=True, 
                     batch_input_shape=[None, training_generator.dim, training_generator.n_channels]))
model.add(layers.Dense(1))

model.compile(loss='mse', optimizer='adam')


# This training procedure runs
model.fit_generator(generator=training_generator,
                    epochs = 2,
                    steps_per_epoch = 100,
                    max_queue_size = 32,
                    validation_data=validation_generator,
                    validation_steps = 20,
                    verbose=1)

# This training procedure fails (Only change is that I added the multiprocessing options)
model.fit_generator(generator=training_generator,
                    epochs = 2,
                    steps_per_epoch = 100,
                    max_queue_size = 32,
                    validation_data=validation_generator,
                    validation_steps = 20,
                    verbose=1,
                    use_multiprocessing=True,
                    workers=4)

I expected the second fit_generator() call to train the model like the first one. Instead, I get no output, not even an error message.

TML · Accepted Answer

I tried your code on an Ubuntu 18.04.2 LTS machine with Python 3.6.8 and TensorFlow 1.13.1. It works in both cases, as the log below shows:

2019-07-13 12:56:17.003119: I tensorflow/stream_executor/dso_loader.cc:152] successfully opened CUDA library libcublas.so.10.0 locally
100/100 [==============================] - 3s 27ms/step - loss: 0.9987
100/100 [==============================] - 10s 103ms/step - loss: 0.9973 - val_loss: 0.9987
Epoch 2/2
100/100 [==============================] - 3s 26ms/step - loss: 0.9955
100/100 [==============================] - 8s 83ms/step - loss: 1.0028 - val_loss: 0.9955
Multiprocessing=True ......
Epoch 1/2
100/100 [==============================] - 3s 32ms/step - loss: 0.9952
100/100 [==============================] - 9s 89ms/step - loss: 0.9962 - val_loss: 0.9952
Epoch 2/2
100/100 [==============================] - 3s 28ms/step - loss: 0.9967
100/100 [==============================] - 9s 86ms/step - loss: 0.9968 - val_loss: 0.9967

My suggestion is to first try CPU-only mode by putting BOTH the model and the fit_generator code under "with tf.device('/cpu:0'):". If that works, the problem is GPU related, for example the driver or the GPU-enabled TensorFlow installation. Most likely, the issue was caused by the GPU hanging.
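
A minimal sketch of that CPU-only test, assuming the same TensorFlow 1.13.1 setup and the model from the question. The helper name train_on_cpu is hypothetical and only used for illustration; the layers and fit_generator arguments are copied from the question. Later TensorFlow releases deprecate fit_generator in favor of fit, so this may need adjusting on newer versions.

```python
def train_on_cpu(training_generator, validation_generator, dim, n_channels):
    """Build and train the question's model with everything placed on the CPU.

    Hypothetical helper for illustration; not part of Keras or TensorFlow.
    """
    # Imported lazily so that defining this helper does not require TensorFlow.
    import tensorflow as tf
    from tensorflow.keras import layers
    from tensorflow.keras.models import Sequential

    # Both the model construction and the training call sit inside the
    # device scope, as suggested above.
    with tf.device('/cpu:0'):
        model = Sequential()
        model.add(layers.GRU(128, return_sequences=True,
                             batch_input_shape=[None, dim, n_channels]))
        model.add(layers.Dense(1))
        model.compile(loss='mse', optimizer='adam')

        model.fit_generator(generator=training_generator,
                            epochs=2,
                            steps_per_epoch=100,
                            max_queue_size=32,
                            validation_data=validation_generator,
                            validation_steps=20,
                            verbose=1,
                            use_multiprocessing=True,
                            workers=4)
    return model
```

With the generators from the question, this would be called as train_on_cpu(training_generator, validation_generator, dim, n_channels). If training completes here but hangs without the device scope, that points toward the GPU setup rather than the generator.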

Tags:

generator

multiprocessing

tensorflow

keras

Simon Schmickler
