I have implemented a Keras sequential model with the TensorFlow backend for image classification tasks. It has a few custom layers that replace the standard Keras layers such as Conv2D, max-pooling, etc. After adding these layers the accuracy is preserved, but the training time has increased several-fold. So I need to see whether these layers are slow in the forward pass, in the backward pass (backpropagation), or in both, and which of these operations may need to be optimized (using Eigen, etc.).
However, I couldn't find any method to get the time taken by each layer/op in the model. I checked the functionality of TensorBoard and Callbacks but couldn't figure out how they can help time the training at this level of detail. Is there any way to do this? Thanks for any help.
This is not straightforward, since every layer gets trained during every epoch. You can use callbacks to measure the epoch training time across the whole network; to get an approximate training time per layer, however, you have to do a sort of patchwork: freeze all layers except one, train for a few epochs with a timing callback, and repeat for each layer.
Note that this is NOT the actual runtime, but it lets you do a relative analysis of which layer takes proportionally more time than the others. The steps are in the code below.
# Imports needed for the snippets below
import time
import numpy as np
import matplotlib.pyplot as plt
from keras.callbacks import Callback
from keras.models import Model
from keras.layers import Input, Embedding, Conv1D, MaxPooling1D, Flatten, Dense, Dropout

# Callback class for time history (picked up this solution directly from StackOverflow)
class TimeHistory(Callback):
    def on_train_begin(self, logs={}):
        self.times = []
    def on_epoch_begin(self, epoch, logs={}):
        self.epoch_time_start = time.time()
    def on_epoch_end(self, epoch, logs={}):
        self.times.append(time.time() - self.epoch_time_start)

time_callback = TimeHistory()
# Model definition
inp = Input((inp_dims,))
embed_out = Embedding(vocab_size, 256, input_length=inp_dims)(inp)
x = Conv1D(filters=32, kernel_size=3, activation='relu')(embed_out)
x = MaxPooling1D(pool_size=2)(x)
x = Flatten()(x)
x = Dense(64, activation='relu')(x)
x = Dropout(0.5)(x)
x = Dense(32, activation='relu')(x)
x = Dropout(0.5)(x)
out = Dense(out_dims, activation='softmax')(x)
model = Model(inp, out)
model.summary()
# Function for approximate training time with each layer independently trained
def get_average_layer_train_time(epochs):
    # Loop through the layers, setting one trainable and all others non-trainable
    results = []
    for i in range(len(model.layers)):
        layer_name = model.layers[i].name  # store the layer name for printing
        # Set all layers as non-trainable
        for layer in model.layers:
            layer.trainable = False
        # Set the i-th layer as trainable
        model.layers[i].trainable = True
        # Recompile so the trainable change takes effect
        model.compile(optimizer='rmsprop', loss='sparse_categorical_crossentropy', metrics=['acc'])
        # Fit for a small number of epochs with the callback that records the time per epoch
        model.fit(X_train_pad, y_train_lbl,
                  epochs=epochs,
                  batch_size=128,
                  validation_split=0.2,
                  verbose=0,
                  callbacks=[time_callback])
        results.append(np.average(time_callback.times))
        # Print the average epoch time for this layer
        print(f"{layer_name}: Approx (avg) train time for {epochs} epochs = ", np.average(time_callback.times))
    return results
runtimes = get_average_layer_train_time(5)
plt.plot(runtimes)
#input_2: Approx (avg) train time for 5 epochs = 0.4942781925201416
#embedding_2: Approx (avg) train time for 5 epochs = 0.9014601230621337
#conv1d_2: Approx (avg) train time for 5 epochs = 0.822748851776123
#max_pooling1d_2: Approx (avg) train time for 5 epochs = 0.479401683807373
#flatten_2: Approx (avg) train time for 5 epochs = 0.47864508628845215
#dense_4: Approx (avg) train time for 5 epochs = 0.5149370670318604
#dropout_3: Approx (avg) train time for 5 epochs = 0.48329877853393555
#dense_5: Approx (avg) train time for 5 epochs = 0.4966880321502686
#dropout_4: Approx (avg) train time for 5 epochs = 0.48073616027832033
#dense_6: Approx (avg) train time for 5 epochs = 0.49605698585510255
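If you need op-level timings (including whether the cost sits in the forward or the backward pass) rather than this per-layer approximation, the built-in TensorBoard callback can also capture a profile of a selected batch. A minimal sketch, assuming TF 2.x and the same model/data names as above; the log directory is arbitrary, and you may need the tensorboard_plugin_profile package installed to see the Profile tab:

import tensorflow as tf

# Profile the 5th training batch; the trace shows per-op time for both the
# forward ops and the generated gradient (backward) ops.
tb_callback = tf.keras.callbacks.TensorBoard(log_dir='./logs', profile_batch=5)
model.fit(X_train_pad, y_train_lbl, epochs=1, batch_size=128,
          verbose=0, callbacks=[tb_callback])
# Then run: tensorboard --logdir ./logs  and open the Profile / Trace Viewer tab.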