First training epoch is very slow

Question

Hi… I’m running mnist code in my P3 AWS machine and the initialization process seems to be very long compared to my previous P2 machine (although P3>P2)

Train on 60000 samples, validate on 10000 samples
Epoch 1/10
60000/60000 [==============================] - 265s 4ms/step - loss: 0.2674 - acc: 0.9175 - val_loss: 0.0602 - val_acc: 0.9811
Epoch 2/10
60000/60000 [==============================] - 3s 51us/step - loss: 0.0860 - acc: 0.9742 - val_loss: 0.0393 - val_acc: 0.9866
Epoch 3/10
60000/60000 [==============================] - 3s 50us/step - loss: 0.0647 - acc: 0.9808 - val_loss: 0.0338 - val_acc: 0.9884
Epoch 4/10
60000/60000 [==============================] - 3s 50us/step - loss: 0.0542 - acc: 0.9839 - val_loss: 0.0337 - val_acc: 0.9887
Epoch 5/10
60000/60000 [==============================] - 3s 50us/step - loss: 0.0453 - acc: 0.9863 - val_loss: 0.0311 - val_acc: 0.9900
Epoch 6/10
60000/60000 [==============================] - 3s 51us/step - loss: 0.0412 - acc: 0.9873 - val_loss: 0.0291 - val_acc: 0.9898
Epoch 7/10
60000/60000 [==============================] - 3s 50us/step - loss: 0.0368 - acc: 0.9891 - val_loss: 0.0300 - val_acc: 0.9901
Epoch 8/10
60000/60000 [==============================] - 3s 50us/step - loss: 0.0340 - acc: 0.9897 - val_loss: 0.0298 - val_acc: 0.9897
Epoch 9/10
60000/60000 [==============================] - 3s 50us/step - loss: 0.0320 - acc: 0.9908 - val_loss: 0.0267 - val_acc: 0.9916
Epoch 10/10
60000/60000 [==============================] - 3s 50us/step - loss: 0.0286 - acc: 0.9914 - val_loss: 0.0276 - val_acc: 0.9903
Test loss: 0.02757222411266339
Test accuracy: 0.9903

I’m using Keras=2.1.4 tensorflow-gpu=1.5.0

my keras.json file is configured as follows:

{
    "floatx": "float32",
    "epsilon": 1e-07,
    "backend": "tensorflow",
    "image_data_format": "channels_last"
}

Any ideas why is it like that?

Thanks in advance

Ali Abbasi · Accepted Answer

Based on this issue:

The first epoch takes the same time, but the counter also takes into account the time taken by building the part of the computational graph that deals with training (a few seconds). This used to be done during the compile step, but now it is done lazily one demand to avoid unnecessary work.

First training epoch is very slow

Tags:

tensorflow

deep-learning

keras

mnist

Jenia Golbstein

1 Answers

Ali Abbasi

Recent Activity

Donate For Us

First training epoch is very slow

Tags:

tensorflow

deep-learning

keras

mnist

Jenia Golbstein

1 Answers

Ali Abbasi

Related questions

Recent Activity

Donate For Us