I am training a neural network in batches with the Keras 2.0 package for Python.
Below is some information about the data and the training parameters:
Below are some logs of the following code:
for i in range(epochs):
    print("train_model:: starting epoch {0}/{1}".format(i + 1, epochs))
    model.fit_generator(generator=batch_generator(data_train, target_train, batch_size),
                        steps_per_epoch=num_of_batches,
                        epochs=1,
                        verbose=1)
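For completeness, batch_generator simply yields (inputs, targets) tuples indefinitely, as fit_generator expects; a minimal sketch of such a generator (the array handling and lack of shuffling here are only illustrative) would be:

import numpy as np

def batch_generator(data, target, batch_size):
    # Loop forever, yielding one (inputs, targets) batch at a time;
    # the shapes, dtypes, and ordering here are assumptions.
    num_samples = len(data)
    while True:
        for start in range(0, num_samples, batch_size):
            end = start + batch_size
            yield np.asarray(data[start:end]), np.asarray(target[start:end])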
(partial) Logs:
train_model:: starting epoch 1/3
Epoch 1/1
1/406 [..............................] - ETA: 11726s - loss: 0.7993 - acc: 0.5996
2/406 [..............................] - ETA: 11237s - loss: 0.7260 - acc: 0.6587
3/406 [..............................] - ETA: 14136s - loss: 0.6619 - acc: 0.7279
404/406 [============================>.] - ETA: 53s - loss: 0.3542 - acc: 0.8917
405/406 [============================>.] - ETA: 26s - loss: 0.3541 - acc: 0.8917
406/406 [==============================] - 10798s - loss: 0.3539 - acc: 0.8918
train_model:: starting epoch 2/3
Epoch 1/1
1/406 [..............................] - ETA: 15158s - loss: 0.2152 - acc: 0.9424
2/406 [..............................] - ETA: 14774s - loss: 0.2109 - acc: 0.9419
3/406 [..............................] - ETA: 16132s - loss: 0.2097 - acc: 0.9408
404/406 [============================>.] - ETA: 64s - loss: 0.2225 - acc: 0.9329
405/406 [============================>.] - ETA: 32s - loss: 0.2225 - acc: 0.9329
406/406 [==============================] - 13127s - loss: 0.2225 - acc: 0.9329
train_model:: starting epoch 3/3
Epoch 1/1
1/406 [..............................] - ETA: 22631s - loss: 0.1145 - acc: 0.9756
2/406 [..............................] - ETA: 24469s - loss: 0.1220 - acc: 0.9688
3/406 [..............................] - ETA: 23475s - loss: 0.1202 - acc: 0.9691
404/406 [============================>.] - ETA: 60s - loss: 0.1006 - acc: 0.9745
405/406 [============================>.] - ETA: 31s - loss: 0.1006 - acc: 0.9745
406/406 [==============================] - 11147s - loss: 0.1006 - acc: 0.9745
My question is: what happens after each epoch that improves the accuracy like that? For example, the accuracy at the end of the first epoch is 0.8918, but at the beginning of the second epoch an accuracy of 0.9424 is observed. Similarly, the accuracy at the end of the second epoch is 0.9329, but the third epoch starts with an accuracy of 0.9756.
I would expect to find an accuracy of ~0.8918 at the beginning of the second epoch, and ~0.9329 at the beginning of the third epoch.
I know that in each batch there is one forward pass and one backward pass of training samples in the batch. Thus, in each epoch there is one forward pass and one backward pass of all training samples.
Also, from Keras documentation:
Epoch: an arbitrary cutoff, generally defined as "one pass over the entire dataset", used to separate training into distinct phases, which is useful for logging and periodic evaluation.
Why is the accuracy improvement within each epoch smaller than the accuracy improvement between the end of epoch X and the beginning of epoch X+1?
Increase epochs: increasing the number of epochs makes sense only if you have a lot of data in your dataset. However, your model will eventually reach a point where more epochs no longer improve accuracy; at that point, consider tuning your model's learning rate instead.
A very large number of epochs will not necessarily improve accuracy: beyond a certain point the model begins to overfit, while too few epochs lead to underfitting.
The broader process of improving a network's accuracy by adjusting settings such as the learning rate, batch size, and number of epochs is known as hyperparameter tuning.
What is the difference between batch and epoch? The batch size is the number of samples processed before the model's weights are updated; the number of epochs is the number of complete passes through the training dataset.
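As an illustration (using Keras's built-in fit with hypothetical in-memory arrays x_train and y_train), the two quantities are controlled by separate arguments:

# batch_size: number of samples processed before each weight update.
# epochs: number of complete passes over x_train.
# x_train and y_train are hypothetical in-memory NumPy arrays.
model.fit(x_train, y_train, batch_size=32, epochs=3, verbose=1)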
This has nothing to do with your model or your dataset; the reason for this "jump" lies in how metrics are calculated and displayed in Keras.
As Keras processes batch after batch, it saves the accuracy on each of them, and what it displays is not the accuracy of the latest processed batch but the average over all batches seen so far in the current epoch. And, as the model is being trained, the accuracy on successive batches tends to improve.
Now consider: if, say, there are 50 batches in the first epoch, and the network went from 0% to 90% accuracy over those 50 batches, then at the end of the epoch Keras will show an accuracy of roughly (0 + 0.1 + 0.5 + ... + 90) / 50 %, which is obviously much less than 90%. But because the actual accuracy is now 90%, the first batch of the second epoch will show about 90%, giving the impression of a sudden "jump" in quality. The same obviously goes for loss or any other metric.
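To make the arithmetic concrete, here is a small standalone sketch with made-up per-batch accuracies for a five-batch epoch:

# Hypothetical per-batch accuracies over one epoch of 5 batches.
batch_accs = [0.60, 0.70, 0.80, 0.88, 0.90]

# What the Keras progress bar shows after each batch: the running average
# over all batches seen so far in the current epoch.
running_avg = [sum(batch_accs[:i + 1]) / (i + 1) for i in range(len(batch_accs))]
print(running_avg[-1])   # 0.776 -> reported at the end of the epoch

# At the start of the next epoch the average resets, so the display reflects
# only the newest batches, which the (now better) model predicts at ~0.90.
print(batch_accs[-1])    # 0.90 -> roughly what the first batch of the next epoch shows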
Now, if you want a more realistic and trustworthy calculation of accuracy, loss, or any other metric you may be using, I would suggest using the validation_data parameter of model.fit[_generator] to provide validation data, which will not be used for training but only to evaluate the network at the end of each epoch, without averaging over different points in time.
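For example (assuming held-out arrays x_val and y_val that are not drawn from data_train), the loop from the question can be collapsed into a single call:

# x_val / y_val are assumed held-out arrays; validation_data can also be a generator.
model.fit_generator(generator=batch_generator(data_train, target_train, batch_size),
                    steps_per_epoch=num_of_batches,
                    epochs=epochs,                   # replaces the outer Python loop
                    validation_data=(x_val, y_val),  # evaluated at the end of each epoch
                    verbose=1)

The val_loss and val_acc printed at the end of each epoch are then computed in a single pass over the held-out data, so they do not suffer from the within-epoch averaging described above.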
The accuracy at the end of an epoch is the average accuracy over the full dataset, while the accuracy reported after each batch is the average over all batches processed so far in that epoch. It could be that your first batch is predicted very well and the following batches have lower accuracy; in that case the accuracy over the full dataset will be low compared to the accuracy of the first batch.