I have the following plot:
The model is created with the following number of samples:
           class1   class2
train          20       20
validate       21       13
In my understanding, the plot shows there is no overfitting. But since the sample size is very small, I'm not confident the model generalizes well.
Is there any other way to measure overfitting besides the plot above?
This is my complete code:
library(keras)
library(tidyverse)
train_dir <- "data/train/"
validation_dir <- "data/validate/"
# Making model ------------------------------------------------------------
conv_base <- application_vgg16(
  weights = "imagenet",
  include_top = FALSE,
  input_shape = c(150, 150, 3)
)
# VGG16 based model -------------------------------------------------------
# Works better with regularizer
model <- keras_model_sequential() %>%
  conv_base() %>%
  layer_flatten() %>%
  layer_dense(units = 256, activation = "relu", kernel_regularizer = regularizer_l1(l = 0.01)) %>%
  layer_dense(units = 1, activation = "sigmoid")
summary(model)
length(model$trainable_weights)
freeze_weights(conv_base)
length(model$trainable_weights)
# Train model -------------------------------------------------------------
desired_batch_size <- 20
train_datagen <- image_data_generator(
  rescale = 1 / 255,
  rotation_range = 40,
  width_shift_range = 0.2,
  height_shift_range = 0.2,
  shear_range = 0.2,
  zoom_range = 0.2,
  horizontal_flip = TRUE,
  fill_mode = "nearest"
)
# Note that the validation data shouldn't be augmented!
test_datagen <- image_data_generator(rescale = 1 / 255)
train_generator <- flow_images_from_directory(
  train_dir,                        # Target directory
  train_datagen,                    # Data generator
  target_size = c(150, 150),        # Resizes all images to 150 × 150
  shuffle = TRUE,
  seed = 1,
  batch_size = desired_batch_size,  # was 20
  class_mode = "binary"             # binary_crossentropy loss for binary labels
)
validation_generator <- flow_images_from_directory(
  validation_dir,
  test_datagen,
  target_size = c(150, 150),
  shuffle = TRUE,
  seed = 1,
  batch_size = desired_batch_size,
  class_mode = "binary"
)
# Fine tuning -------------------------------------------------------------
unfreeze_weights(conv_base, from = "block3_conv1")
# Compile model -----------------------------------------------------------
model %>% compile(
  loss = "binary_crossentropy",
  optimizer = optimizer_rmsprop(lr = 2e-5),
  metrics = c("accuracy")
)
# Evaluate by epochs ---------------------------------------------------------------
# This records accuracy across epochs for the plot below (slow)
history <- model %>% fit_generator(
  train_generator,
  steps_per_epoch = 100,
  epochs = 15,  # was 50
  validation_data = validation_generator,
  validation_steps = 50
)
plot(history)
We can identify overfitting by looking at validation metrics such as loss or accuracy. Usually the validation metric stops improving after a certain number of epochs and begins to get worse, while the training metric keeps improving because the model keeps fitting the training data ever more closely.
You can compare the accuracy observed on both data sets to judge whether overfitting is present: if the model performs noticeably better on the training set than on the test set, it is likely overfitting.
Sometimes you will come across cases where the validation loss is lower than the training loss. That is an odd observation, because the model learns from the training set and should therefore predict it better, yet the training loss is the higher of the two.
Overfitting is easy to diagnose with the accuracy visualizations you have available. If "Accuracy" (measured against the training set) is very good and "Validation Accuracy" (measured against a validation set) is not as good, then your model is overfitting.
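As a quick numeric complement to the plot, you can pull the final-epoch numbers straight out of the history object from the question's code. This is only a minimal sketch; the metric names depend on your keras version, so adjust accordingly:
# Compare the final training and validation metrics from the fit history.
# Metric names differ across keras versions ("acc"/"val_acc" vs
# "accuracy"/"val_accuracy"), so pick whichever is present.
metrics <- history$metrics
acc_name <- if ("acc" %in% names(metrics)) "acc" else "accuracy"
train_acc  <- tail(metrics[[acc_name]], 1)
val_acc    <- tail(metrics[[paste0("val_", acc_name)]], 1)
train_loss <- tail(metrics$loss, 1)
val_loss   <- tail(metrics$val_loss, 1)
cat(sprintf("train acc: %.3f   val acc: %.3f\n", train_acc, val_acc))
cat(sprintf("train loss: %.3f  val loss: %.3f\n", train_loss, val_loss))
# A large gap (training much better than validation) points to overfitting.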
So two things here:
Stratify your data w.r.t. classes - your validation data has a completely different class distribution from your training set (the training set is balanced whereas the validation set is not). This might affect your loss and metric values. It's better to stratify your split so the class ratio is the same in both sets (a base-R sketch of this follows the list).
With so few data points, use more robust validation schemes - as you can see, you have only 74 images in total. In this case it's not a problem to load all the images into a numpy.array (you can still apply data augmentation using the flow function) and use validation schemes which are hard to set up when your data sits in folders. The schemes (from sklearn) which I advise you to use are stratified k-fold cross-validation and leave-one-out cross-validation (a sketch of these follows the list).
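The answer is phrased in Python terms (numpy, sklearn); since the question is in R, here is a rough base-R equivalent of point 1. It is a hypothetical sketch: class1_files and class2_files are assumed character vectors of image paths per class and do not appear in the question's code.
# Hypothetical sketch: stratified train/validation split in base R.
# class1_files and class2_files are assumed to hold the image paths per class.
set.seed(1)
stratified_split <- function(files, train_frac = 0.7) {
  train_idx <- sample(seq_along(files), size = round(train_frac * length(files)))
  list(train = files[train_idx], validate = files[-train_idx])
}
split1 <- stratified_split(class1_files)
split2 <- stratified_split(class2_files)
train_files    <- c(split1$train, split2$train)        # same class ratio
validate_files <- c(split1$validate, split2$validate)  # in both sets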
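And a rough base-R illustration of point 2's validation schemes (in Python you would reach for sklearn's StratifiedKFold or LeaveOneOut). Again a hypothetical sketch: labels is assumed to be a factor with one class label per image, in the same order as the loaded images.
# Hypothetical sketch: build stratified k-fold indices by hand in base R.
make_stratified_folds <- function(labels, k = 5) {
  folds <- integer(length(labels))
  for (cls in levels(labels)) {
    idx <- which(labels == cls)
    folds[idx] <- sample(rep(seq_len(k), length.out = length(idx)))
  }
  folds
}
set.seed(1)
folds <- make_stratified_folds(labels, k = 5)
# Leave-one-out is the limiting case: one fold per image.
loo_folds <- seq_along(labels)
# For each fold f: train on which(folds != f), validate on which(folds == f),
# re-fitting the model from scratch each time and averaging the metrics.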
Your validation loss is consistently lower than the training loss. I would be quite suspicious of your results; if you look at the validation accuracy, it simply shouldn't behave like that.
The less data you have, the less confidence you can have in anything. So you are right not to be sure about overfitting. The only thing that really helps here is to gather more data, either through data augmentation or by combining your data with another dataset.