I'm using ImageDataGenerator and flow_from_directory to generate my data, and model.fit_generator to fit the data.
By default this outputs the accuracy for the training data set only. There doesn't seem to be an option to output the validation accuracy to the terminal.
Here is the relevant portion of my code:
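# Imports assumed from the names used below (Keras 2.x)
import datetime
import os

from keras.applications.inception_v3 import InceptionV3
from keras.callbacks import ModelCheckpoint
from keras.layers import Dense, GlobalAveragePooling2D
from keras.models import Model
from keras.preprocessing.image import ImageDataGenerator

# Train data generator
print('Starting Preprocessing')
train_datagen = ImageDataGenerator(preprocessing_function=preprocess)
train_generator = train_datagen.flow_from_directory(
    train_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical')

# Same for validation
val_datagen = ImageDataGenerator(preprocessing_function=preprocess)
validation_generator = val_datagen.flow_from_directory(
    validation_data_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical')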
######################## Model Creation ###################################
# Create the base pre-trained model
print('Finished Preprocessing, starting model creation\n')
base_model = InceptionV3(weights='imagenet', include_top=False)

# Add a new classification head for the 12 classes
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(1024, activation='relu')(x)
predictions = Dense(12, activation='softmax')(x)
# Keras 2 uses the keyword names inputs/outputs (input/output is the Keras 1 spelling)
model = Model(inputs=base_model.input, outputs=predictions)

# Freeze everything except the last 34 layers
for layer in model.layers[:-34]:
    layer.trainable = False
for layer in model.layers[-34:]:
    layer.trainable = True

from keras.optimizers import SGD
model.compile(optimizer=SGD(lr=0.001, momentum=0.92),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
############# Save Model #######################################
# Save a full model checkpoint every 2 epochs; the epoch number goes into the file name
file_name = str(datetime.datetime.now()).split(' ')[0] + '_{epoch:02d}.hdf5'
filepath = os.path.join(save_dir, file_name)
checkpoints = ModelCheckpoint(filepath, monitor='val_acc', verbose=1,
                              save_best_only=False, save_weights_only=False,
                              mode='auto', period=2)
############### Fit Model #############################
model.fit_generator(
    train_generator,
    steps_per_epoch=total_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=total_validation // batch_size,
    callbacks=[checkpoints],
    shuffle=True)
UPDATE OUTPUT:
Throughout training, I'm only getting the training accuracy, but at the end of training, I'm getting both training and validation accuracy.
Epoch 1/10
1/363 [..............................] - ETA: 1:05:58 - loss: 2.4976 - acc: 0.0640
2/363 [..............................] - ETA: 51:33 - loss: 2.4927 - acc: 0.0760
3/363 [..............................] - ETA: 48:55 - loss: 2.5067 - acc: 0.0787
4/363 [..............................] - ETA: 47:26 - loss: 2.5110 - acc: 0.0770
5/363 [..............................] - ETA: 46:30 - loss: 2.5021 - acc: 0.0824
6/363 [..............................] - ETA: 45:56 - loss: 2.5063 - acc: 0.0820
You can do this by setting the validation_split argument on the fit() function to a percentage of the size of your training dataset. For example, a reasonable value might be 0.2 or 0.33 for 20% or 33% of your training data held back for validation.
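To make the held-back fraction concrete, here is a minimal sketch in plain Python of what a 20% split does to a list of samples. The helper name split_for_validation and the ten-element sample list are made up for illustration; this is not Keras code. It mirrors the documented behaviour that fit() takes the validation samples from the end of the data, before any shuffling.

```python
def split_for_validation(samples, validation_split):
    """Hold back the trailing fraction of `samples` for validation.

    Hypothetical helper for illustration only; Keras does the equivalent
    internally when validation_split is passed to fit().
    """
    if not 0.0 < validation_split < 1.0:
        raise ValueError("validation_split must be strictly between 0 and 1")
    n_val = int(len(samples) * validation_split)  # number of held-back samples
    split_at = len(samples) - n_val
    return samples[:split_at], samples[split_at:]


samples = list(range(10))                 # stand-in for 10 training examples
train, val = split_for_validation(samples, 0.2)
print(len(train), len(val))               # 8 training samples, 2 validation samples
```

Because the held-back samples come from the end of the data, it is usually worth shuffling the data yourself before calling fit() if it is ordered by class.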
Training for a very large number of epochs will not necessarily improve your accuracy. More epochs can increase accuracy up to a certain point, beyond which the model begins to overfit. Too few epochs will result in underfitting.
Increase Epochs: Increasing the number of epochs makes sense only if you have a lot of data in your dataset. However, your model will eventually reach a point where increasing epochs will not improve accuracy. At this point, you should consider playing around with your model's learning rate.
The right number of epochs depends on the inherent perplexity (or complexity) of your dataset. A good rule of thumb is to start with a value that is 3 times the number of columns in your data. If you find that the model is still improving after all epochs complete, try again with a higher value.
The idea is that you go through your validation set after each epoch, not after each batch. If you had to evaluate the model on the whole validation set after every batch, you would lose a lot of time.
After each epoch, you will have the corresponding losses and accuracies both for training and validation. But during one epoch, you will only have access to the training loss and accuracy.
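The schedule described above can be sketched as a plain Python loop. This is a schematic only, not how Keras is implemented: run_epoch, mean_of, and the toy batches are all hypothetical stand-ins, and the "loss" is just the mean of each batch so the example stays deterministic.

```python
def run_epoch(train_batches, val_batches, train_step, evaluate):
    """Train on every batch, then score the validation set once at the end."""
    logs = []
    for i, batch in enumerate(train_batches, start=1):
        loss = train_step(batch)            # training metric exists after every batch
        logs.append(("batch", i, loss))
    # The validation pass runs once per epoch, after the last training batch
    val_loss = sum(evaluate(b) for b in val_batches) / len(val_batches)
    logs.append(("epoch_end", len(train_batches), val_loss))
    return logs


def mean_of(batch):
    """Toy stand-in for a loss computation."""
    return sum(batch) / len(batch)


train_batches = [[1.0, 3.0], [2.0, 2.0], [0.0, 1.0]]
val_batches = [[1.0, 1.0], [3.0, 5.0]]

for entry in run_epoch(train_batches, val_batches, mean_of, mean_of):
    print(entry)
```

Only the final log entry carries a validation number, which matches the progress bar in the question: the per-batch lines show loss and acc, and the validation loss and accuracy are appended once the epoch finishes.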
In fit_generator,
fit_generator(generator, steps_per_epoch=None, epochs=1, verbose=1, callbacks=None, validation_data=None, validation_steps=None, validation_freq=1, class_weight=None, max_queue_size=10, workers=1, use_multiprocessing=False, shuffle=True, initial_epoch=0)
there is no validation_split parameter, so you can create two different ImageDataGenerator flows, one for training and one for validation, and then pass that validation generator as validation_data (together with validation_steps). Then it will print the validation loss and accuracy.
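When the validation data comes from a generator, validation_steps sets how many batches are drawn from it each time validation runs. A small sketch of the step arithmetic follows; the helper name steps_for and the sample counts are hypothetical, chosen only to show the difference between floor division (as in the question's code) and rounding up.

```python
import math


def steps_for(num_samples, batch_size, include_partial_batch=False):
    """Number of batches needed to walk through `num_samples` once.

    Floor division skips a trailing partial batch; rounding up includes it.
    Hypothetical helper for illustration only.
    """
    if include_partial_batch:
        return math.ceil(num_samples / batch_size)
    return num_samples // batch_size


total_validation = 1000   # assumed sample count, not from the question
batch_size = 32           # assumed batch size, not from the question

print(steps_for(total_validation, batch_size))                              # 31
print(steps_for(total_validation, batch_size, include_partial_batch=True))  # 32
```

With floor division the last 8 of the 1000 samples are not part of a given validation pass; rounding up covers every sample at the cost of one smaller batch.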