Why I'm confused:
If I test my model on examples [A, B, C], it will obtain a certain accuracy. If I test the same model on examples [C, B, A], it should obtain the same accuracy. In other words, shuffling the examples shouldn't change my model's accuracy. But that's exactly what seems to be happening below:
Step-by-step:
Here is where I train the model:
model.fit_generator(batches, batches.nb_sample, nb_epoch=1, verbose=2,
                    validation_data=val_batches,
                    nb_val_samples=val_batches.nb_sample)
Here is where I test the model, without shuffling the validation set:
gen = ImageDataGenerator()
results = []
for _ in range(3):
    val_batches = gen.flow_from_directory(path+"valid", batch_size=batch_size*2,
                                          target_size=target_size, shuffle=False)
    result = model.evaluate_generator(val_batches, val_batches.nb_sample)
    results.append(result)
Here are the results (val_loss, val_acc):
[2.8174608421325682, 0.17300000002980231]
[2.8174608421325682, 0.17300000002980231]
[2.8174608421325682, 0.17300000002980231]
Notice that the validation accuracies are the same.
Here is where I test the model, with a shuffled validation set:
results = []
for _ in range(3):
    val_batches = gen.flow_from_directory(path+"valid", batch_size=batch_size*2,
                                          target_size=target_size, shuffle=True)
    result = model.evaluate_generator(val_batches, val_batches.nb_sample)
    results.append(result)
Here are the results (val_loss, val_acc):
[2.8174608802795409, 0.17299999999999999]
[2.8174608554840086, 0.1730000001192093]
[2.8174608268737793, 0.17300000059604645]
Notice that the validation accuracies are inconsistent, despite an unchanged validation set and an unchanged model. What's going on?
Note:
I'm evaluating on the entire validation set each time. model.evaluate_generator returns after evaluating the model on the number of examples equal to val_batches.nb_sample, which is the number of examples in the validation set.
This is a really interesting problem. The answer is that neural networks use the float32 format, which is not as precise as float64. Fluctuations like these are simply floating-point rounding error showing up: float addition is not associative, so when shuffling changes the order in which the per-batch losses are accumulated, the rounding falls slightly differently, even though the set of examples is identical.
In the case of your loss, you may notice that the differences only appear from about the 7th decimal digit onwards, which matches the roughly 7 significant decimal digits of precision that float32 offers. So, basically, you may assume that all the numbers presented in your example are equal in terms of their float32 representation.
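To see the mechanism in isolation, here is a minimal NumPy sketch, independent of Keras. The per-example losses are made-up random values (not your model's output); the point is only that accumulating the very same float32 numbers in a different order can change the result in the last significant digits:
import numpy as np

# Fake per-example losses in float32 (illustrative values, not your model's output).
rng = np.random.RandomState(0)
losses = rng.uniform(0.0, 5.0, size=1000).astype(np.float32)

# Accumulate in the original order.
total_a = np.float32(0.0)
for x in losses:
    total_a += x
mean_a = total_a / np.float32(len(losses))

# Accumulate the same values in a shuffled order.
total_b = np.float32(0.0)
for x in rng.permutation(losses):
    total_b += x
mean_b = total_b / np.float32(len(losses))

print(mean_a, mean_b)                         # typically differ around the 7th significant digit
print(np.isclose(mean_a, mean_b, rtol=1e-6))  # True: equal at float32 precision
This is analogous to what happens inside evaluate_generator when shuffle=True changes the batch composition and order: the examples are the same, only the order of the float32 accumulation changes, so the reported loss and accuracy wobble at roughly the 7th significant digit.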