
Data augmentation in test/validation set?

Is it common practice to augment data (add samples programmatically, such as random crops in the case of an image dataset) on both the training and test sets, or just on the training set?

rodrigo-silveira asked Dec 29 '17 23:12



People also ask

Should you apply data augmentation to validation set?

Not having a single validation fold, if anything, allows us to gauge how variable our learner's performance is. So yes, I would absolutely use data augmentation on the validation set, as I see the validation and training sets as natural extensions of one another in the case of repeated CV.

Does data augmentation reduce test error?

Abstract: Empirically, data augmentation sometimes improves and sometimes hurts test error, even when only adding points with labels from the true conditional distribution that the hypothesis class is expressive enough to fit.

Can I use validation set as test set?

Generally, the term “validation set” is used interchangeably with the term “test set” and refers to a sample of the dataset held back from training the model. Evaluating a model's skill on the training dataset would result in a biased score.

What is test time augmentation?

Test time augmentation (TTA) is a popular technique in computer vision. TTA aims at boosting the model accuracy by using data augmentation on the inference stage. The idea behind TTA is simple: for each test image, we create multiple versions that are a little different from the original (e.g., cropped or flipped).
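The TTA idea described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: images are nested lists of pixel values, `toy_model` is a hypothetical stand-in for a trained classifier, and `predict_with_tta` is a made-up helper name, not an API from any library.

```python
def identity(image):
    """Return the image unchanged (the 'original' view)."""
    return image


def horizontal_flip(image):
    """Mirror a 2D image (list of rows) left to right."""
    return [row[::-1] for row in image]


def toy_model(image):
    """Hypothetical classifier: score 0 sums the first column,
    score 1 sums the last column."""
    return [sum(row[0] for row in image), sum(row[-1] for row in image)]


def predict_with_tta(model, image, transforms):
    """Run the model on each transformed view and average the scores."""
    all_scores = [model(transform(image)) for transform in transforms]
    n_views = len(all_scores)
    n_classes = len(all_scores[0])
    return [sum(scores[i] for scores in all_scores) / n_views
            for i in range(n_classes)]


image = [[1, 0],
         [1, 0]]
averaged = predict_with_tta(toy_model, image, [identity, horizontal_flip])
```

Note that the test images themselves are not changed or added to; the augmented views exist only to smooth a single prediction at inference time.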


1 Answer

Only on the training set. Data augmentation is used to increase the size of the training set and to get more varied images. Technically, you could apply data augmentation to the test set to see how the model behaves on such images, but people usually don't.
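A minimal sketch of what "only on training" can look like in practice, assuming images are nested lists and using hypothetical helper names (`random_crop`, `center_crop`, `preprocess`); real projects would typically use an image library's transforms instead. Random augmentation is applied when `train=True`, while evaluation uses a fixed, deterministic crop so that test scores are repeatable.

```python
import random


def random_crop(image, size, rng):
    """Return a random size x size crop of a 2D image (list of rows)."""
    height, width = len(image), len(image[0])
    top = rng.randint(0, height - size)   # randint is inclusive on both ends
    left = rng.randint(0, width - size)
    return [row[left:left + size] for row in image[top:top + size]]


def center_crop(image, size):
    """Return the central size x size crop (deterministic)."""
    height, width = len(image), len(image[0])
    top = (height - size) // 2
    left = (width - size) // 2
    return [row[left:left + size] for row in image[top:top + size]]


def horizontal_flip(image):
    """Mirror a 2D image left to right."""
    return [row[::-1] for row in image]


def preprocess(image, size, train, rng=None):
    """Random crop + random flip for training; plain center crop otherwise."""
    if train:
        rng = rng or random.Random()
        crop = random_crop(image, size, rng)
        if rng.random() < 0.5:
            crop = horizontal_flip(crop)
        return crop
    return center_crop(image, size)


# A 4x4 toy image whose pixel values are 0..15.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
train_view = preprocess(image, 2, train=True, rng=random.Random(0))
test_view = preprocess(image, 2, train=False)
```

Calling `preprocess(image, 2, train=False)` repeatedly always yields the same crop, which is the property you want for validation and test data.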

Andrey Lukyanenko answered Sep 28 '22 03:09
