Is data augmentation really needed in machine learning? [closed]

I am interested in knowing the importance of data augmentation (rotating images at various angles, flipping them) when preparing a dataset for a machine learning problem.

Is it really needed? Or will the CNN I am using handle that as well, no matter how the data is transformed?

So I set up a classification task with 2 classes to draw some conclusions:

  1. Arrow shapes
  2. Circle shapes

The idea is to train on shapes in only one orientation (I used arrows pointing right) and test the model on a different orientation (I used arrows pointing downwards) that is never shown during training.
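As a minimal sketch of this setup, the downward test orientation is simply the training orientation rotated by 90 degrees clockwise. The 5x5 grid below is a made-up toy image, not a sample from the dataset, and the helper names are only illustrative:

```python
# Toy 5x5 binary image of an arrow pointing right (1 = ink, 0 = background).
# Hypothetical data for illustration only; not taken from the actual dataset.
ARROW_RIGHT = [
    [0, 0, 1, 0, 0],
    [0, 0, 0, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0],
    [0, 0, 1, 0, 0],
]


def rotate_cw(image):
    """Rotate a 2D list of pixels by 90 degrees clockwise."""
    # Reversing the rows and then transposing gives a clockwise rotation.
    return [list(row) for row in zip(*image[::-1])]


def flip_horizontal(image):
    """Mirror a 2D list of pixels left to right."""
    return [row[::-1] for row in image]


# The training orientation (right) rotated once gives the test orientation (down).
ARROW_DOWN = rotate_cw(ARROW_RIGHT)

for row in ARROW_DOWN:
    print("".join("#" if pixel else "." for pixel in row))
```

Adding such rotated and flipped copies to the training set is what rotation and flip augmentation would amount to in this experiment.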

Some of the samples used in training:

[4 sample images]

Some of the samples used in testing:

[2 sample images]

This is the entire dataset I am using to create a TensorFlow model: https://bitbucket.org/akhileshmalviya/samples/src/bab50b85d826?at=master

I am puzzled by the results I got:

(i) Except for a few downward arrows, all the others are predicted correctly as arrows. Does this mean data augmentation is not needed at all?

(ii) Or is this even the right use case for understanding the importance of data augmentation?

Kindly share your thoughts. Any help would be really appreciated!

Karthik asked Jun 22 '17 09:06



1 Answer

Data augmentation is a data-dependent process.

In general, you need it when your training data is complex and you have only a few samples.

A neural network can easily learn to extract simple patterns like arcs or straight lines, and these patterns are enough to classify your data.

In your case data augmentation can barely help, because the features the network will learn to extract are simple and very different from each other.

When you instead have to deal with complex structures (cats, dogs, airplanes, ...), you can't rely on simple features like edges or arcs. Instead, you have to show your network that the instances you're trying to classify have a high variance and that the extracted features can be combined in many different ways for the same subject.

Think about a cat: it can be of any color, the picture can be taken in different light conditions, its whole body can be in any position, the picture could be taken at any orientation... To correctly classify such different instances, the network must learn to extract robust features, which it can learn only after seeing a lot of different inputs.
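A rough sketch of how augmentation supplies that variety from a single picture, in plain Python on a nested-list grayscale image. The function names, the tiny 3x3 image, and the brightness range are assumptions made for illustration; a real pipeline would use an image library rather than lists:

```python
import random


def rotate_cw(image):
    """Rotate a 2D list of pixels by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def flip_horizontal(image):
    """Mirror a 2D list of pixels left to right."""
    return [row[::-1] for row in image]


def adjust_brightness(image, factor):
    """Scale every pixel by factor, clamping the result to the 0..255 range."""
    return [[max(0, min(255, round(p * factor))) for p in row] for row in image]


def augment(image, rng):
    """Return one randomly transformed copy; the input image is left unchanged."""
    out = [row[:] for row in image]
    for _ in range(rng.randrange(4)):      # 0 to 3 quarter turns
        out = rotate_cw(out)
    if rng.random() < 0.5:                 # mirror half of the time
        out = flip_horizontal(out)
    # Simulate different light conditions (the range is an arbitrary choice).
    return adjust_brightness(out, rng.uniform(0.7, 1.3))


rng = random.Random(0)                     # fixed seed for repeatable runs
image = [[10, 200, 30], [40, 50, 60], [70, 80, 90]]  # made-up grayscale pixels
augmented = [augment(image, rng) for _ in range(8)]
print(len(augmented), "augmented variants generated from one image")
```

Each variant shows the same subject under a different orientation and lighting, which is the kind of variation the network would otherwise need many more real pictures to see.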

In your case, instead, simple features can completely discriminate your input, so any sort of data augmentation would help only a little.

nessuno answered Sep 25 '22 01:09