Data Augmentation in PyTorch

I am a little confused about how data augmentation is performed in PyTorch. As far as I know, when we perform data augmentation, we KEEP our original dataset and then add other versions of it (flipping, cropping, etc.). But that doesn't seem to be happening in PyTorch. As far as I understand from the references, when we use data.transforms in PyTorch, the transforms are applied one by one. So for example:

data_transforms = {
    'train': transforms.Compose([
        transforms.RandomResizedCrop(224),
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
    'val': transforms.Compose([
        transforms.Resize(256),
        transforms.CenterCrop(224),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
    ]),
}

Here, for training, we first randomly crop the image and resize it to shape (224, 224). Then we take these (224, 224) images and horizontally flip them. Therefore, our dataset now contains ONLY the horizontally flipped images, so our original images are lost in this case.

Am I right? Is this understanding correct? If not, then where do we tell PyTorch in this code above (taken from Official Documentation) to keep the original images and resize them to the expected shape (224,224)?

Thanks

asked Aug 03 '18 17:08

Fawaz

5 Answers

I assume you are asking whether these data augmentation transforms (e.g. RandomHorizontalFlip) actually increase the size of the dataset, or whether they are applied to each item in the dataset one by one without adding to its size.

Running the following simple code snippet, we can observe that the latter is true: if you have a dataset of 8 images and create a PyTorch dataset object for it, then when you iterate through the dataset the transformations are called on each data point, and the transformed data point is returned. So, for example, with random flipping, some of the data points are returned as original and some are returned flipped (e.g. 4 flipped and 4 original). In other words, one iteration through the dataset items gives you 8 data points (some flipped and some not). [This is at odds with the conventional understanding of augmenting a dataset, e.g. in this case having 16 data points in the augmented dataset.]

import torch
from torch.utils.data import Dataset
from torchvision import transforms

class experimental_dataset(Dataset):

    def __init__(self, data, transform):
        self.data = data
        self.transform = transform

    def __len__(self):
        # number of samples = size of the first dimension
        return self.data.shape[0]

    def __getitem__(self, idx):
        # the transform runs every time an item is fetched
        item = self.data[idx]
        item = self.transform(item)
        return item

transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor()
])

x = torch.rand(8, 1, 2, 2)
print(x)

dataset = experimental_dataset(x, transform)

for item in dataset:
    print(item)

Results (the small differences in the floating-point values are caused by converting to a PIL image and back, which stores pixels as 8-bit integers):

Original dummy dataset:

tensor([[[[0.1872, 0.5518],
          [0.5733, 0.6593]]],

        [[[0.6570, 0.6487],
          [0.4415, 0.5883]]],

        [[[0.5682, 0.3294],
          [0.9346, 0.1243]]],

        [[[0.1829, 0.5607],
          [0.3661, 0.6277]]],

        [[[0.1201, 0.1574],
          [0.4224, 0.6146]]],

        [[[0.9301, 0.3369],
          [0.9210, 0.9616]]],

        [[[0.8567, 0.2297],
          [0.1789, 0.8954]]],

        [[[0.0068, 0.8932],
          [0.9971, 0.3548]]]])

Transformed dataset:

tensor([[[0.1843, 0.5490],
         [0.5725, 0.6588]]])
tensor([[[0.6549, 0.6471],
         [0.4392, 0.5882]]])
tensor([[[0.5647, 0.3255],
         [0.9333, 0.1216]]])
tensor([[[0.5569, 0.1804],
         [0.6275, 0.3647]]])
tensor([[[0.1569, 0.1176],
         [0.6118, 0.4196]]])
tensor([[[0.9294, 0.3333],
         [0.9176, 0.9608]]])
tensor([[[0.8549, 0.2275],
         [0.1765, 0.8941]]])
tensor([[[0.8902, 0.0039],
         [0.3529, 0.9961]]])

answered Sep 21 '22 04:09

Ashkan372

The transforms are applied to your original images each time a batch is generated. Your dataset is left unchanged; only the batch images are copied and transformed at every iteration.

The confusion may come from the fact that often, like in your example, transforms are used both for data preparation (resizing/cropping to expected dimensions, normalizing values, etc.) and for data augmentation (randomizing the resizing/cropping, randomly flipping the images, etc.).

What your data_transforms['train'] does is:

Randomly crop a region of the provided image (random size and aspect ratio) and resize it to obtain a (224, 224) patch
Apply or not a horizontal flip to this patch, with a 50/50 chance
Convert it to a Tensor
Normalize the resulting Tensor, given the mean and standard deviation values you provided

What your data_transforms['val'] does is:

Resize your image so that its shorter side is 256 pixels, keeping the aspect ratio
Center crop the resized image to obtain a (224, 224) patch
Convert it to a Tensor
Normalize the resulting Tensor, given the mean and standard deviation values you provided

(i.e. the random resizing/cropping for the training data is replaced by a fixed operation for the validation one, to have reliable validation results)

If you don't want your training images to be horizontally flipped with a 50/50 chance, just remove the transforms.RandomHorizontalFlip() line.

Similarly, if you want your images to always be center-cropped, replace transforms.RandomResizedCrop by transforms.Resize and transforms.CenterCrop, as done for data_transforms['val'].

answered Sep 22 '22 04:09

benjaminplanche

Yes, the dataset size does not change after the transformations. Every image is passed to the transformation and returned, so the size remains the same.

If you wish to use the original dataset together with the transformed one, concatenate them.

e.g. increased_dataset = torch.utils.data.ConcatDataset([transformed_dataset, original])

answered Sep 20 '22 04:09

mohit kaushik

TL;DR:

The transform operation applies a set of transforms, each with a certain probability, to the input batch that comes in each loop iteration. So the model is exposed to more examples over the course of multiple epochs.
Personally, when I was training an audio classification model on my own dataset, before augmentation my model always seemed to converge at 72% accuracy. I used augmentation along with an increased number of training epochs, which boosted the validation accuracy on the test set to 89%.

answered Sep 24 '22 04:09

bad programmer

The purpose of data augmentation is to increase the diversity of the training dataset.

Even though data.transforms doesn't change the size of the dataset, the transform operations are executed again every epoch when we iterate over the dataset, so we get different data each time.

I changed @Ashkan372's code slightly to output data for multiple epochs:

import torch
from torchvision import transforms
from torch.utils.data import Dataset
from torch.utils.data import DataLoader

class experimental_dataset(Dataset):
  def __init__(self, data, transform):
    self.data = data
    self.transform = transform

  def __len__(self):
    return self.data.shape[0]

  def __getitem__(self, idx):
    item = self.data[idx]
    item = self.transform(item)
    return item

transform = transforms.Compose([
  transforms.ToPILImage(),
  transforms.RandomHorizontalFlip(),
  transforms.ToTensor()
])

x = torch.rand(8, 1, 2, 2)
print('the original data: \n', x)

epoch_size = 3
batch_size = 4

dataset = experimental_dataset(x,transform)
for i in range(epoch_size):
  print('----------------------------------------------')
  print('the epoch', i, 'data: \n')
  for item in DataLoader(dataset, batch_size, shuffle=False):
    print(item)

The output is:

the original data: 
 tensor([[[[0.5993, 0.5898],
          [0.7365, 0.5472]]],


        [[[0.1878, 0.3546],
          [0.2124, 0.8324]]],


        [[[0.9321, 0.0795],
          [0.4090, 0.9513]]],


        [[[0.2825, 0.6954],
          [0.3737, 0.0869]]],


        [[[0.2123, 0.7024],
          [0.6270, 0.5923]]],


        [[[0.9997, 0.9825],
          [0.0267, 0.2910]]],


        [[[0.2323, 0.1768],
          [0.4646, 0.4487]]],


        [[[0.2368, 0.0262],
          [0.2423, 0.9593]]]])
----------------------------------------------
the epoch 0 data: 

tensor([[[[0.5882, 0.5961],
          [0.5451, 0.7333]]],


        [[[0.3529, 0.1843],
          [0.8314, 0.2118]]],


        [[[0.9294, 0.0784],
          [0.4078, 0.9490]]],


        [[[0.6941, 0.2824],
          [0.0863, 0.3725]]]])
tensor([[[[0.7020, 0.2118],
          [0.5922, 0.6235]]],


        [[[0.9804, 0.9961],
          [0.2902, 0.0235]]],


        [[[0.2314, 0.1765],
          [0.4627, 0.4471]]],


        [[[0.0235, 0.2353],
          [0.9569, 0.2392]]]])
----------------------------------------------
the epoch 1 data: 

tensor([[[[0.5882, 0.5961],
          [0.5451, 0.7333]]],


        [[[0.1843, 0.3529],
          [0.2118, 0.8314]]],


        [[[0.0784, 0.9294],
          [0.9490, 0.4078]]],


        [[[0.2824, 0.6941],
          [0.3725, 0.0863]]]])
tensor([[[[0.2118, 0.7020],
          [0.6235, 0.5922]]],


        [[[0.9804, 0.9961],
          [0.2902, 0.0235]]],


        [[[0.2314, 0.1765],
          [0.4627, 0.4471]]],


        [[[0.0235, 0.2353],
          [0.9569, 0.2392]]]])
----------------------------------------------
the epoch 2 data: 

tensor([[[[0.5882, 0.5961],
          [0.5451, 0.7333]]],


        [[[0.3529, 0.1843],
          [0.8314, 0.2118]]],


        [[[0.0784, 0.9294],
          [0.9490, 0.4078]]],


        [[[0.6941, 0.2824],
          [0.0863, 0.3725]]]])
tensor([[[[0.2118, 0.7020],
          [0.6235, 0.5922]]],


        [[[0.9961, 0.9804],
          [0.0235, 0.2902]]],


        [[[0.2314, 0.1765],
          [0.4627, 0.4471]]],


        [[[0.0235, 0.2353],
          [0.9569, 0.2392]]]])

We get different outputs in different epochs!

answered Sep 23 '22 04:09

Aodi Wu

Tags:

python

image-processing

dataset

pytorch

data-augmentation