This question is about the data loading tutorial on the PyTorch website. I don't understand how they chose the values of mean_pix and std_pix passed to transforms.Normalize without calculating them anywhere.
I couldn't find an explanation of this on Stack Overflow.
import torch
from torchvision import transforms, datasets

data_transform = transforms.Compose([
    # Named RandomSizedCrop in older torchvision releases
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])
hymenoptera_dataset = datasets.ImageFolder(root='hymenoptera_data/train',
                                           transform=data_transform)
dataset_loader = torch.utils.data.DataLoader(hymenoptera_dataset,
                                             batch_size=4, shuffle=True,
                                             num_workers=4)
The values mean=[0.485, 0.456, 0.406] and std=[0.229, 0.224, 0.225] are not obvious to me. How were they obtained, and why these particular numbers?
The data can be normalized by subtracting the mean (µ) of each feature and dividing by its standard deviation (σ). This way, each feature has a mean of 0 and a standard deviation of 1.
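Mean: divide the sum of pixel values by the total pixel count, i.e. the number of pixels in the dataset, computed as len(df) * image_size * image_size (number of images times pixels per image).
Standard deviation: total_std = sqrt(psum_sq / count - total_mean ** 2), where psum_sq is the sum of squared pixel values and count is the same total pixel count. Both statistics are computed separately for each channel.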
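The running-sum approach described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: channel_mean_std and fake_images are hypothetical names, the images are assumed to be H x W x C arrays already scaled to [0, 1], and random arrays stand in for a real training set.

```python
import numpy as np

def channel_mean_std(images):
    """Per-channel mean and std over an iterable of H x W x C arrays."""
    psum = None      # running sum of pixel values, one entry per channel
    psum_sq = None   # running sum of squared pixel values, one entry per channel
    count = 0        # number of pixels seen so far (per channel)
    for img in images:
        img = np.asarray(img, dtype=np.float64)
        if psum is None:
            psum = np.zeros(img.shape[-1])
            psum_sq = np.zeros(img.shape[-1])
        psum += img.sum(axis=(0, 1))
        psum_sq += (img ** 2).sum(axis=(0, 1))
        count += img.shape[0] * img.shape[1]
    total_mean = psum / count
    # Same formula as in the text: sqrt(E[x^2] - E[x]^2)
    total_std = np.sqrt(psum_sq / count - total_mean ** 2)
    return total_mean, total_std

# Hypothetical stand-in data: 8 random RGB "images" of size 32 x 32
rng = np.random.default_rng(0)
fake_images = [rng.random((32, 32, 3)) for _ in range(8)]
mean, std = channel_mean_std(fake_images)
print(mean, std)
```

Only two running sums and a count are kept, so the dataset never has to fit in memory at once. With a real dataset you would iterate over the training images only, not the validation or test split.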
For the normalization input[channel] = (input[channel] - mean[channel]) / std[channel], the mean and standard deviation values are to be taken from the training dataset.
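To make the per-channel formula concrete, here is a small NumPy sketch of it. The normalize helper is a hypothetical stand-in written for illustration, not torchvision's implementation, and the image is random data in C x H x W layout.

```python
import numpy as np

def normalize(img, mean, std):
    """Apply (img[c] - mean[c]) / std[c] to a C x H x W array."""
    mean = np.asarray(mean, dtype=np.float64).reshape(-1, 1, 1)
    std = np.asarray(std, dtype=np.float64).reshape(-1, 1, 1)
    return (img - mean) / std

imagenet_mean = [0.485, 0.456, 0.406]
imagenet_std = [0.229, 0.224, 0.225]

# Hypothetical 3-channel image with values in [0, 1]
rng = np.random.default_rng(0)
img = rng.random((3, 4, 4))

# Normalizing with the ImageNet statistics from the question
out = normalize(img, imagenet_mean, imagenet_std)

# Normalizing with the data's own statistics gives mean 0 and std 1 per channel
own_mean = img.mean(axis=(1, 2))
own_std = img.std(axis=(1, 2))
standardized = normalize(img, own_mean, own_std)
print(standardized.mean(axis=(1, 2)), standardized.std(axis=(1, 2)))
```

The output has exactly zero mean and unit std only when the statistics come from the data being normalized. With the ImageNet values on a different dataset, the result is only approximately standardized.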
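Here, mean=[0.485, 0.456, 0.406] and std=[0.229, 0.224, 0.225] are the per-channel mean and std of the ImageNet dataset, for RGB pixel values scaled to [0, 1] (which is what ToTensor produces).
They were computed once over the whole dataset: "On Imagenet, we've done a pass on the dataset and calculated per-channel mean/std."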
The pre-trained models available in torchvision for transfer learning were pretrained on ImageNet, so using its mean and standard deviation is appropriate when fine-tuning your model.
If you're training your model from scratch, it is better to use the mean and standard deviation of your own training dataset (the face dataset in this case). Otherwise, in most cases the ImageNet mean and std are sufficient.