I have a huge list of NumPy arrays, where each array represents an image, and I want to load it using a torch.utils.data.DataLoader object. But the documentation of torch.utils.data.DataLoader mentions that it loads data directly from a folder. How do I adapt it for my case? I am new to PyTorch and any help would be greatly appreciated. My NumPy array for a single image looks something like this. The image is an RGB image.
[[[ 70 82 94] [ 67 81 93] [ 66 82 94] ..., [182 182 188] [183 183 189] [188 186 192]] [[ 66 80 92] [ 62 78 91] [ 64 79 95] ..., [176 176 182] [178 178 184] [180 180 186]] [[ 62 82 93] [ 62 81 96] [ 65 80 99] ..., [169 172 177] [173 173 179] [172 172 178]] ...,
To input a NumPy array to a neural network in PyTorch, you need to convert the numpy.array to a torch.Tensor.
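As a rough illustration of that conversion for a single image, here is a minimal sketch. The image is random made-up data standing in for one RGB array like the one in the question, and the channels-first layout and scaling to [0, 1] are common conventions assumed here, not something the question specifies.

```python
import numpy as np
import torch

# Hypothetical stand-in for one RGB image: height x width x channels, uint8.
image = np.random.randint(0, 256, size=(32, 48, 3), dtype=np.uint8)

# torch.from_numpy wraps the array without copying; the tensor shares its memory.
tensor_hwc = torch.from_numpy(image)

# Assumed convention: channels first (C, H, W), float32, values scaled to [0, 1].
tensor_chw = tensor_hwc.permute(2, 0, 1).float() / 255.0

print(tensor_hwc.shape, tensor_hwc.dtype)
print(tensor_chw.shape, tensor_chw.dtype)
```

Because torch.from_numpy shares memory with the source array, later in-place changes to the array show up in tensor_hwc; the .float() call produces an independent copy.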
The Dataset class is an abstract class used to define new (custom) dataset types. TensorDataset, by contrast, is a ready-to-use class that wraps data you already hold as tensors.
Dataset stores the samples and their corresponding labels, and DataLoader wraps an iterable around the Dataset to enable easy access to the samples.
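To make the custom route concrete, here is a minimal sketch of a Dataset subclass over an in-memory list of image arrays. The class name NumpyImageDataset, the random images, and the integer labels are all hypothetical; converting each image only when it is requested is one possible design, chosen so a huge list is never turned into one giant tensor up front.

```python
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader


class NumpyImageDataset(Dataset):
    """Hypothetical Dataset wrapping in-memory HWC uint8 images and integer labels."""

    def __init__(self, images, labels):
        assert len(images) == len(labels)
        self.images = images
        self.labels = labels

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        # Convert on access: HWC uint8 -> CHW float32 in [0, 1].
        image = torch.from_numpy(self.images[idx]).permute(2, 0, 1).float() / 255.0
        label = torch.tensor(self.labels[idx], dtype=torch.long)
        return image, label


# Made-up data: 10 random 32x32 RGB images with labels 0..9.
images = [np.random.randint(0, 256, size=(32, 32, 3), dtype=np.uint8) for _ in range(10)]
labels = list(range(10))

dataset = NumpyImageDataset(images, labels)
loader = DataLoader(dataset, batch_size=4, shuffle=True)

# Collect the shape of every image batch the loader yields.
batch_shapes = [tuple(batch_images.shape) for batch_images, _ in loader]
print(batch_shapes)
```

The DataLoader only needs __len__ and __getitem__ here; its default collation stacks the per-sample tensors into batches.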
I think what DataLoader actually requires is an input that subclasses Dataset. You can either write your own dataset class that subclasses Dataset or use TensorDataset, as I have done below:
    import numpy as np
    import torch
    from torch.utils.data import TensorDataset, DataLoader

    my_x = [np.array([[1.0, 2], [3, 4]]), np.array([[5.0, 6], [7, 8]])]  # a list of numpy arrays
    my_y = [np.array([4.0]), np.array([2.0])]  # another list of numpy arrays (targets)

    # Stack each list into a single array first; passing a list of arrays
    # straight to torch.Tensor works but is slow in recent PyTorch versions.
    tensor_x = torch.from_numpy(np.stack(my_x)).float()  # shape (2, 2, 2)
    tensor_y = torch.from_numpy(np.stack(my_y)).float()  # shape (2, 1)

    my_dataset = TensorDataset(tensor_x, tensor_y)  # create your dataset
    my_dataloader = DataLoader(my_dataset)  # create your dataloader
Works for me. Hope it helps you.
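For image data specifically, a loader built this way might be consumed as in the sketch below. The six tiny constant-valued images, the float targets, and the batch size of 4 are invented for illustration, and the move to a channels-first (N, C, H, W) layout is an assumption about what the downstream network expects.

```python
import numpy as np
import torch
from torch.utils.data import TensorDataset, DataLoader

# Made-up data: 6 RGB images (HWC, uint8) and one float target each.
images = [np.full((8, 8, 3), i, dtype=np.uint8) for i in range(6)]
targets = [np.array([float(i)]) for i in range(6)]

# Stack the lists into single arrays, then convert once.
x = torch.from_numpy(np.stack(images)).permute(0, 3, 1, 2).float()  # (N, C, H, W)
y = torch.from_numpy(np.stack(targets)).float()  # (N, 1)

loader = DataLoader(TensorDataset(x, y), batch_size=4, shuffle=False)

# Each iteration yields one batch of images and the matching targets.
for batch_x, batch_y in loader:
    print(batch_x.shape, batch_y.flatten().tolist())
```

With six samples and a batch size of 4, the last batch is smaller than the rest; passing drop_last=True to DataLoader would discard it instead.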