
How to get mini-batches in PyTorch in a clean and efficient way?

I was trying to do something simple: train a linear model with stochastic gradient descent (SGD) using torch:

import numpy as np
import torch
from torch.autograd import Variable

import pdb

def get_batch2(X,Y,M,dtype):
    X,Y = X.data.numpy(), Y.data.numpy()
    N = len(Y)
    valid_indices = np.array( range(N) )
    batch_indices = np.random.choice(valid_indices,size=M,replace=False)
    batch_xs = torch.FloatTensor(X[batch_indices,:]).type(dtype)
    batch_ys = torch.FloatTensor(Y[batch_indices]).type(dtype)
    return Variable(batch_xs, requires_grad=False), Variable(batch_ys, requires_grad=False)

def poly_kernel_matrix( x,D ):
    N = len(x)
    Kern = np.zeros( (N,D+1) )
    for n in range(N):
        for d in range(D+1):
            Kern[n,d] = x[n]**d;
    return Kern

## data params
N=5 # data set size
Degree=4 # number dimensions/features
D_sgd = Degree+1
##
x_true = np.linspace(0,1,N) # the real data points
y = np.sin(2*np.pi*x_true)
y.shape = (N,1)
## TORCH
dtype = torch.FloatTensor
# dtype = torch.cuda.FloatTensor # Uncomment this to run on GPU
X_mdl = poly_kernel_matrix( x_true,Degree )
X_mdl = Variable(torch.FloatTensor(X_mdl).type(dtype), requires_grad=False)
y = Variable(torch.FloatTensor(y).type(dtype), requires_grad=False)
## SGD mdl
w_init = torch.zeros(D_sgd,1).type(dtype)
W = Variable(w_init, requires_grad=True)
M = 5 # mini-batch size
eta = 0.1 # step size
for i in range(500):
    batch_xs, batch_ys = get_batch2(X_mdl,y,M,dtype)
    # Forward pass: compute predicted y using operations on Variables
    y_pred = batch_xs.mm(W)
    # Compute the loss. loss is a Variable of shape (1,), loss.data is a Tensor
    # of shape (1,), and loss.data[0] is a scalar value holding the loss.
    loss = (1/N)*(y_pred - batch_ys).pow(2).sum()
    # Use autograd to compute the backward pass. Now W will have gradients
    loss.backward()
    # Update weights using gradient descent; W.data and W.grad.data are Tensors
    W.data -= eta * W.grad.data
    # Manually zero the gradients after updating weights
    W.grad.data.zero_()

#
c_sgd = W.data.numpy()
X_mdl = X_mdl.data.numpy()
y = y.data.numpy()
#
Xc_pinv = np.dot(X_mdl,c_sgd)
print('J(c_sgd) = ', (1/N)*(np.linalg.norm(y-Xc_pinv)**2) )
print('loss = ',loss.data[0])

The code runs fine, although my get_batch2 method seems really naive. That is probably because I am new to PyTorch, but I have not found a good place where they discuss how to retrieve batches of data. I went through their tutorials (http://pytorch.org/tutorials/beginner/pytorch_with_examples.html) and through the data loading tutorial (http://pytorch.org/tutorials/beginner/data_loading_tutorial.html) with no luck. The tutorials all seem to assume that one already has the batch and batch size at the beginning and then proceeds to train with that data without changing it (see specifically http://pytorch.org/tutorials/beginner/pytorch_with_examples.html#pytorch-variables-and-autograd).

So my question is: do I really need to turn my data back into numpy so that I can fetch a random sample of it, and then turn it back into a PyTorch Variable to be able to train in memory? Is there no way to get mini-batches with torch?

I looked at a few functions torch provides, but with no luck:

#pdb.set_trace()
#valid_indices = torch.arange(0,N).numpy()
#valid_indices = np.array( range(N) )
#batch_indices = np.random.choice(valid_indices,size=M,replace=False)
#indices = torch.LongTensor(batch_indices)
#batch_xs, batch_ys = torch.index_select(X_mdl, 0, indices), torch.index_select(y, 0, indices)

Even though the code I provided works fine, I am worried that it is not an efficient implementation, and that using GPUs would cause a considerable further slowdown (because my guess is that putting things in memory and then fetching them back to put them on the GPU like that is silly).


I implemented a new version based on the answer that suggested using torch.index_select():

def get_batch2(X,Y,M):
    '''
    get batch for pytorch model
    '''
    # TODO fix and make it nicer, there is pytorch forum question
    #X,Y = X.data.numpy(), Y.data.numpy()
    X,Y = X, Y
    N = X.size()[0]
    batch_indices = torch.LongTensor( np.random.randint(0,N+1,size=M) )
    pdb.set_trace()
    batch_xs = torch.index_select(X,0,batch_indices)
    batch_ys = torch.index_select(Y,0,batch_indices)
    return Variable(batch_xs, requires_grad=False), Variable(batch_ys, requires_grad=False)

However, this seems to have issues: it does not work if X, Y are NOT Variables, which is really odd. I posted about it on the PyTorch forum: https://discuss.pytorch.org/t/how-to-get-mini-batches-in-pytorch-in-a-clean-and-efficient-way/10322

Right now what I am struggling with is making this work on the GPU. My most recent version:

def get_batch2(X,Y,M,dtype):
    '''
    get batch for pytorch model
    '''
    # TODO fix and make it nicer, there is pytorch forum question
    #X,Y = X.data.numpy(), Y.data.numpy()
    X,Y = X, Y
    N = X.size()[0]
    if dtype == torch.cuda.FloatTensor:
        batch_indices = torch.cuda.LongTensor( np.random.randint(0,N,size=M) ) # without replacement
    else:
        batch_indices = torch.LongTensor( np.random.randint(0,N,size=M) ).type(dtype) # without replacement
    pdb.set_trace()
    batch_xs = torch.index_select(X,0,batch_indices)
    batch_ys = torch.index_select(Y,0,batch_indices)
    return Variable(batch_xs, requires_grad=False), Variable(batch_ys, requires_grad=False)

The error:

RuntimeError: tried to construct a tensor from a int sequence, but found an item of type numpy.int64 at index (0) 

I don't get it. Do I really have to do:

ints = [ random.randint(0,N) for i in range(M)]

to get the integers?

It would also be ideal if the data could be a Variable, but it seems that torch.index_select does not work for Variable type data.

The list-of-integers approach still doesn't work either:

TypeError: torch.addmm received an invalid combination of arguments - got (int, torch.cuda.FloatTensor, int, torch.cuda.FloatTensor, torch.FloatTensor, out=torch.cuda.FloatTensor), but expected one of:
 * (torch.cuda.FloatTensor source, torch.cuda.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
 * (torch.cuda.FloatTensor source, torch.cuda.sparse.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
 * (float beta, torch.cuda.FloatTensor source, torch.cuda.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
 * (torch.cuda.FloatTensor source, float alpha, torch.cuda.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
 * (float beta, torch.cuda.FloatTensor source, torch.cuda.sparse.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
 * (torch.cuda.FloatTensor source, float alpha, torch.cuda.sparse.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
 * (float beta, torch.cuda.FloatTensor source, float alpha, torch.cuda.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
      didn't match because some of the arguments have invalid types: (int, torch.cuda.FloatTensor, int, torch.cuda.FloatTensor, torch.FloatTensor, out=torch.cuda.FloatTensor)
 * (float beta, torch.cuda.FloatTensor source, float alpha, torch.cuda.sparse.FloatTensor mat1, torch.cuda.FloatTensor mat2, *, torch.cuda.FloatTensor out)
      didn't match because some of the arguments have invalid types: (int, torch.cuda.FloatTensor, int, torch.cuda.FloatTensor, torch.FloatTensor, out=torch.cuda.FloatTensor)
asked Jul 15 '17 by Charlie Parker


People also ask

What is mini batch PyTorch?

Mini-batch gradient descent is meant to be the best of the two extremes: instead of a single sample or the whole dataset, a small batch of the dataset is considered and the parameters are updated accordingly. For a dataset of 100 samples, a batch size of 4 means we have 25 batches.
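As a quick check of that arithmetic (a trivial illustrative snippet, not from the original page):

n_samples, batch_size = 100, 4
n_batches = n_samples // batch_size
print(n_batches)  # 25 batches per epoch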

How does batch size work in PyTorch?

PyTorch DataLoader batch size: batch size is defined as the number of samples processed before the model is updated. In full-batch gradient descent the batch size equals the number of samples in the training data; with mini-batches it is smaller.

Why do we use mini batch?

In mini-batch GD, we use a subset of the dataset to take each step in the learning process. The mini-batch size can therefore be greater than one and less than the size of the complete training set.


2 Answers

If I'm understanding your code correctly, your get_batch2 function appears to be taking random mini-batches from your dataset without tracking which indices you've used already in an epoch. The issue with this implementation is that it likely will not make use of all of your data.

The way I usually do batching is to create a random permutation of all the possible indices with torch.randperm(N) and loop through them in batches. For example:

n_epochs = 100 # or whatever
batch_size = 128 # or whatever

for epoch in range(n_epochs):

    # X is a torch Variable
    permutation = torch.randperm(X.size()[0])

    for i in range(0, X.size()[0], batch_size):
        optimizer.zero_grad()

        indices = permutation[i:i+batch_size]
        batch_x, batch_y = X[indices], Y[indices]

        # in case you wanted a semi-full example
        outputs = model.forward(batch_x)
        loss = lossfunction(outputs, batch_y)

        loss.backward()
        optimizer.step()

If you'd like to copy and paste, make sure you define your optimizer, model, and lossfunction somewhere before the start of the epoch loop.
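For the question's manual-SGD setup (no optimizer object), the same permutation pattern might look roughly like this; a sketch, assuming W, X_mdl, y, eta, n_epochs, and batch_size are defined as in the question:

for epoch in range(n_epochs):
    permutation = torch.randperm(X_mdl.size(0))
    for i in range(0, X_mdl.size(0), batch_size):
        indices = permutation[i:i + batch_size]
        batch_xs, batch_ys = X_mdl[indices], y[indices]

        y_pred = batch_xs.mm(W)
        loss = (y_pred - batch_ys).pow(2).mean()  # batch MSE

        loss.backward()
        W.data -= eta * W.grad.data   # manual SGD step, as in the question
        W.grad.data.zero_()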

With regards to your error, try using torch.from_numpy(np.random.randint(0,N,size=M)).long() instead of torch.LongTensor(np.random.randint(0,N,size=M)). I'm not sure if this will solve the error you are getting, but it will solve a future error.
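In other words, the index construction in get_batch2 might become something like this (just a sketch; make_batch_indices is a hypothetical helper, and the .cuda() call is only needed when X and Y live on the GPU):

import numpy as np
import torch

def make_batch_indices(N, M, use_cuda=False):
    # from_numpy keeps the numpy int64 values as a LongTensor and avoids the
    # "tried to construct a tensor from a int sequence" error
    idx = torch.from_numpy(np.random.randint(0, N, size=M)).long()
    return idx.cuda() if use_cuda else idx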

answered by saetch_g


Use data loaders.

Data Set

First you define a dataset. You can use the packaged datasets in torchvision.datasets or the ImageFolder dataset class, which follows the structure of ImageNet.

trainset = torchvision.datasets.ImageFolder(root='/path/to/your/data/trn', transform=generic_transform)
testset = torchvision.datasets.ImageFolder(root='/path/to/your/data/val', transform=generic_transform)
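For reference, ImageFolder expects one sub-directory per class under the root, roughly like this (hypothetical class and file names):

/path/to/your/data/trn/cats/cat_001.png
/path/to/your/data/trn/dogs/dog_001.png
/path/to/your/data/val/cats/cat_101.png
/path/to/your/data/val/dogs/dog_101.png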

Transforms

Transforms are very useful for preprocessing loaded data on the fly. If you are using images, you have to use the ToTensor() transform to convert the loaded images from PIL to torch.tensor. More transforms can be packed into a composite transform as follows.

generic_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.ToPILImage(),
    #transforms.CenterCrop(size=128),
    transforms.Lambda(lambda x: myimresize(x, (128, 128))),
    transforms.ToTensor(),
    transforms.Normalize((0., 0., 0.), (6, 6, 6))
])

Data Loader

Then you define a data loader, which prepares the next batch while training. You can set the number of workers for data loading.

trainloader = torch.utils.data.DataLoader(trainset, batch_size=32, shuffle=True, num_workers=8)
testloader = torch.utils.data.DataLoader(testset, batch_size=32, shuffle=False, num_workers=8)

For training, you just enumerate over the data loader.

for i, data in enumerate(trainloader, 0):
    inputs, labels = data
    inputs, labels = Variable(inputs.cuda()), Variable(labels.cuda())
    # continue training...
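The question's data is a small in-memory array rather than an image folder, but the same DataLoader machinery applies via TensorDataset. A minimal sketch, assuming X_mdl and y are the Variables from the question:

import torch
from torch.autograd import Variable
from torch.utils.data import TensorDataset, DataLoader

# wrap the in-memory tensors in a dataset and let the loader shuffle/batch them
dataset = TensorDataset(X_mdl.data, y.data)
loader = DataLoader(dataset, batch_size=5, shuffle=True)

for batch_xs, batch_ys in loader:
    batch_xs, batch_ys = Variable(batch_xs), Variable(batch_ys)
    # forward/backward pass as in the question...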

NumPy Stuff

Yes. You have to convert torch.tensor to numpy using the .numpy() method in order to work on it. If you are using CUDA, you have to download the data from the GPU to the CPU first using the .cpu() method before calling .numpy(). Personally, coming from a MATLAB background, I prefer to do most of the work with torch tensors and convert data to numpy only for visualisation. Also bear in mind that torch stores data in channel-first mode, while numpy and PIL work with channel-last. This means you need to use np.rollaxis to move the channel axis to the end. Sample code is below.

np.rollaxis(make_grid(mynet.ftrextractor(inputs).data, nrow=8, padding=1).cpu().numpy(), 0, 3) 
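As a tiny self-contained illustration of that CPU/GPU round trip (using a dummy tensor, not the asker's data):

import numpy as np
import torch

t = torch.randn(3, 128, 128)              # channel-first torch tensor
if torch.cuda.is_available():
    t = t.cuda()                           # pretend it lives on the GPU
a = np.rollaxis(t.cpu().numpy(), 0, 3)     # bring it back and move channels last
print(a.shape)                             # (128, 128, 3)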

Logging

The best method I have found to visualise the feature maps is using TensorBoard. Code is available at yunjey/pytorch-tutorial.

answered by Mo Hossny