So I'm very new to PyTorch and Neural Networks in general, and I'm having some problems creating a Neural Network that classifies names by gender.
I based this off of the PyTorch tutorial for RNNs that classify names by nationality, but I decided not to go with a recurrent approach... Stop me right here if this was the wrong idea!
However, whenever I try to run an input through the network it tells me:
RuntimeError: matrices expected, got 3D, 2D tensors at /py/conda-bld/pytorch_1493681908901/work/torch/lib/TH/generic/THTensorMath.c:1232
I know this has something to do with how PyTorch always expects there to be a batch size or something, and I have my tensor set up that way, but you can probably tell by this point that I have no idea what I'm talking about. Here's my code:
from future import unicode_literals, print_function, division
from io import open
import glob
import unicodedata
import string
import torch
import torchvision
import torch.nn as nn
import torch.optim as optim
import random
from torch.autograd import Variable
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
"""------GLOBAL VARIABLES------"""
all_letters = string.ascii_letters + " .,;'"
num_letters = len(all_letters)
all_names = {}
genders = ["Female", "Male"]
"""-------DATA EXTRACTION------"""
def findFiles(path):
return glob.glob(path)
def unicodeToAscii(s):
return ''.join(
c for c in unicodedata.normalize('NFD', s)
if unicodedata.category(c) != 'Mn'
and c in all_letters
)
# Read a file and split into lines
def readLines(filename):
lines = open(filename, encoding='utf-8').read().strip().split('\n')
return [unicodeToAscii(line) for line in lines]
for file in findFiles("/home/andrew/PyCharm/PycharmProjects/CantStop/data/names/*.txt"):
gender = file.split("/")[-1].split(".")[0]
names = readLines(file)
all_names[gender] = names
"""-----DATA INTERPRETATION-----"""
def nameToTensor(name):
tensor = torch.zeros(len(name), 1, num_letters)
for index, letter in enumerate(name):
tensor[index][0][all_letters.find(letter)] = 1
return tensor
def outputToGender(output):
gender, gender_index = output.data.topk(1)
if gender_index[0][0] == 0:
return "Female"
return "Male"
"""------NETWORK SETUP------"""
class Net(nn.Module):
def __init__(self, input_size, output_size):
super(Net, self).__init__()
#Layer 1
self.Lin1 = nn.Linear(input_size, int(input_size/2))
self.ReLu1 = nn.ReLU()
self.Batch1 = nn.BatchNorm1d(int(input_size/2))
#Layer 2
self.Lin2 = nn.Linear(int(input_size/2), output_size)
self.ReLu2 = nn.ReLU()
self.Batch2 = nn.BatchNorm1d(output_size)
self.softMax = nn.LogSoftmax()
def forward(self, input):
output1 = self.Batch1(self.ReLu1(self.Lin1(input)))
output2 = self.softMax(self.Batch2(self.ReLu2(self.Lin2(output1))))
return output2
NN = Net(num_letters, 2)
"""------TRAINING------"""
def getRandomTrainingEx():
gender = genders[random.randint(0, 1)]
name = all_names[gender][random.randint(0, len(all_names[gender])-1)]
gender_tensor = Variable(torch.LongTensor([genders.index(gender)]))
name_tensor = Variable(nameToTensor(name))
return gender_tensor, name_tensor, gender
def train(input, target):
loss_func = nn.NLLLoss()
optimizer = optim.SGD(NN.parameters(), lr=0.0001, momentum=0.9)
optimizer.zero_grad()
output = NN(input)
loss = loss_func(output, target)
loss.backward()
optimizer.step()
return output, loss
all_losses = []
current_loss = 0
for i in range(100000):
gender_tensor, name_tensor, gender = getRandomTrainingEx()
output, loss = train(name_tensor, gender_tensor)
current_loss += loss
if i%1000 == 0:
print("Guess: %s, Correct: %s, Loss: %s" % (outputToGender(output), gender, loss.data[0]))
if i%100 == 0:
all_losses.append(current_loss/10)
current_loss = 0
# plt.figure()
# plt.plot(all_losses)
# plt.show()
Please help a newbie out!
Pycharm is a helpful python debugger that let you set breakpoint and views dimension of your tensor.
For easier debug, do not stack forward thing up like that
output1 = self.Batch1(self.ReLu1(self.Lin1(input)))
Instead,
h1 = self.ReLu1(self.Lin1(input))
h2 = self.Batch1(h1)
For the stacktrace, Pytorch also provide Pythonic error stacktrack. I believe that before
RuntimeError: matrices expected, got 3D, 2D tensors at /py/conda-bld/pytorch_1493681908901/work/torch/lib/TH/generic/THTensorMath.c:1232
There are some python error stacktrace that point right into your code. For easier debug, as I said, don't stack forward.
You use Pycharm to create break point before crash point. In debugger watcher Then use Variable(torch.rand(dim1, dim2))
to test out forward pass input, output dimension, and if a dimension is incorrect. Comparing with dimension of input. Call input.size()
in debugger watcher.
For example, self.ReLu1(self.Lin1(Variable(torch.rand(10, 20)))).size()
. If it show read text (error), then the input dimension is incorrect. Else, it show the size of the output.
In Pytorch Docs, it specify input/output dimension. It also have a example code snip
>>> rnn = nn.RNN(10, 20, 2)
>>> input = Variable(torch.randn(5, 3, 10))
>>> h0 = Variable(torch.randn(2, 3, 20))
>>> output, hn = rnn(input, h0)
You may use the code snip in PyCharm Debugger to explore dimension of input, output of specific layer of your interest (RNN, Linear, BatchNorm1d).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With