Accuracy score in PyTorch LSTM

I have been running this LSTM tutorial on the wikigold.conll NER data set.

training_data contains a list of tuples of sequences and tags, for example:

training_data = [
    ("They also have a song called \" wake up \"".split(), ["O", "O", "O", "O", "O", "O", "I-MISC", "I-MISC", "I-MISC", "I-MISC"]),
    ("Major General John C. Scheidt Jr.".split(), ["O", "O", "I-PER", "I-PER", "I-PER"])
]

I wrote this function:

def predict(indices):
    """Gets a list of indices of training_data, and returns a list of predicted lists of tags"""
    for index in indices:
        inputs = prepare_sequence(training_data[index][0], word_to_ix)
        tag_scores = model(inputs)
        # torch.max over dim 1 returns (max scores, index of the best tag) for each word
        values, target = torch.max(tag_scores, 1)
        yield target

This way I can get the predicted labels for specific indices in the training data.

However, how do I evaluate the accuracy score across all of the training data?

Here, accuracy is the number of words correctly classified across all sentences, divided by the total word count.
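To make that definition concrete, here is a minimal sketch in plain Python on a toy example. The name token_accuracy and the predicted tags are made up for illustration; only the definition itself comes from the text above.

```python
# Toy gold tags and made-up predictions for two short sentences
y_true = [["O", "O", "I-PER"], ["I-MISC", "O"]]
y_pred = [["O", "I-PER", "I-PER"], ["I-MISC", "O"]]

def token_accuracy(y_true, y_pred):
    """Correctly tagged words across all sentences, divided by the total word count."""
    # Compare tags position by position within each sentence
    correct = sum(t == p for ts, ps in zip(y_true, y_pred) for t, p in zip(ts, ps))
    total = sum(len(ts) for ts in y_true)
    return correct / total

print(token_accuracy(y_true, y_pred))  # 4 of the 5 words match
```

Note that every word counts equally, so long sentences weigh more than short ones; this is not the same as averaging per-sentence accuracies.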

This is what I came up with, which is extremely slow and ugly:

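# predict() expects indices into training_data, not the sentences themselves
y_pred = list(predict(range(len(training_data))))
# Encode the true tags as index tensors (tag_to_ix as defined in the tutorial)
y_true = [prepare_sequence(t, tag_to_ix) for s, t in training_data]
c = 0
s = 0
for i in range(len(training_data)):
    n = len(y_true[i])
    # super ugly and inefficient
    s += (sum(sum(list(y_true[i].view(-1, n) == y_pred[i].view(-1, n).data))))
    c += n

print('Training accuracy:{a}'.format(a=float(s)/c))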

How can this be done efficiently in PyTorch?

P.S.: I have been trying to use sklearn's accuracy_score, without success.

asked May 14 '17 09:05

Uri Goren

1 Answer

I would use NumPy to avoid iterating over the list in pure Python.

The results are the same, but it runs much faster:

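import numpy as np

def accuracy_score(y_true, y_pred):
    # Flatten the per-sentence predictions and true tags into 1-D arrays
    y_pred = np.concatenate(tuple(y_pred))
    y_true = np.concatenate(tuple([[t for t in y] for y in y_true])).reshape(y_pred.shape)
    # Fraction of words whose predicted tag matches the true tag
    return (y_true == y_pred).sum() / float(len(y_true))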

And this is how to use it:

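# predictions for every sentence (predict() takes indices into training_data)
y_pred = list(predict(range(len(training_data))))
# true tags as tag indices (tag_to_ix as defined in the tutorial), so they are comparable with y_pred
y_true = [[tag_to_ix[t] for t in tags] for s, tags in training_data]
# numpy accuracy score
print(accuracy_score(y_true, y_pred))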
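If you would rather stay in PyTorch, the same idea can be sketched with torch.cat. This is only a sketch under a few assumptions: torch_accuracy is a hypothetical helper name, each element of y_true and y_pred is a 1-D tensor of tag indices on the same device, and the toy tensors below are made up.

```python
import torch

def torch_accuracy(y_true, y_pred):
    """Token-level accuracy for two lists of 1-D index tensors (one tensor per sentence)."""
    # Concatenate all sentences into one flat tensor each
    true_flat = torch.cat([t.reshape(-1) for t in y_true])
    pred_flat = torch.cat([p.reshape(-1) for p in y_pred])
    # Mean of the element-wise matches is the fraction of correctly tagged words
    return (true_flat == pred_flat).float().mean().item()

# Made-up tag indices for two sentences of lengths 3 and 2
y_true = [torch.tensor([0, 0, 1]), torch.tensor([2, 0])]
y_pred = [torch.tensor([0, 1, 1]), torch.tensor([2, 0])]
print(torch_accuracy(y_true, y_pred))  # roughly 0.8 (4 of 5 words match)
```

The comparison then runs as a single tensor operation instead of a Python loop over sentences.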

answered Sep 19 '22 17:09

GregA

Tags:

python

deep-learning

pytorch

scikit-learn