Calculating Precision, Recall and F-score in one pass - python

Accuracy, precision, recall and F-score are measures of system quality in machine learning. They are all computed from a confusion matrix of true/false positives/negatives.
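For reference, these are the standard definitions in terms of the four confusion-matrix counts. The symbols TP, TN, FP and FN (true positives, true negatives, false positives, false negatives) are shorthand introduced here, and the F-score shown is the balanced F1 variant that the code below computes:

```latex
\begin{align}
\text{accuracy}  &= \frac{TP + TN}{TP + TN + FP + FN} \\
\text{precision} &= \frac{TP}{TP + FP} \\
\text{recall}    &= \frac{TP}{TP + FN} \\
F_1              &= \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}
\end{align}
```

Each denominator can be zero (for example, precision when nothing is predicted positive), which is where the ZeroDivisionError handling comes from.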

Given a binary classification task, I have tried the following to get a function that returns accuracy, precision, recall and f-score:

gold = [1] + [0] * 9
predicted = [1] * 10

def evaluation(gold, predicted):
  true_pos = sum(1 for p,g in zip(predicted, gold) if p==1 and g==1)
  true_neg = sum(1 for p,g in zip(predicted, gold) if p==0 and g==0)
  false_pos = sum(1 for p,g in zip(predicted, gold) if p==1 and g==0)
  false_neg = sum(1 for p,g in zip(predicted, gold) if p==0 and g==1)
  try:
    recall = true_pos / float(true_pos + false_neg)
  except:
    recall = 0
  try:
    precision = true_pos / float(true_pos + false_pos)
  except:
    precision = 0
  try:
    fscore = 2*precision*recall / (precision + recall)
  except:
    fscore = 0
  try:
    accuracy = (true_pos + true_neg) / float(len(gold))
  except:
    accuracy = 0
  return accuracy, precision, recall, fscore

But it seems like I have redundantly looped through the dataset 4 times to get the True/False Positives/Negatives.

Also, the multiple try-excepts to catch the ZeroDivisionError are a little redundant.

So what is the pythonic way to get the counts of the True/False Positives/Negatives without multiple loops through the dataset?

How do I pythonically catch the ZeroDivisionError without the multiple try-excepts?


I could also do the following to count the True/False Positives/Negatives in one loop, but is there an alternative way without the multiple ifs?

for p,g in zip(predicted, gold):
    if p==1 and g==1:
        true_pos+=1
    if p==0 and g==0:
        true_neg+=1
    if p==1 and g==0:
        false_pos+=1
    if p==0 and g==1:
        false_neg+=1
alvas asked Nov 13 '15 09:11


2 Answers

what is the pythonic way to get the counts of the True/False Positives/Negatives without multiple loops through the dataset?

I would use a collections.Counter, which does roughly what your loop with all of the ifs at the end of the question does (those should be elifs, by the way, as your conditions are mutually exclusive):

from collections import Counter
counts = Counter(zip(predicted, gold))

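The keys are (predicted, gold) tuples, so e.g. true_pos = counts[1, 1] and false_pos = counts[1, 0]. A combination that never occurs simply counts as 0.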
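Put together, a minimal sketch of the single-pass count might look like this. It reuses the gold and predicted lists from the question; the helper name confusion_counts is made up for illustration:

```python
from collections import Counter

def confusion_counts(gold, predicted):
    """Return (true_pos, true_neg, false_pos, false_neg) from one pass over the data."""
    # Keys are (predicted, gold) pairs; a pair that never occurs counts as 0.
    counts = Counter(zip(predicted, gold))
    return counts[1, 1], counts[0, 0], counts[1, 0], counts[0, 1]

gold = [1] + [0] * 9
predicted = [1] * 10
print(confusion_counts(gold, predicted))  # (1, 0, 9, 0)
```

Because Counter returns 0 for missing keys, there is no need to initialise the four counters or to branch on each pair.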

How do I pythonically catch the ZeroDivisionError without the multiple try-excepts?

For a start, you should (almost) never use a bare except:. If you're catching ZeroDivisionErrors, then write except ZeroDivisionError. You could also consider a "look before you leap" approach, checking whether the denominator is 0 before trying the division, e.g.

accuracy = (true_pos + true_neg) / float(len(gold)) if gold else 0
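To avoid repeating that check for every metric, the "look before you leap" test can be factored into a small helper. This is only a sketch: safe_div and scores are hypothetical names, and returning 0.0 for an undefined ratio is the same convention the question's code uses, not the only reasonable one.

```python
def safe_div(numerator, denominator, default=0.0):
    """Divide, returning `default` when the denominator is zero."""
    return numerator / float(denominator) if denominator else default

def scores(true_pos, true_neg, false_pos, false_neg):
    """Return (accuracy, precision, recall, fscore) from the four counts."""
    total = true_pos + true_neg + false_pos + false_neg
    accuracy = safe_div(true_pos + true_neg, total)
    precision = safe_div(true_pos, true_pos + false_pos)
    recall = safe_div(true_pos, true_pos + false_neg)
    fscore = safe_div(2 * precision * recall, precision + recall)
    return accuracy, precision, recall, fscore

# Counts for the question's example: 1 true positive, 9 false positives.
print(scores(1, 0, 9, 0))
```

Checking the denominator up front keeps each metric on one line and never swallows unrelated exceptions the way a bare except would.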
jonrsharpe answered Sep 18 '22 01:09


This is a pretty natural use case for the bitarray package.

import bitarray as bt

# Convert each 0/1 list once and reuse the bitarrays.
p = bt.bitarray(predicted)
g = bt.bitarray(gold)

tp = (p & g).count()    # predicted 1, gold 1
tn = (~p & ~g).count()  # predicted 0, gold 0
fp = (p & ~g).count()   # predicted 1, gold 0
fn = (~p & g).count()   # predicted 0, gold 1

There's some type conversion overhead, but after that, the bitwise operations are much faster.

For 100 instances, timeit on my PC gives 0.036 for your method and 0.017 using bitarray at 1000 passes. For 1000 instances, it goes to 0.291 and 0.093. For 10000, 3.177 and 0.863. You get the idea.

It scales pretty well, uses no explicit Python loops, and doesn't have to build the temporary list of tuples that zip creates in Python 2 (in Python 3, zip is lazy, so that particular cost goes away).

Adam Acosta answered Sep 19 '22 01:09