Accuracy, precision, recall and F-score are measures of system quality in machine learning. They are all computed from a confusion matrix of true/false positives/negatives.
Given a binary classification task, I have tried the following to get a function that returns accuracy, precision, recall and F-score:
```python
gold = [1] + [0] * 9
predicted = [1] * 10

def evaluation(gold, predicted):
    true_pos = sum(1 for p, g in zip(predicted, gold) if p == 1 and g == 1)
    true_neg = sum(1 for p, g in zip(predicted, gold) if p == 0 and g == 0)
    false_pos = sum(1 for p, g in zip(predicted, gold) if p == 1 and g == 0)
    false_neg = sum(1 for p, g in zip(predicted, gold) if p == 0 and g == 1)
    try:
        recall = true_pos / float(true_pos + false_neg)
    except:
        recall = 0
    try:
        precision = true_pos / float(true_pos + false_pos)
    except:
        precision = 0
    try:
        fscore = 2 * precision * recall / (precision + recall)
    except:
        fscore = 0
    try:
        accuracy = (true_pos + true_neg) / float(len(gold))
    except:
        accuracy = 0
    return accuracy, precision, recall, fscore
```
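For the sample data above, the confusion-matrix cells and metrics work out as follows (a standalone sanity check, not part of the question's code, with the counting inlined in a single pass):

```python
gold = [1] + [0] * 9   # one positive instance, nine negatives
predicted = [1] * 10   # classifier predicts positive every time

# Confusion-matrix cells, counted in a single pass.
tp = tn = fp = fn = 0
for p, g in zip(predicted, gold):
    if p == 1 and g == 1:
        tp += 1
    elif p == 0 and g == 0:
        tn += 1
    elif p == 1 and g == 0:
        fp += 1
    else:
        fn += 1

accuracy = (tp + tn) / len(gold)   # 1/10 = 0.1
precision = tp / (tp + fp)         # 1/10 = 0.1
recall = tp / (tp + fn)            # 1/1  = 1.0
fscore = 2 * precision * recall / (precision + recall)  # ~0.18
```

With every instance predicted positive, recall is perfect but precision and accuracy collapse to 0.1, which is exactly why the F-score matters.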
But it seems like I have redundantly looped through the dataset four times to count the true/false positives/negatives. The multiple try/excepts to catch the ZeroDivisionError are also redundant.
So what is the Pythonic way to get the counts of the true/false positives/negatives without multiple loops through the dataset? And how do I Pythonically catch the ZeroDivisionError without the multiple try/excepts?
I could also count the true/false positives/negatives in one loop, as below, but is there an alternative without the multiple ifs?
```python
for p, g in zip(predicted, gold):
    if p == 1 and g == 1:
        true_pos += 1
    if p == 0 and g == 0:
        true_neg += 1
    if p == 1 and g == 0:
        false_pos += 1
    if p == 0 and g == 1:
        false_neg += 1
```
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
For example, perfect precision and recall would result in a perfect F-measure:
F-Measure = (2 * Precision * Recall) / (Precision + Recall) = (2 * 1.0 * 1.0) / (1.0 + 1.0) = 2.0 / 2.0 = 1.0
By contrast, a precision of 0.01 and a recall of 1.0 would give an arithmetic mean of (0.01 + 1.0) / 2 = 0.505, but an F1 score (formula above) of 2 * (0.01 * 1.0) / (0.01 + 1.0) ≈ 0.02: the harmonic mean is dominated by the smaller of the two values.
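The skew toward the smaller value is easy to verify numerically; a quick sketch of the arithmetic above:

```python
precision, recall = 0.01, 1.0

# Arithmetic mean treats the two values symmetrically.
arithmetic_mean = (precision + recall) / 2          # 0.505

# Harmonic mean (F1) is pulled toward the smaller value.
f1 = 2 * precision * recall / (precision + recall)  # ~0.0198
```

The F1 score of roughly 0.02 is a far more honest summary of a classifier with 1% precision than the 0.505 arithmetic mean.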
What is the pythonic way to get the counts of the True/False Positives/Negatives without multiple loops through the dataset?
I would use a collections.Counter, which does roughly what you're doing with all of the ifs at the end (by the way, those should be elifs, since your conditions are mutually exclusive):

```python
from collections import Counter

counts = Counter(zip(predicted, gold))
```

Then, e.g., true_pos = counts[1, 1].
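Putting that suggestion together into a runnable sketch (the function name confusion_counts is mine, for illustration):

```python
from collections import Counter

def confusion_counts(gold, predicted):
    # One pass: each (predicted, gold) pair maps to a confusion-matrix cell.
    counts = Counter(zip(predicted, gold))
    true_pos = counts[1, 1]
    true_neg = counts[0, 0]
    false_pos = counts[1, 0]
    false_neg = counts[0, 1]
    return true_pos, true_neg, false_pos, false_neg

gold = [1] + [0] * 9
predicted = [1] * 10
print(confusion_counts(gold, predicted))  # (1, 0, 9, 0)
```

Counter returns 0 for missing keys, so cells that never occur (here, true negatives) need no special-casing.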
How do I pythonically catch the ZeroDivisionError without the multiple try-excepts?

For a start, you should (almost) never use a bare except:. If you're catching ZeroDivisionErrors, then write except ZeroDivisionError. You could also consider a "look before you leap" approach, checking whether the denominator is 0 before trying the division, e.g.:

```python
accuracy = (true_pos + true_neg) / float(len(gold)) if gold else 0
```
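To avoid writing that guard four times, the check can be factored into a small helper; safe_div is an illustrative name, not a library function:

```python
def safe_div(numerator, denominator):
    # Look before you leap: return 0 when the denominator is zero
    # instead of raising ZeroDivisionError.
    return numerator / denominator if denominator else 0

# With the counts from the question's sample data: tp=1, tn=0, fp=9, fn=0.
tp, tn, fp, fn = 1, 0, 9, 0
recall = safe_div(tp, tp + fn)                                 # 1.0
precision = safe_div(tp, tp + fp)                              # 0.1
fscore = safe_div(2 * precision * recall, precision + recall)  # ~0.18
accuracy = safe_div(tp + tn, 10)                               # 0.1
```

Each metric then becomes a single expression, and the zero-denominator policy lives in one place.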
This is a pretty natural use case for the bitarray package.

```python
import bitarray as bt

# p and g are the predicted and gold label lists (sequences of 0s and 1s).
tp = (bt.bitarray(p) & bt.bitarray(g)).count()
tn = (~bt.bitarray(p) & ~bt.bitarray(g)).count()
fp = (bt.bitarray(p) & ~bt.bitarray(g)).count()
fn = (~bt.bitarray(p) & bt.bitarray(g)).count()
```
There's some type-conversion overhead, but after that, the bitwise operations are much faster.

For 100 instances, timeit on my PC gives 0.036 for your method and 0.017 using bitarray at 1000 passes. For 1000 instances, it goes to 0.291 and 0.093; for 10000, 3.177 and 0.863. You get the idea.

It scales pretty well, uses no explicit Python-level loops, and doesn't have to build a large intermediate list of (prediction, gold) tuples the way zip does (in Python 2; Python 3's zip is lazy).