
How to use the confusion matrix module in NLTK?

Tags: python, nlp, nltk

I followed the NLTK book's example for the confusion matrix, but the resulting matrix looks very odd.

import nltk
from nltk.corpus import brown

# Empirically examine where the tagger is making mistakes.
# t2 is the tagger trained earlier, as in the NLTK book.
test_tags = [tag for sent in brown.sents(categories='editorial')
    for (word, tag) in t2.tag(sent)]
gold_tags = [tag for (word, tag) in brown.tagged_words(categories='editorial')]
print(nltk.ConfusionMatrix(gold_tags, test_tags))

Can anyone explain how to use the confusion matrix?

user3314418 asked Dec 02 '22 19:12


2 Answers

First, I assume you got the code from chapter 05 of the old NLTK book: https://nltk.googlecode.com/svn/trunk/doc/book/ch05.py, and in particular that you're looking at this section: http://pastebin.com/EC8fFqLU

Now, let's look at the confusion matrix in NLTK. Try:

from nltk.metrics import ConfusionMatrix

# Reference (gold) tags and the tagger's output, one tag per token.
ref  = 'DET NN VB DET JJ NN NN IN DET NN'.split()
tagged = 'DET VB VB DET NN NN NN IN DET NN'.split()
cm = ConfusionMatrix(ref, tagged)
print(cm)

[out]:

    | D         |
    | E I J N V |
    | T N J N B |
----+-----------+
DET |<3>. . . . |
 IN | .<1>. . . |
 JJ | . .<.>1 . |
 NN | . . .<3>1 |
 VB | . . . .<1>|
----+-----------+
(row = reference; col = test)

The numbers enclosed in <> are the true positives (tp). In the example above, you can see that one JJ in the reference was wrongly tagged as NN in the tagged output. That instance counts as one false positive for NN and one false negative for JJ.
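
To make the cell layout concrete, here is a minimal sketch that rebuilds the same counts with only the standard library. It is not how NLTK implements ConfusionMatrix, and the name `cells` is made up for illustration. Each key is a (reference, test) pair, which corresponds to (row, col) in the table above.

```python
from collections import Counter

ref = 'DET NN VB DET JJ NN NN IN DET NN'.split()
tagged = 'DET VB VB DET NN NN NN IN DET NN'.split()

# Count each (reference tag, test tag) pair; one pair per token position.
cells = Counter(zip(ref, tagged))

# Diagonal cell: NN in the reference, NN in the output.
print(cells[('NN', 'NN')])   # 3
# Off-diagonal cell: JJ in the reference, NN in the output.
print(cells[('JJ', 'NN')])   # 1
# Pairs that never occur count as zero, like the '.' cells.
print(cells[('DET', 'VB')])  # 0
```

The diagonal keys (same tag twice) are the true positives; every other key is simultaneously a false negative for its first tag and a false positive for its second.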

To calculate precision/recall/fscore, you can read the true positives, false negatives and false positives out of the confusion matrix by indexing it with a (reference, test) pair of labels:

from collections import Counter

labels = set('DET NN VB IN JJ'.split())

true_positives = Counter()
false_negatives = Counter()
false_positives = Counter()

# cm[i, j] is the number of tokens with reference tag i and output tag j.
for i in labels:
    for j in labels:
        if i == j:
            true_positives[i] += cm[i,j]
        else:
            false_negatives[i] += cm[i,j]
            false_positives[j] += cm[i,j]

print("TP:", sum(true_positives.values()), true_positives)
print("FN:", sum(false_negatives.values()), false_negatives)
print("FP:", sum(false_positives.values()), false_positives)

[out]:

TP: 8 Counter({'DET': 3, 'NN': 3, 'VB': 1, 'IN': 1, 'JJ': 0})
FN: 2 Counter({'NN': 1, 'JJ': 1, 'VB': 0, 'DET': 0, 'IN': 0})
FP: 2 Counter({'VB': 1, 'NN': 1, 'DET': 0, 'JJ': 0, 'IN': 0})

To calculate the F-score per label:

for i in sorted(labels):
    if true_positives[i] == 0:
        # No true positives, so the F-score is reported as 0.
        fscore = 0
    else:
        precision = true_positives[i] / float(true_positives[i]+false_positives[i])
        recall = true_positives[i] / float(true_positives[i]+false_negatives[i])
        fscore = 2 * (precision * recall) / float(precision + recall)
    print(i, fscore)

[out]:

DET 1.0
IN 1.0
JJ 0
NN 0.75
VB 0.6666666666666666
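
If you want a single number for the whole tagger rather than one per label, two common summaries are overall accuracy and the macro-averaged F-score (the unweighted mean of the per-label scores). Below is a minimal sketch using only the standard library and the same `ref` and `tagged` lists as above. The helper name `f_score` is hypothetical and not an NLTK function.

```python
from collections import Counter

ref = 'DET NN VB DET JJ NN NN IN DET NN'.split()
tagged = 'DET VB VB DET NN NN NN IN DET NN'.split()
labels = sorted(set(ref) | set(tagged))

tp, fp, fn = Counter(), Counter(), Counter()
for r, t in zip(ref, tagged):
    if r == t:
        tp[r] += 1
    else:
        fn[r] += 1   # the reference label was missed
        fp[t] += 1   # the output label was wrongly assigned


def f_score(label):
    """Harmonic mean of precision and recall; 0.0 when there are no true positives."""
    if tp[label] == 0:
        return 0.0
    precision = tp[label] / (tp[label] + fp[label])
    recall = tp[label] / (tp[label] + fn[label])
    return 2 * precision * recall / (precision + recall)


# Accuracy: correctly tagged tokens over all tokens.
accuracy = sum(tp.values()) / len(ref)
# Macro average: every label weighs the same, however rare it is.
macro_f = sum(f_score(label) for label in labels) / len(labels)

print("accuracy:", accuracy)          # 0.8
print("macro F:", round(macro_f, 3))  # 0.683
```

The macro average is pulled down by JJ, which scores 0 even though it occurs only once. Accuracy, which weights every token equally, is less sensitive to rare labels.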

I hope the above de-confuses the confusion matrix usage in NLTK. Here's the full code for the example above:

from collections import Counter
from nltk.metrics import ConfusionMatrix

# Reference (gold) tags and the tagger's output, one tag per token.
ref  = 'DET NN VB DET JJ NN NN IN DET NN'.split()
tagged = 'DET VB VB DET NN NN NN IN DET NN'.split()
cm = ConfusionMatrix(ref, tagged)

print(cm)

labels = set('DET NN VB IN JJ'.split())

true_positives = Counter()
false_negatives = Counter()
false_positives = Counter()

# cm[i, j] is the number of tokens with reference tag i and output tag j.
for i in labels:
    for j in labels:
        if i == j:
            true_positives[i] += cm[i,j]
        else:
            false_negatives[i] += cm[i,j]
            false_positives[j] += cm[i,j]

print("TP:", sum(true_positives.values()), true_positives)
print("FN:", sum(false_negatives.values()), false_negatives)
print("FP:", sum(false_positives.values()), false_positives)
print()

# Per-label F-score from the counts above.
for i in sorted(labels):
    if true_positives[i] == 0:
        fscore = 0
    else:
        precision = true_positives[i] / float(true_positives[i]+false_positives[i])
        recall = true_positives[i] / float(true_positives[i]+false_negatives[i])
        fscore = 2 * (precision * recall) / float(precision + recall)
    print(i, fscore)

alvas answered Jan 22 '23 10:01


Here is a real example from a text classifier. It works with both sklearn and NLTK:

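from collections import defaultdict

import nltk
from sklearn import metrics

# testset is a list of (features, label) pairs and classifier is a trained
# classifier with a classify() method; both are assumed to exist already.
refsets = defaultdict(set)
testsets = defaultdict(set)
labels = []
tests = []
for i, (feats, label) in enumerate(testset):
    refsets[label].add(i)
    observed = classifier.classify(feats)
    testsets[observed].add(i)
    labels.append(label)
    tests.append(observed)

print(metrics.confusion_matrix(labels, tests))
print(nltk.ConfusionMatrix(labels, tests))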

[[228  22]
 [ 18 232]]

    |   n   p |
    |   e   o |
    |   g   s |
----+---------+
neg |<228> 22 |
pos |  18<232>|
----+---------+
(row = reference; col = test)
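
Both outputs hold the same counts: rows are the reference labels and columns are the classifier's predictions, in the order neg, pos. As a rough sketch of how to read accuracy and per-class precision and recall off that table, here is a standard-library-only example that uses just the numbers printed above. The variable names (`matrix`, `scores`) are mine, not part of sklearn or NLTK.

```python
# Counts copied from the matrix above: rows = reference, columns = predicted.
labels = ['neg', 'pos']
matrix = [[228, 22],
          [18, 232]]

total = sum(sum(row) for row in matrix)
correct = sum(matrix[k][k] for k in range(len(labels)))
accuracy = correct / total
print("accuracy:", accuracy)  # 0.92

scores = {}
for k, label in enumerate(labels):
    tp = matrix[k][k]
    predicted = sum(row[k] for row in matrix)  # column sum: everything predicted as this label
    actual = sum(matrix[k])                    # row sum: everything that truly has this label
    scores[label] = (tp / predicted, tp / actual)  # (precision, recall)
    print(label, "precision:", round(scores[label][0], 3),
          "recall:", round(scores[label][1], 3))
```

Reading along a column gives precision (how many predictions of that label were right), and reading along a row gives recall (how many true instances of that label were found).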

Max Kleiner answered Jan 22 '23 12:01