I followed the NLTK book in using the confusion matrix, but the resulting confusion matrix looks very odd.
# Empirically examine where the tagger is making mistakes
test_tags = [tag for sent in brown.sents(categories='editorial')
                 for (word, tag) in t2.tag(sent)]
gold_tags = [tag for (word, tag) in brown.tagged_words(categories='editorial')]
print nltk.ConfusionMatrix(gold_tags, test_tags)
Can anyone explain how to use the confusion matrix?
First, I assume you got the code from chapter 05 of the old NLTK book: https://nltk.googlecode.com/svn/trunk/doc/book/ch05.py, and in particular that you are looking at this section: http://pastebin.com/EC8fFqLU
Now, let's look at the confusion matrix in NLTK. Try:
from nltk.metrics import ConfusionMatrix
ref = 'DET NN VB DET JJ NN NN IN DET NN'.split()
tagged = 'DET VB VB DET NN NN NN IN DET NN'.split()
cm = ConfusionMatrix(ref, tagged)
print cm
[out]:
    | D         |
    | E I J N V |
    | T N J N B |
----+-----------+
DET |<3>. . . . |
 IN | .<1>. . . |
 JJ | . .<.>1 . |
 NN | . . .<3>1 |
 VB | . . . .<1>|
----+-----------+
(row = reference; col = test)
The numbers enclosed in <> are the true positives (tp). In the example above, you can see that one JJ in the reference was wrongly tagged as NN in the tagged output. That instance counts as one false positive for NN and one false negative for JJ.
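To make the row/column reading concrete, here is a minimal sketch (Python 3, standard library only, not NLTK's own implementation) that tallies the same (reference, test) pairs by hand. The names `cells` and `errors` are illustrative and do not come from NLTK.

```python
from collections import Counter

ref = 'DET NN VB DET JJ NN NN IN DET NN'.split()
tagged = 'DET VB VB DET NN NN NN IN DET NN'.split()

# Each (reference, test) pair corresponds to one cell of the matrix:
# the first element picks the row, the second picks the column.
cells = Counter(zip(ref, tagged))

# Cells where the two tags differ are the off-diagonal entries, i.e. the errors.
errors = {pair: count for pair, count in cells.items() if pair[0] != pair[1]}

print(cells[('DET', 'DET')])  # diagonal cell for DET
print(errors)                 # off-diagonal cells only
```

The two off-diagonal pairs it reports, ('NN', 'VB') and ('JJ', 'NN'), are the two stray 1s in the printed matrix.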
To calculate precision/recall/fscore, you can read the true positives, false negatives and false positives out of the confusion matrix by indexing it as cm[reference, test]:
from collections import Counter

labels = set('DET NN VB IN JJ'.split())
true_positives = Counter()
false_negatives = Counter()
false_positives = Counter()
for i in labels:
    for j in labels:
        if i == j:
            # diagonal cell: reference and test agree
            true_positives[i] += cm[i,j]
        else:
            # off-diagonal cell: reference i was tagged as j
            false_negatives[i] += cm[i,j]
            false_positives[j] += cm[i,j]
print "TP:", sum(true_positives.values()), true_positives
print "FN:", sum(false_negatives.values()), false_negatives
print "FP:", sum(false_positives.values()), false_positives
[out]:
TP: 8 Counter({'DET': 3, 'NN': 3, 'VB': 1, 'IN': 1, 'JJ': 0})
FN: 2 Counter({'NN': 1, 'JJ': 1, 'VB': 0, 'DET': 0, 'IN': 0})
FP: 2 Counter({'VB': 1, 'NN': 1, 'DET': 0, 'JJ': 0, 'IN': 0})
To calculate Fscore per label:
for i in sorted(labels):
    if true_positives[i] == 0:
        fscore = 0
    else:
        precision = true_positives[i] / float(true_positives[i]+false_positives[i])
        recall = true_positives[i] / float(true_positives[i]+false_negatives[i])
        fscore = 2 * (precision * recall) / float(precision + recall)
    print i, fscore
[out]:
DET 1.0
IN 1.0
JJ 0
NN 0.75
VB 0.666666666667
I hope the above de-confuses the confusion matrix usage in NLTK. Here's the full code for the example above:
from collections import Counter
from nltk.metrics import ConfusionMatrix
ref = 'DET NN VB DET JJ NN NN IN DET NN'.split()
tagged = 'DET VB VB DET NN NN NN IN DET NN'.split()
cm = ConfusionMatrix(ref, tagged)
print cm
labels = set('DET NN VB IN JJ'.split())
true_positives = Counter()
false_negatives = Counter()
false_positives = Counter()
for i in labels:
    for j in labels:
        if i == j:
            true_positives[i] += cm[i,j]
        else:
            false_negatives[i] += cm[i,j]
            false_positives[j] += cm[i,j]
print "TP:", sum(true_positives.values()), true_positives
print "FN:", sum(false_negatives.values()), false_negatives
print "FP:", sum(false_positives.values()), false_positives
print
for i in sorted(labels):
    if true_positives[i] == 0:
        fscore = 0
    else:
        precision = true_positives[i] / float(true_positives[i]+false_positives[i])
        recall = true_positives[i] / float(true_positives[i]+false_negatives[i])
        fscore = 2 * (precision * recall) / float(precision + recall)
    print i, fscore
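The code above is written for Python 2 (print statements, float() to force true division). As a hedged alternative, here is a rough Python 3 sketch that gets the same per-label numbers straight from the two tag lists, without going through the ConfusionMatrix object. The function name `per_label_scores` is made up for this example and is not part of NLTK.

```python
from collections import Counter

def per_label_scores(ref, test):
    """Return {label: (precision, recall, fscore)} for two aligned tag lists."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for r, t in zip(ref, test):
        if r == t:
            tp[r] += 1
        else:
            fn[r] += 1  # the reference label was missed
            fp[t] += 1  # the predicted label was wrongly assigned
    scores = {}
    for label in sorted(set(ref) | set(test)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        total = precision + recall
        fscore = 2 * precision * recall / total if total else 0.0
        scores[label] = (precision, recall, fscore)
    return scores

ref = 'DET NN VB DET JJ NN NN IN DET NN'.split()
tagged = 'DET VB VB DET NN NN NN IN DET NN'.split()
scores = per_label_scores(ref, tagged)
for label, (precision, recall, fscore) in scores.items():
    print(label, precision, recall, fscore)
```

Guarding each division separately means a label that never appears in the output (JJ here) gets 0.0 instead of raising ZeroDivisionError.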
Here is a real case from a text classifier. It builds the confusion matrix with both sklearn and NLTK:
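from collections import defaultdict
import nltk
from sklearn import metrics

# `testset` (a list of (features, label) pairs) and `classifier`
# (a trained classifier with a classify() method) come from your own setup.
refsets = defaultdict(set)
testsets = defaultdict(set)
labels = []
tests = []
for i, (feats, label) in enumerate(testset):
    refsets[label].add(i)
    observed = classifier.classify(feats)
    testsets[observed].add(i)
    labels.append(label)
    tests.append(observed)
# NLTK matrix first, then sklearn's, matching the output shown
print(nltk.ConfusionMatrix(labels, tests))
print(metrics.confusion_matrix(labels, tests))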
    |   n   p |
    |   e   o |
    |   g   s |
----+---------+
neg |<228> 22 |
pos |  18<232>|
----+---------+
(row = reference; col = test)
[[228  22]
 [ 18 232]]
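Both printouts carry the same counts, with rows as the reference labels and columns as the predictions. As a small worked example (plain Python 3, with the counts copied by hand from the output above rather than read from either library), accuracy and per-class precision/recall fall out of the row and column sums. The names `matrix` and `report` are illustrative.

```python
labels = ['neg', 'pos']
# Rows are reference labels, columns are predicted labels.
matrix = [[228, 22],
          [18, 232]]

total = sum(sum(row) for row in matrix)
# Accuracy is the diagonal (correct predictions) over everything.
accuracy = sum(matrix[i][i] for i in range(len(labels))) / total

report = {}
for i, label in enumerate(labels):
    row_total = sum(matrix[i])                  # items whose reference label is `label`
    col_total = sum(row[i] for row in matrix)   # items predicted as `label`
    report[label] = {
        'precision': matrix[i][i] / col_total,
        'recall': matrix[i][i] / row_total,
    }

print(accuracy)
print(report)
```

For this matrix, 460 of the 500 test items sit on the diagonal, so accuracy is 0.92.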