Calculate BLEU score in Python

Question

I have a test sentence and a reference sentence. How can I write a Python script that measures the similarity between these two sentences using the BLEU metric from automatic machine translation evaluation?

ccy · Accepted Answer

The BLEU score consists of two parts: modified n-gram precision and a brevity penalty. Details can be found in the original BLEU paper (Papineni et al., 2002). You can use the nltk.translate.bleu_score module in NLTK (older releases shipped it as nltk.align.bleu_score). A code example is shown below:

import nltk

hypothesis = ['It', 'is', 'a', 'cat', 'at', 'room']
reference = ['It', 'is', 'a', 'cat', 'inside', 'the', 'room']
# sentence_bleu expects a list of references, so several can be supplied
BLEUscore = nltk.translate.bleu_score.sentence_bleu([reference], hypothesis)
print(BLEUscore)
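
To make the two parts mentioned above concrete, here is a minimal pure-Python sketch of modified precision and the brevity penalty. It is an illustration under simplifying assumptions, not NLTK's implementation: the function names (ngrams, modified_precision, brevity_penalty, simple_bleu) are made up for this example, no smoothing is applied, and the score is simply 0 whenever any n-gram order has no match.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(references, hypothesis, n):
    """Clipped n-gram precision: a hypothesis n-gram is credited at most
    as many times as it occurs in any single reference."""
    hyp_counts = Counter(ngrams(hypothesis, n))
    if not hyp_counts:
        return 0.0  # hypothesis is shorter than n tokens
    max_ref_counts = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)
    clipped = sum(min(count, max_ref_counts[gram])
                  for gram, count in hyp_counts.items())
    return clipped / sum(hyp_counts.values())


def brevity_penalty(references, hypothesis):
    """Penalise hypotheses shorter than the closest-length reference."""
    hyp_len = len(hypothesis)
    if hyp_len == 0:
        return 0.0
    # Closest reference length; ties go to the shorter reference.
    ref_len = min((len(ref) for ref in references),
                  key=lambda length: (abs(length - hyp_len), length))
    if hyp_len > ref_len:
        return 1.0
    return math.exp(1 - ref_len / hyp_len)


def simple_bleu(references, hypothesis, weights=(0.25, 0.25, 0.25, 0.25)):
    """Brevity penalty times the weighted geometric mean of the precisions."""
    precisions = [modified_precision(references, hypothesis, n)
                  for n in range(1, len(weights) + 1)]
    if min(precisions) == 0:
        return 0.0  # unsmoothed: one empty order zeroes the whole score
    log_mean = sum(w * math.log(p) for w, p in zip(weights, precisions))
    return brevity_penalty(references, hypothesis) * math.exp(log_mean)


hypothesis = ['It', 'is', 'a', 'cat', 'at', 'room']
reference = ['It', 'is', 'a', 'cat', 'inside', 'the', 'room']
print(simple_bleu([reference], hypothesis))
```

For the two sentences above, the clipped precisions for n = 1 to 4 work out to 5/6, 3/5, 2/4 and 1/3, and the brevity penalty is exp(1 - 7/6) because the hypothesis has 6 tokens against 7 in the reference.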

Note that by default the BLEU score uses N=4, i.e. equal weights over unigrams through 4-grams. If your sentence is shorter than four tokens, you need to lower N; otherwise older NLTK versions raise ZeroDivisionError: Fraction(0, 0). (More recent NLTK releases appear to emit a warning and return a score close to zero instead.) So you should set the weights like this:

import nltk

hypothesis = ["open", "the", "file"]
reference = ["open", "file"]
# the highest usable order is the bigram, so split the weight equally between unigrams and bigrams
BLEUscore = nltk.translate.bleu_score.sentence_bleu([reference], hypothesis, weights=(0.5, 0.5))
print(BLEUscore)
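
The choice of weights above can be automated. The sketch below derives equal weights from the shortest sentence involved; uniform_weights is a hypothetical helper written for this answer, not part of NLTK, and it assumes you want equal weighting capped at 4-grams.

```python
def uniform_weights(references, hypothesis, max_order=4):
    """Equal weights over n-gram orders 1..N, where N is capped by the
    shortest sentence so that every order has at least one n-gram.
    (Hypothetical helper, not an NLTK function.)"""
    shortest = min([len(hypothesis)] + [len(ref) for ref in references])
    order = min(max_order, shortest)
    if order == 0:
        raise ValueError("hypothesis and references must be non-empty")
    return tuple(1.0 / order for _ in range(order))


hypothesis = ["open", "the", "file"]
reference = ["open", "file"]
print(uniform_weights([reference], hypothesis))  # (0.5, 0.5)
```

The resulting tuple can be passed as the weights argument of sentence_bleu in place of the hand-written (0.5, 0.5).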

Tags:

python

nltk

Alapan Kuila