
How to Interpret NLTK Brill Tagger Rules

For the generated Brill Tagger Rule:

Rule('016', 'CS', 'QL', [(Word([1, 2, 3]),'as')])

I know: 'CS' is a subordinating conjunction and 'QL' is a qualifier.

I guess: [(Word([1, 2, 3]),'as')] is the condition of the rule. It means that the word 'as' appears in the first, second or third position before the target word. The target word is the word that is going to be assigned a POS tag.

I do not know: What is the meaning of '016'? How should the rule be interpreted as a whole?

asked Nov 27 '25 10:11 by dongx


1 Answer

The rules are documented in NLTK's nltk.tbl.rule module. 016 is the templateid, i.e. the ID of the template that was used to create the rule. You can also get a human-readable description of the rule:

from nltk.tbl.rule import Rule
from nltk.tag.brill import Word

q = Rule('016', 'CS', 'QL', [(Word([1, 2, 3]), 'as')])
q.format('verbose')
# 'CS -> QL if the Word of words i+1...i+3 is "as"'

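In this case the condition actually refers to the words that come after the target word, not before it (indicated by i+1...i+3). Read as a whole, the rule says: change the tag of a word from CS to QL if any of the three words following it is "as".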
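
To make the i+1...i+3 window concrete, here is a minimal plain-Python sketch of what such a rule does to a tagged sentence. It does not use NLTK: apply_word_rule is a hypothetical helper written only for illustration, and the example sentence and its starting tags are assumptions.

```python
def apply_word_rule(tagged, original_tag, replacement_tag, offsets, word):
    """Return a retagged copy of `tagged`, a list of (word, tag) pairs.

    Hypothetical helper (not part of NLTK) that mimics a rule such as
    Rule('016', 'CS', 'QL', [(Word([1, 2, 3]), 'as')]).
    """
    result = list(tagged)
    for i, (token, tag) in enumerate(tagged):
        # Only words that currently carry the original tag are candidates.
        if tag != original_tag:
            continue
        # The condition holds if ANY offset in the window points at `word`;
        # offsets that fall outside the sentence are skipped.
        if any(
            0 <= i + off < len(tagged) and tagged[i + off][0] == word
            for off in offsets
        ):
            result[i] = (token, replacement_tag)
    return result


# Assumed initial tagging: both occurrences of "as" start out as CS.
sentence = [("as", "CS"), ("good", "JJ"), ("as", "CS"), ("gold", "NN")]
retagged = apply_word_rule(sentence, "CS", "QL", [1, 2, 3], "as")
print(retagged)
```

Only the first "as" is retagged to QL, because another "as" appears two positions after it. The second "as" has no "as" among the words following it, so it keeps the tag CS.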

answered Nov 30 '25 06:11 by b3000


