Can anyone point out the algorithm(s) used by the OpenNLP NameFinder module? The code is complex and only sparsely documented, and playing with it as a black box (with the default model provided) gives me the impression that it is mostly heuristic. Here are some examples of input and output:
Input:
John Smith is frustrated.
john smith is frustrated.
Barak Obama is frustrated.
Hugo Chavez is frustrated. (no more)
Jeff Atwood is frustrated.
Bing Liu is frustrated with openNLP NER module.
Noam Chomsky is frustrated with the world.
Jayden Smith is frustrated.
Smith Jayden is frustrated.
Lady Gaga is frustrated.
Ms. Gaga is frustrated.
Mrs. Gaga is frustrated.
Jayden is frustrated.
Mr. Liu is frustrated.
Output (I changed the angle brackets to square brackets):
[START:person] John Smith [END] is frustrated.
john smith is frustrated.
[START:person] Barak Obama [END] is frustrated.
Hugo Chavez is frustrated. (no more)
[START:person] Jeff Atwood [END] is frustrated.
Bing Liu is frustrated with openNLP NER module.
[START:person] Noam Chomsky [END] is frustrated with the world.
Jayden [START:person] Smith [END] is frustrated.
[START:person] Smith [END] [START:person] Jayden [END] is frustrated.
Lady Gaga is frustrated.
Ms. Gaga is frustrated.
Mrs. Gaga is frustrated.
Jayden is frustrated.
Mr. Liu is frustrated.
It seems that the model simply learns a fixed list of names that are annotated in the training data and allows some tiling and combinations. Two notable false-negative (FN) examples are:
I'm puzzled and frustrated; if anyone could point me to the algorithm (or verify that it sucks), I'd be thankful.
P.S. Both the Stanford and UIUC NER systems perform much better, with some subtle differences that are interesting but off topic (this question is too long as is).
As the name implies, NameFinderME uses a Maximum Entropy model. Here is the seminal paper on ME.
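To make "Maximum Entropy model" a bit more concrete, here is a minimal sketch of how such a tagger assigns a label to each token: every (feature, outcome) pair has a weight, the weights of the active features are summed per outcome, and a softmax turns the sums into probabilities. The outcome names, features, and weights below are all made up for illustration; they are not OpenNLP's actual feature set or model, and the greedy left-to-right decoding is a simplification. Still, it shows why a model that leans on capitalization and previously seen words can tag "John Smith" and miss "john smith".

```python
import math

# Hypothetical outcome labels (a start/continue/other tagging scheme).
OUTCOMES = ["person-start", "person-cont", "other"]

# Hypothetical, hand-set weights for (feature, outcome) pairs.
# A real maximum entropy model learns these from annotated training text.
WEIGHTS = {
    ("is_capitalized", "person-start"): 1.5,
    ("is_capitalized", "person-cont"): 1.0,
    ("not_capitalized", "other"): 2.0,
    ("prev=person-start", "person-cont"): 2.0,
    ("prev=other", "person-start"): 0.5,
    ("word=smith", "person-start"): 1.0,
    ("word=smith", "person-cont"): 1.0,
}


def features(tokens, i, prev_outcome):
    """Extract illustrative features for token i given the previous outcome."""
    tok = tokens[i]
    shape = "is_capitalized" if tok[:1].isupper() else "not_capitalized"
    return ["word=" + tok.lower(), "prev=" + prev_outcome, shape]


def maxent_probs(feats):
    """p(y | x) = exp(sum of weights of active features for y) / Z."""
    scores = {y: sum(WEIGHTS.get((f, y), 0.0) for f in feats) for y in OUTCOMES}
    z = sum(math.exp(s) for s in scores.values())
    return {y: math.exp(s) / z for y, s in scores.items()}


def tag(tokens):
    """Greedy decoding: pick the most probable outcome token by token."""
    prev, outcomes = "other", []
    for i in range(len(tokens)):
        probs = maxent_probs(features(tokens, i, prev))
        prev = max(probs, key=probs.get)
        outcomes.append(prev)
    return outcomes


if __name__ == "__main__":
    print(tag("John Smith is frustrated .".split()))
    print(tag("john smith is frustrated .".split()))
```

With these toy weights the capitalized sentence gets a person span and the lowercase one does not, which mirrors the behaviour in the question. Names the model has few or no supporting features for (unseen words, unusual contexts) get low scores for the same reason, so the results depend heavily on the training data behind the default model.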
If OpenNLP's performance does not meet your requirements and you cannot use the Stanford or UIUC NER systems, I recommend trying Mallet with a CRF. This sample code should get you started.