
How are StanfordNER classifiers built?

I am working with StanfordNER classifiers. There are 4 classifiers:

english.all.3class.distsim.crf.ser.gz
english.muc.7class.distsim.crf.ser.gz
english.conll.4class.distsim.crf.ser.gz
example.serialized.ncc.ncc.ser.gz

How are these classifiers built? Since each of them is based on a different corpus, here is my guess:

  1. Train a machine learning classifier, such as an SVM coupled with OVR (for the multi-label case), on the corpus to detect entities like ORGANIZATION, PERSON, LOCATION, etc. This means that the training data would be the entire text of a document in the corpus, in which we explicitly mark the ORGANIZATIONs, PERSONs, and LOCATIONs. The classifier would then be able to predict those entities.

  2. Train a machine learning classifier to link POS tags with entities like ORGANIZATION, PERSON, LOCATION. For example, a classifier could be trained to predict which proper nouns should be ORGANIZATION.

Is this the correct big picture? I am just trying to work out how to build my own NER.

AbtPst asked Apr 29 '26 11:04


1 Answer

Yes, the models are trained on supervised data. They are first-order CRFs, which do multi-class probabilistic sequence classification (so neither OVR nor SVM). You can find an introduction to NER, and to Stanford NER in particular, on the Stanford NER page.
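
To make "sequence classification" a bit more concrete, here is a minimal sketch of how a first-order (linear-chain) CRF picks labels at prediction time. It is not Stanford NER's code: the label set, the `LEXICAL` weights, and the `emission`/`transition` functions are all hypothetical and hand-set for illustration, whereas a real CRF learns its weights from the labeled corpus and uses far richer features. The point is only that each label is scored jointly with its neighbour, and Viterbi decoding finds the best whole sequence rather than classifying each token independently.

```python
LABELS = ["O", "PERSON", "ORGANIZATION", "LOCATION"]

# Hypothetical hand-set weights; a trained CRF would learn these from data.
LEXICAL = {("Inc.", "ORGANIZATION"): 3.0, ("John", "PERSON"): 2.0}


def emission(token, label):
    """Score of assigning `label` to `token`, ignoring neighbouring labels."""
    score = LEXICAL.get((token, label), 0.0)
    if token[0].isupper() and label != "O":
        score += 1.0  # capitalized tokens look like entities
    if token[0].islower() and label == "O":
        score += 2.0  # lowercase tokens are usually not entities
    return score


def transition(prev_label, label):
    """First-order feature: reward staying inside the same entity type."""
    return 1.5 if prev_label == label and label != "O" else 0.0


def viterbi(tokens):
    """Return the highest-scoring label sequence under the toy model."""
    best = [{y: emission(tokens[0], y) for y in LABELS}]
    back = []
    for tok in tokens[1:]:
        scores, ptrs = {}, {}
        for y in LABELS:
            # Best previous label for ending in y at this position.
            prev = max(LABELS, key=lambda p: best[-1][p] + transition(p, y))
            scores[y] = best[-1][prev] + transition(prev, y) + emission(tok, y)
            ptrs[y] = prev
        best.append(scores)
        back.append(ptrs)
    # Follow back-pointers from the best final label.
    path = [max(LABELS, key=lambda y: best[-1][y])]
    for ptrs in reversed(back):
        path.append(ptrs[path[-1]])
    return path[::-1]


def greedy(tokens):
    """Independent per-token classification, for comparison."""
    return [max(LABELS, key=lambda y: emission(tok, y)) for tok in tokens]


if __name__ == "__main__":
    sentence = ["Smith", "Inc.", "hired", "John"]
    print(viterbi(sentence))
    print(greedy(sentence))
```

With these toy weights, `viterbi` labels "Smith" as ORGANIZATION because the transition feature ties it to the following "Inc.", while `greedy` falls back to PERSON for "Smith" since nothing but capitalization informs that token on its own. A CRF additionally normalizes these scores into a probability over whole label sequences, which is what the training objective maximizes.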

Christopher Manning answered May 02 '26 03:05


