 

Using Conditional Random Fields for Named Entity Recognition

What is a conditional random field (CRF)? How exactly does a CRF identify proper names as persons, organizations, or places in structured or unstructured text?

For example: This product is ordered by StackOverFlow Inc.

What does a conditional random field do to identify StackOverFlow Inc. as an organization?

user239135 asked Dec 27 '09 11:12



People also ask

What are conditional random fields used for?

Conditional random fields (CRFs) are a class of statistical modeling methods often applied in pattern recognition and machine learning and used for structured prediction. Whereas a classifier predicts a label for a single sample without considering "neighbouring" samples, a CRF can take context into account.

How is CRF used for NER?

NER using a CRF is based on an undirected graphical model of conditionally trained probabilistic finite-state automata. The CRF is used to calculate the conditional probability of values on designated output nodes given values on other designated input nodes. It can incorporate dependent (overlapping) features and context-dependent learning.

What is conditional random fields in NLP?

A conditional random field (CRF) is a discriminative model for sequence data, similar to a maximum-entropy Markov model (MEMM). It models the dependency between each state and the entire input sequence. Unlike an MEMM, a CRF overcomes the label bias problem by using a global normalizer.
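To make the "global normalizer" concrete, here is a sketch of the standard linear-chain formulation. The symbols are assumptions introduced for illustration and do not come from the text above: $x$ is the input sentence, $y = (y_1, \dots, y_T)$ the tag sequence, $f_k$ the feature functions, and $\lambda_k$ their learned weights.

```latex
% Linear-chain CRF: one normalizer Z(x) for the whole tag sequence
p(y \mid x) = \frac{1}{Z(x)} \exp\Big( \sum_{t=1}^{T} \sum_{k} \lambda_k f_k(y_{t-1}, y_t, x, t) \Big),
\qquad
Z(x) = \sum_{y'} \exp\Big( \sum_{t=1}^{T} \sum_{k} \lambda_k f_k(y'_{t-1}, y'_t, x, t) \Big)

% MEMM: a separate normalizer at every position t
p(y \mid x) = \prod_{t=1}^{T}
\frac{\exp\big( \sum_{k} \lambda_k f_k(y_{t-1}, y_t, x, t) \big)}
     {\sum_{y'_t} \exp\big( \sum_{k} \lambda_k f_k(y_{t-1}, y'_t, x, t) \big)}
```

In the MEMM, every previous state must distribute a total probability of 1 over its successors no matter what the observation says, which is where label bias comes from. The CRF normalizes once over all complete tag sequences, so scores at different positions can trade off against each other.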

How is named entity recognition done?

Named entity recognition (NER) is a common data preprocessing task. It involves identifying key information in the text and classifying it into a set of predefined categories. An entity is basically the thing that is consistently talked about or referred to in the text. NER is a form of NLP.


2 Answers

A CRF is a discriminative, batch, tagging model, in the same general family as a Maximum Entropy Markov model.

A full explanation is book-length.

A short explanation is as follows:

  1. Humans annotate 200-500K words of text, marking the entities.
  2. Humans select a set of features that they hope indicate entities. Things like capitalization, or whether the word was seen in the training set with a tag.
  3. A training procedure counts all the occurrences of the features.
  4. The meat of the CRF algorithm searches the space of all possible models that fit the counts to find a pretty good one.
  5. At runtime, a decoder (probably a Viterbi decoder) looks at a sentence and decides what tag to assign to each word.

The hard parts of this are feature selection and the search algorithm in step 4.
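Step 2 can be sketched in a few lines of Python. This is only an illustration of what hand-picked features might look like for the example sentence from the question; the function name `word_features`, the feature names, and the small list of company suffixes are all made up here, not taken from any particular CRF toolkit.

```python
def word_features(tokens, i):
    """Return a dict of hand-picked features for the token at position i."""
    word = tokens[i]
    has_next = i + 1 < len(tokens)
    features = {
        "lower": word.lower(),
        "is_capitalized": word[:1].isupper(),
        "is_all_caps": word.isupper(),
        "has_digit": any(ch.isdigit() for ch in word),
        "suffix3": word[-3:],
        # Hypothetical cue: a company-style suffix follows the current word.
        "next_is_org_suffix": has_next
        and tokens[i + 1].rstrip(".") in {"Inc", "Ltd", "Corp"},
        # Neighbouring words give the model some context.
        "prev_lower": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next_lower": tokens[i + 1].lower() if has_next else "<EOS>",
    }
    return features


sentence = "This product is ordered by StackOverFlow Inc.".split()
feats = [word_features(sentence, i) for i in range(len(sentence))]
```

For "StackOverFlow" both `is_capitalized` and `next_is_org_suffix` come out true. Training (steps 3 and 4) then decides how much weight such cues deserve when they co-occur with an organization tag in the annotated data.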

bmargulies answered Sep 24 '22 04:09



To understand that, you have to study quite a few things.
For a start:

Understand the basics of Markov and Bayesian networks.
There is an online course on Coursera by Daphne Koller:
https://class.coursera.org/pgm/lecture/index

A CRF is a special type of Markov network in which we have observed variables and hidden states.
The objective is to find the best state assignment for the unobserved variables, also known as the MAP (maximum a posteriori) problem.
Be prepared for a lot of probability and optimization. :-)
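For a linear-chain CRF, the MAP problem is usually solved with Viterbi decoding. Below is a minimal sketch, assuming the model has already been trained and reduced to two score tables. The function name, the two labels, and all the numbers are invented for illustration; a real model would compute these scores from weighted features.

```python
def viterbi(emission, transition, labels):
    """Return the highest-scoring label sequence (the MAP assignment).

    emission[t][y]     : score of label y at position t
    transition[(a, b)] : score of moving from label a to label b
    Scores are treated as log-potentials, so they add along a path.
    """
    best = [{y: emission[0][y] for y in labels}]  # best score ending in y
    back = []                                     # back-pointers per step
    for t in range(1, len(emission)):
        scores, pointers = {}, {}
        for y in labels:
            # Best previous label to arrive at y from.
            prev = max(labels, key=lambda p: best[-1][p] + transition[(p, y)])
            scores[y] = best[-1][prev] + transition[(prev, y)] + emission[t][y]
            pointers[y] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final label.
    path = [max(labels, key=lambda y: best[-1][y])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]


labels = ["O", "ORG"]
# Made-up scores for the tokens "by", "StackOverFlow", "Inc."
emission = [
    {"O": 2.0, "ORG": 0.0},
    {"O": 0.5, "ORG": 1.5},
    {"O": 1.0, "ORG": 0.8},
]
transition = {
    ("O", "O"): 0.5, ("O", "ORG"): 0.0,
    ("ORG", "ORG"): 1.0, ("ORG", "O"): -0.5,
}
tags = viterbi(emission, transition, labels)  # ["O", "ORG", "ORG"]
```

Taken on its own, "Inc." scores slightly higher as `O` than as `ORG` in this toy table, but the `ORG` to `ORG` transition score tips the best full sequence toward tagging it as part of the organization. That is the kind of context a per-word classifier would miss.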

Dhruv Premi answered Sep 20 '22 04:09
