
How can one resolve synonyms in named-entity recognition?

In natural language processing, named-entity recognition is the challenge of, well, recognizing named entities such as organizations, places, and, most importantly, people's names.

There is a major challenge here, though, which I call synonymy: The Count and Dracula in fact refer to the same person, but it is possible that this is never stated directly in the text.

What would be the best algorithm to resolve these synonyms?


If there is a feature for this in any Python-based library, I'm eager to be educated. I'm using NLTK.

Sean Allred asked Apr 05 '13 13:04



1 Answer

You are describing a problem of coreference resolution and named entity linking. I'm providing separate links as I am not entirely sure which one you meant.

  • Coreference: Stanford CoreNLP currently has one of the best implementations, but it is in Java. I have used the Python bindings and wasn't too happy with them; I ended up running all my data through the Stanford pipeline just once and then loading the processed XML files in Python. Obviously, that doesn't work if you have to process data in real time.
  • Named entity linking: Check out Apache Stanbol and the links in the following Stack Overflow post.
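
To make the "process once, load the XML in Python" workflow a bit more concrete, here is a minimal sketch using only the standard library. The XML layout (an outer `coreference` element holding one inner `coreference` per chain, with `mention` children and a `representative="true"` attribute) is my assumption about what the pipeline writes, so check it against your own output files. `SAMPLE_XML` and `load_alias_map` are hypothetical names made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Hand-written sample imitating the assumed CoreNLP-style XML layout.
# Real output files contain many more elements (tokens, parses, offsets).
SAMPLE_XML = """<root>
  <document>
    <coreference>
      <coreference>
        <mention representative="true">
          <sentence>1</sentence>
          <text>Dracula</text>
        </mention>
        <mention>
          <sentence>3</sentence>
          <text>The Count</text>
        </mention>
      </coreference>
      <coreference>
        <mention representative="true">
          <sentence>2</sentence>
          <text>Jonathan Harker</text>
        </mention>
        <mention>
          <sentence>4</sentence>
          <text>he</text>
        </mention>
      </coreference>
    </coreference>
  </document>
</root>"""


def load_alias_map(xml_text):
    """Map each mention string to the representative mention of its chain."""
    root = ET.fromstring(xml_text)
    aliases = {}
    # Each inner <coreference> element is assumed to hold one chain.
    for chain in root.findall("./document/coreference/coreference"):
        mentions = chain.findall("mention")
        if not mentions:
            continue
        # Prefer the mention flagged as representative; fall back to the first.
        representative = next(
            (m for m in mentions if m.get("representative") == "true"),
            mentions[0],
        )
        canonical = representative.findtext("text")
        for mention in mentions:
            aliases[mention.findtext("text")] = canonical
    return aliases


alias_map = load_alias_map(SAMPLE_XML)
print(alias_map["The Count"])
```

Keying the map on the surface string alone is a simplification: a pronoun such as "he" can belong to different chains in different sentences, so for real data you would probably want to key on the sentence and token offsets instead.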
mbatchkarov answered Oct 11 '22 03:10