
Identify names in a string

I would like to find a good way of identifying names of people, places, etc. within users' search queries on my site. For example, if a user asks "how old is George Washington", I need to be able to tell from a predefined list that George Washington is a person.

Some of the lists will be global, and some will be user specific. For example, if they asked "how old is John Smith" I may only want to identify the particular John Smith that is my associate--and I wouldn't want to identify him as a person if he's not my associate.

Is there an NLP library, or some way of indexing these lists, that would let me leverage Soundex, mature NLP, misspelling tolerance, and similar functionality? I can write it by hand, but I would rather build on something mature. Thanks.

Jeff asked Dec 01 '25 04:12


2 Answers

What you need is called Named Entity Recognition.

One of the best available tools for this comes with Stanford NLP: http://nlp.stanford.edu/software/CRF-NER.shtml (written in Java)

If you are on another platform, there are good open source projects in Ruby and Python. Search for "Named Entity Recognition".

Blacksad answered Dec 02 '25 21:12


The particular Natural Language Processing (NLP) task that you're looking for is called Named Entity Recognition (NER).

Besides Stanford's CRF-NER (in Java), the named entity chunker from the Natural Language Toolkit (NLTK) is a popular Python choice and is often used as a baseline for NER tasks.

You can try installing NLTK and then running the following code (the tokenizer, tagger, and NE chunker data must also be fetched once with nltk.download()):

>>> from nltk.tokenize import word_tokenize
>>> from nltk.tag import pos_tag
>>> from nltk.chunk import ne_chunk
>>> sentence = "How old is John Smith?"
>>> ne_chunk(pos_tag(word_tokenize(sentence)))
Tree('S', [('How', 'WRB'), ('old', 'JJ'), ('is', 'VBZ'), Tree('PERSON', [('John', 'NNP'), ('Smith', 'NNP')]), ('?', '.')])
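
NER on its own tags any name it recognizes, whereas the question asks for matches against predefined global and per-user lists with some tolerance for misspellings. One way to approach that part is a plain list lookup with fuzzy string matching. The following is only a minimal sketch using the standard library's difflib; the lists, the user names, the function name find_known_entities, and the 0.85 cutoff are all hypothetical choices for illustration, and phonetic (Soundex-style) matching is not covered here.

```python
from difflib import get_close_matches

# Hypothetical lists for illustration: one shared, one keyed by user.
GLOBAL_ENTITIES = {"george washington": "person", "new york": "place"}
USER_ENTITIES = {"jeff": {"john smith": "person"}}


def find_known_entities(query, user, cutoff=0.85, max_words=3):
    """Return (known_name, entity_type) pairs that approximately occur in query."""
    # Entries from the user's own list override global ones on a name clash.
    known = {**GLOBAL_ENTITIES, **USER_ENTITIES.get(user, {})}
    names = list(known)

    # Lowercase the words and drop surrounding punctuation.
    words = [w.strip("?.,!\"'").lower() for w in query.split()]
    words = [w for w in words if w]

    found = []
    # Slide windows of 1..max_words words over the query and compare
    # each window to the known names by similarity ratio.
    for size in range(1, max_words + 1):
        for start in range(len(words) - size + 1):
            candidate = " ".join(words[start:start + size])
            match = get_close_matches(candidate, names, n=1, cutoff=cutoff)
            if match:
                pair = (match[0], known[match[0]])
                if pair not in found:  # report each known name once
                    found.append(pair)
    return found
```

With these example lists, find_known_entities("how old is John Smith?", "jeff") returns [('john smith', 'person')], while the same query for a user with no such entry returns an empty list. A misspelled "George Washingtn" still resolves to the global entry. The cutoff would need tuning on real queries, since a lower value admits more false matches.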

alvas answered Dec 02 '25 23:12

