NLTK Named Entity Recognition with Custom Data

Tags: python, nlp, nltk, named-entity-recognition

I'm trying to extract named entities from my text using NLTK. NLTK's built-in NER is not very accurate for my purpose, and I also want to add some tags of my own. I've been trying to find a way to train my own NER, but I can't seem to find the right resources. I have a few questions regarding NLTK:

Can I use my own data to train a Named Entity Recognizer in NLTK?
If I can train using my own data, is named_entity.py the file to be modified?
Does the input file have to be in IOB format, e.g. Eric NNP B-PERSON?
Are there any resources, apart from the NLTK cookbook and NLP with Python, that I can use?

I would really appreciate any help in this regard.


asked Jul 04 '12 18:07

user1502248

1 Answer

Are you committed to using NLTK/Python? I ran into the same problems as you, and had much better results using Stanford's named-entity recognizer: http://nlp.stanford.edu/software/CRF-NER.shtml. The process for training the classifier using your own data is very well-documented in the FAQ.
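As a rough sketch of the data-preparation step, the snippet below converts three-column IOB lines like the one in the question into two tab-separated columns (token, label). That two-column layout is my recollection of what the FAQ describes for training files, so treat it as an assumption and check it there. The function name iob_to_two_column and the strip_prefix option are made up for illustration.

```python
def iob_to_two_column(lines, strip_prefix=True):
    """Convert 'word POS IOB-tag' lines to 'word<TAB>label' lines.

    Blank lines (sentence boundaries) are passed through as empty strings.
    Assumption: each non-blank line has exactly three whitespace-separated fields.
    """
    out = []
    for line in lines:
        line = line.strip()
        if not line:
            out.append("")
            continue
        word, _pos, tag = line.split()
        # Optionally turn B-PERSON / I-PERSON into plain PERSON; "O" is left alone.
        if strip_prefix and tag[:2] in ("B-", "I-"):
            tag = tag[2:]
        out.append(f"{word}\t{tag}")
    return out


sample = [
    "Eric NNP B-PERSON",
    "lives VBZ O",
    "in IN O",
    "New NNP B-LOCATION",
    "York NNP I-LOCATION",
    "",
]
converted = iob_to_two_column(sample)
print("\n".join(converted))
```

Note that stripping the B-/I- prefixes merges two adjacent entities of the same type into one; pass strip_prefix=False if that matters for your data.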

If you really need to use NLTK, I'd hit up the mailing list for some advice from other users: http://groups.google.com/group/nltk-users.
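Whichever tool you end up with, it helps to sanity-check your IOB annotations first. Here is a minimal sketch in plain Python, with no NLTK calls, that groups (word, POS, IOB tag) triples into entity spans. The function name iob_entities is hypothetical. As far as I know, NLTK's nltk.chunk.conlltags2tree accepts triples in this same shape if you want a chunk tree instead.

```python
def iob_entities(tagged):
    """Group (word, pos, iob_tag) triples into (label, text) entity spans."""
    entities = []
    label, words = None, []

    def flush():
        # Emit the entity collected so far, if any.
        if words:
            entities.append((label, " ".join(words)))

    for word, _pos, tag in tagged:
        if tag.startswith("I-") and tag[2:] == label:
            # Continuation of the entity currently being built.
            words.append(word)
        elif tag.startswith(("B-", "I-")):
            # B- tag, or a stray I- tag with a different label: start a new entity.
            flush()
            label, words = tag[2:], [word]
        else:
            # "O" (outside any entity) closes whatever was open.
            flush()
            label, words = None, []
    flush()
    return entities


sentence = [
    ("Eric", "NNP", "B-PERSON"),
    ("lives", "VBZ", "O"),
    ("in", "IN", "O"),
    ("New", "NNP", "B-LOCATION"),
    ("York", "NNP", "I-LOCATION"),
]
print(iob_entities(sentence))
```

If the spans printed here don't match what you intended to annotate, the training data is the first thing to fix.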

Hope this helps!

answered Oct 08 '22 13:10

jjdubs

