I am looking for a simple but "good enough" Named Entity Recognition library (and dictionary) for Java. I want to process emails and documents and extract some "basic information" such as names, places, addresses and dates.
I've been looking around, and most of what I find seems to be on the heavy side: full NLP projects rather than small libraries.
Any recommendations?
Stanford Named Entity Recognizer (SNER): this Java tool, developed at Stanford University, is often treated as the standard library for entity extraction. It is based on Conditional Random Fields (CRF) and provides pre-trained models for extracting person, organization, location, and other entities.
There are two main approaches to Named Entity Recognition: ontology-based models and deep-learning-based models. Ontology-based Named Entity Recognition is a knowledge-based process that relies on curated lists, such as a list of company names for the company category, to make inferences.
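To make the list-based idea concrete, here is a minimal sketch of a dictionary lookup tagger using only the JDK. The class name `GazetteerTagger`, the category labels and the sample entries are all made up for illustration, and the tokenization is deliberately naive; treat it as a starting point rather than a real NER system.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical dictionary (list-based) tagger: reports token sequences found in a category list. */
final class GazetteerTagger {
    // Lowercased phrase (tokens joined by one space) -> category label.
    private final Map<String, String> phrases = new HashMap<>();
    private int maxPhraseTokens = 1;

    void add(String category, String... names) {
        for (String name : names) {
            String[] tokens = name.trim().split("\\s+");
            maxPhraseTokens = Math.max(maxPhraseTokens, tokens.length);
            phrases.put(String.join(" ", tokens).toLowerCase(Locale.ROOT), category);
        }
    }

    /** Returns matches as "CATEGORY:surface text", preferring the longest match at each position. */
    List<String> tag(String text) {
        // Naive tokenization: split on anything that is not a letter, digit or apostrophe.
        String[] tokens = text.split("[^\\p{L}\\p{N}']+");
        List<String> found = new ArrayList<>();
        int i = 0;
        while (i < tokens.length) {
            int matched = 0;
            for (int len = Math.min(maxPhraseTokens, tokens.length - i); len >= 1; len--) {
                String candidate = String.join(" ", Arrays.copyOfRange(tokens, i, i + len));
                String category = phrases.get(candidate.toLowerCase(Locale.ROOT));
                if (category != null) {
                    found.add(category + ":" + candidate);
                    matched = len;
                    break;
                }
            }
            i += Math.max(matched, 1); // skip past a match, otherwise advance one token
        }
        return found;
    }

    public static void main(String[] args) {
        GazetteerTagger tagger = new GazetteerTagger();
        tagger.add("ORG", "Acme Corp", "Globex");   // illustrative entries only
        tagger.add("LOC", "New York", "Paris");
        System.out.println(tagger.tag("Acme Corp opened an office in New York, not Paris."));
    }
}
```

A lookup like this is cheap and predictable, but it only finds names that are already in the lists and it cannot tell ambiguous entries apart from context.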
Ambiguity and abbreviations: one of the major challenges in identifying named entities is language itself. It is hard to recognize words that have multiple meanings, or words that can play a part in different kinds of sentences. Another major challenge is classifying similar words within a text.
NER tagging is a supervised task. You need a training set of labeled examples to train a model for that. However, there is some unsupervised work one can do to slightly improve the performance of models.
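For a sense of what "labeled examples" can look like, here is a small sketch that turns annotated entity spans into per-token tags. It assumes the BIO scheme (B- for the first token of an entity, I- for the following tokens, O for everything else), which is one common convention and not something the text above prescribes. The class name `BioEncoder`, the span layout and the labels are hypothetical.

```java
import java.util.Arrays;

/** Hypothetical helper: converts token-level entity spans into BIO tags. */
final class BioEncoder {
    /**
     * @param tokenCount number of tokens in the sentence
     * @param spans      one {startToken, endTokenExclusive} pair per entity
     * @param labels     entity label for each span, e.g. "PER" or "LOC"
     */
    static String[] encode(int tokenCount, int[][] spans, String[] labels) {
        String[] tags = new String[tokenCount];
        Arrays.fill(tags, "O"); // tokens outside any entity
        for (int s = 0; s < spans.length; s++) {
            int start = spans[s][0];
            int end = spans[s][1];
            for (int i = start; i < end; i++) {
                tags[i] = (i == start ? "B-" : "I-") + labels[s];
            }
        }
        return tags;
    }

    public static void main(String[] args) {
        String[] tokens = {"John", "Smith", "flew", "to", "Paris"};
        int[][] spans = {{0, 2}, {4, 5}};
        String[] labels = {"PER", "LOC"};
        String[] tags = encode(tokens.length, spans, labels);
        for (int i = 0; i < tokens.length; i++) {
            System.out.println(tokens[i] + "\t" + tags[i]);
        }
    }
}
```

A training set is then many sentences in this token-plus-tag form; the exact file format depends on the toolkit you train with.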
You might want to have a look at one of my earlier answers to a similar problem.
Other than that, most lighter NER systems depend heavily on the domain they are used in. You will find a whole lot of tools and papers about biomedical NER systems, for example. In addition to my previous post (which already contains my main recommendation if you want to do NER), here are some more tools you might want to look into:
One additional remark: you won't get away without tokenizing the input. Tokenization of natural language is slightly non-trivial, which is why I suggest you use a toolbox that does both for you.
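If you just want to see what a baseline tokenizer looks like before picking a toolbox, the JDK's `java.text.BreakIterator` can split text at word boundaries. This is only a rough sketch (the class name `SimpleTokenizer` is made up), and it is not what the NER toolkits use internally.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical baseline tokenizer built on the JDK's word BreakIterator. */
final class SimpleTokenizer {
    static List<String> tokenize(String text, Locale locale) {
        BreakIterator words = BreakIterator.getWordInstance(locale);
        words.setText(text);
        List<String> tokens = new ArrayList<>();
        int start = words.first();
        for (int end = words.next(); end != BreakIterator.DONE; start = end, end = words.next()) {
            String piece = text.substring(start, end);
            if (!piece.trim().isEmpty()) { // drop whitespace-only segments
                tokens.add(piece);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Hello, world!", Locale.ENGLISH));
    }
}
```

Abbreviations, email addresses, URLs and dates are the kind of input where a generic splitter like this starts to disagree with what an NER model expects, which is the practical reason to let one toolbox handle both steps.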