Python. Python is the ideal coding language used for machine learning, NLP, and neural network connections.
Natural Language Processing (NLP) allows you to take any sentence and identify patterns, special names, company names, and more. The second edition of Natural Language Processing with Java teaches you how to perform language analysis with the help of Java libraries, while constantly gaining insights from the outcomes.
There are many things about Python that make it a really good programming language choice for an NLP project. The simple syntax and transparent semantics of this language make it an excellent choice for projects that include Natural Language Processing tasks.
Java vs Python for NLP is very much a preference or necessity. Depending on the company/projects you'll need to use one or the other and often there isn't much of a choice unless you're heading a project.
Other than NLTK
(www.nltk.org), there are actually other libraries for text processing in python
:
(for more, see https://pypi.python.org/pypi?%3Aaction=search&term=natural+language+processing&submit=search)
For Java
, there're tonnes of others but here's another list:
This is a nice comparison for basic string processing, see http://nltk.googlecode.com/svn/trunk/doc/howto/nlp-python.html
A useful comparison of GATE vs UIMA vs OpenNLP, see https://www.assembla.com/spaces/extraction-of-cost-data/wiki/Gate-vs-UIMA-vs-OpenNLP?version=4
If you're uncertain, which is the language to go for NLP, personally i say, "any language that will give you the desired analysis/output", see Which language or tools to learn for natural language processing?
Here's a pretty recent (2017) of NLP tools: https://github.com/alvations/awesome-community-curated-nlp
An older list of NLP tools (2013): http://web.archive.org/web/20130703190201/http://yauhenklimovich.wordpress.com/2013/05/20/tools-nlp
Other than language processing tools, you would very much need machine learning
tools to incorporate into NLP
pipelines.
There's a whole range in Python
and Java
, and once again it's up to preference and whether the libraries are user-friendly enough:
Machine Learning libraries in python:
(for more, see https://pypi.python.org/pypi?%3Aaction=search&term=machine+learning&submit=search)
With the recent (2015) deep learning tsunami in NLP, possibly you could consider: https://en.wikipedia.org/wiki/Comparison_of_deep_learning_software
I'll avoid listing deep learning tools out of non-favoritism / neutrality.
Other Stackoverflow questions that also asked for NLP/ML tools:
The question is very open ended. That said, rather than choose one, below is a comparison depending on the language that you would like to use (since there are good libraries available in both languages).
Python
In terms of Python, the first place you should look at is the Python Natural Language Toolkit. As they note in their description, NLTK is a leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning.
There is also some excellent code that you can look up that originated out of Google's Natural Language Toolkit project that is Python based. You can find a link to that code here on GitHub.
Java
The first place to look would be Stanford's Natural Language Processing Group. All of software that is distributed there is written in Java. All recent distributions require Oracle Java 6+ or OpenJDK 7+. Distribution packages include components for command-line invocation, jar files, a Java API, and source code.
Another great option that you see in a lot of machine learning environments here (general option), is Weka. Weka is a collection of machine learning algorithms for data mining tasks. The algorithms can either be applied directly to a dataset or called from your own Java code. Weka contains tools for data pre-processing, classification, regression, clustering, association rules, and visualization. It is also well-suited for developing new machine learning schemes.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With