Currently I'm reading Natural Language Processing for the Working Programmer (a work-in-progress book, http://nlpwp.org/) and wondering whether there is a decent Haskell library for statistical natural language processing tasks.
1. Natural Language Toolkit (NLTK): NLTK is an essential Python library that supports tasks such as classification, stemming, tagging, parsing, semantic reasoning, and tokenization. It's basically your main tool for natural language processing and machine learning in Python.
I don't think there is a single Haskell library that covers most of the tasks users of a statistical NLP library would expect (warning: I don't know very much about statistical natural language processing). There are some interesting-looking general-purpose core components, such as the NGrams, estimators, logfloat and hmm libraries. There are also some tools that handle very specific tasks, like morfette for morphology or hs-gizapp, which wraps GIZA++ to produce word alignments between pairs of documents.
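To make the "core components" idea concrete, here is a minimal sketch of the kind of statistic these packages deal in: bigram counting with a maximum-likelihood estimate of a conditional probability. It is written against plain Data.Map from containers rather than any of the libraries named above, so the function names are illustrative, not their actual APIs.

```haskell
-- Toy bigram model: count bigrams and estimate P(w2 | w1) by maximum likelihood.
-- This is a sketch of the technique, not the API of the NGrams or hmm packages.
import qualified Data.Map.Strict as M

type Bigram = (String, String)

-- Count adjacent word pairs in a token stream.
bigramCounts :: [String] -> M.Map Bigram Int
bigramCounts tokens =
  M.fromListWith (+) [ ((w1, w2), 1) | (w1, w2) <- zip tokens (drop 1 tokens) ]

-- Count individual tokens (the denominator of the MLE).
unigramCounts :: [String] -> M.Map String Int
unigramCounts tokens = M.fromListWith (+) [ (w, 1) | w <- tokens ]

-- Maximum-likelihood estimate of P(w2 | w1) = count(w1, w2) / count(w1).
condProb :: [String] -> String -> String -> Double
condProb tokens w1 w2 =
  let big = M.findWithDefault 0 (w1, w2) (bigramCounts tokens)
      uni = M.findWithDefault 0 w1 (unigramCounts tokens)
  in if uni == 0 then 0 else fromIntegral big / fromIntegral uni

main :: IO ()
main = do
  let tokens = words "the cat sat on the mat the cat slept"
  print (condProb tokens "the" "cat")  -- 2/3 in this toy corpus
```

In practice a real model would add smoothing and work in log space (which is what logfloat is for), but the counting-and-estimation core looks much like this.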
Keep an eye on the NLP section of Hackage, and do consider joining the Haskell NLP community (the site is currently down due to a recent attack on the Haskell community server).