 

Stemming - code examples or open source projects?

Tagging systems need stemming. I use Delicious, and I don't have time to manage and prune my tags. I'm a bit more careful with my blog, but it isn't perfect. I also write software for embedded systems that would be much more functional (helpful to the user) if it included stemming.

For instance:
Parse
Parser
Parsing

should all mean the same thing to whatever system I'm putting them into.
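To make the goal concrete, here is a minimal sketch of naive suffix stripping in plain Python. It is a toy, not the Porter or Snowball algorithm: the suffix list, the `MIN_STEM` threshold, and the name `toy_stem` are all assumptions made up for illustration.

```python
# Toy suffix stripper: NOT Porter/Snowball, just an illustration of the idea.
# SUFFIXES and MIN_STEM are hypothetical choices, not taken from any library.
SUFFIXES = ("ing", "ers", "er", "ed", "es", "e", "s")  # longer suffixes first
MIN_STEM = 3  # never strip a word down to fewer than 3 characters


def toy_stem(word):
    """Lowercase the word and strip the first matching suffix, if any."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM:
            return word[: -len(suffix)]
    return word


print({w: toy_stem(w) for w in ("Parse", "Parser", "Parsing")})
# → {'Parse': 'pars', 'Parser': 'pars', 'Parsing': 'pars'}
```

A rule set this crude over-stems easily (it turns "string" into "str"), which is why real stemmers attach conditions to each rule.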

Ideally there's a BSD-licensed stemmer somewhere, but if not, where do I look to learn the common algorithms and techniques for this?

Aside from BSD-licensed stemmers, what stemmers are available under other open source licenses?

-Adam

Adam Davis asked Feb 27 '09 15:02



2 Answers

Snowball stemmer (C & Java). I've used its Python binding, PyStemmer.
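A hedged sketch of how PyStemmer might be used for the tagging case in the question. It assumes the package is installed (`pip install PyStemmer`) and imports as `Stemmer`; the helper names `group_by_stem` and `pystemmer_english` are hypothetical, and the import is deferred so the grouping logic works with any `str -> str` stemming function.

```python
from collections import defaultdict


def group_by_stem(tags, stem_word):
    """Group tags that share a stem. `stem_word` is any str -> str callable."""
    groups = defaultdict(list)
    for tag in tags:
        groups[stem_word(tag.lower())].append(tag)
    return dict(groups)


def pystemmer_english():
    """Return PyStemmer's English stemWord method (needs PyStemmer installed)."""
    import Stemmer  # module name used by the PyStemmer package

    return Stemmer.Stemmer("english").stemWord


# Usage, once PyStemmer is available:
#   groups = group_by_stem(["Parse", "Parser", "Parsing"], pystemmer_english())
#   print(groups)
```

Print the groups for your own tags before relying on them: the Snowball English rules may leave agent nouns such as "Parser" in a different group from "Parse" and "Parsing".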

vartec answered Sep 17 '22 23:09



Check out NLTK (the Natural Language Toolkit), written in Python. It ships several stemmers, including Porter, Lancaster, and Snowball implementations.
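As a rough sketch, the NLTK stemmers can be compared side by side on the words from the question. This assumes NLTK is installed (`pip install nltk`); `compare_stemmers` and `nltk_stemmers` are hypothetical helper names, and the import is deferred so the comparison helper itself has no dependency.

```python
def compare_stemmers(words, stemmers):
    """Return {word: {stemmer_name: stem}} for every word and stemmer given."""
    return {
        word: {name: stem(word) for name, stem in stemmers.items()}
        for word in words
    }


def nltk_stemmers():
    """Build stem callables from NLTK's nltk.stem package (needs nltk installed)."""
    from nltk.stem import LancasterStemmer, PorterStemmer, SnowballStemmer

    return {
        "porter": PorterStemmer().stem,
        "lancaster": LancasterStemmer().stem,
        "snowball": SnowballStemmer("english").stem,
    }


# Usage, once NLTK is available:
#   table = compare_stemmers(["parse", "parser", "parsing"], nltk_stemmers())
#   for word, stems in table.items():
#       print(word, stems)
```

The three algorithms differ in how aggressively they strip suffixes, so comparing their output on a sample of real tags is a reasonable way to pick one.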

Anand answered Sep 21 '22 23:09
