
Snowball Stemmer Usage

I'd like to use the Snowball stemmer for merging word counts:
http://snowball.tartarus.org/download.html
The page has a download link, but I'm not sure how to integrate the files into my Eclipse project.
It's not just a jar to drop into my lib folder; it's a whole directory tree of files. Does anyone know of documentation explaining this? I didn't see any on the website.
(As in, what do I import, how do I call it, etc.)

Lemonio asked Jul 30 '13 19:07


1 Answer

Build the jar file and add it to your Build Path.

Details:

  • Download the tgz with the code from http://snowball.tartarus.org/download.php
  • Uncompress it.
  • Go to the libstemmer_java directory and read the README.
  • Follow its instructions to compile (using javac).
  • You might have to fix or remove java/org/tartarus/snowball/ext/frenchStemmer.java, because it has an error and doesn't compile.
  • Create the jar file: go to the libstemmer_java/java directory and run jar cvf libstemmer.jar *
  • Add libstemmer.jar to your Build Path (in Eclipse: Project > Properties > Java Build Path > Libraries tab).

Then you can use the stemmers with something like:

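import org.tartarus.snowball.ext.spanishStemmer;
...
// Stemmer classes are named after the language, in lowercase.
spanishStemmer stemmer = new spanishStemmer();
// Set the word to stem.
stemmer.setCurrent("torero");
// stem() returns a boolean; read the stemmed word back with getCurrent().
if (stemmer.stem()) {
    System.out.println(stemmer.getCurrent());
}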
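
Since the question is about merging word counts, here is a minimal sketch of that step. It is not part of the Snowball distribution: the class StemCountMerger and both of its methods are hypothetical names made up for illustration, and toyStem is a stand-in that only strips a trailing "s" so the example runs without libstemmer.jar on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical helper, not part of Snowball.
class StemCountMerger {

    // Sums the counts of all words that map to the same stem.
    static Map<String, Integer> mergeByStem(Map<String, Integer> wordCounts,
                                            UnaryOperator<String> stemFn) {
        Map<String, Integer> merged = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : wordCounts.entrySet()) {
            merged.merge(stemFn.apply(e.getKey()), e.getValue(), Integer::sum);
        }
        return merged;
    }

    // Stand-in stemmer (an assumption for this sketch): drops one trailing "s".
    // Swap in a function backed by a real Snowball stemmer once the jar is built.
    static String toyStem(String word) {
        return word.endsWith("s") ? word.substring(0, word.length() - 1) : word;
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        counts.put("torero", 2);
        counts.put("toreros", 3);
        counts.put("plaza", 1);
        // prints {torero=5, plaza=1}
        System.out.println(mergeByStem(counts, StemCountMerger::toyStem));
    }
}
```

With libstemmer.jar on the Build Path, the stand-in could be replaced by a function that calls setCurrent(word), then stem(), then returns getCurrent() on a stemmer instance such as spanishStemmer. Passing the stemmer in as a function keeps the merging logic independent of the language chosen.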

mjaque answered Sep 23 '22 07:09