Stemming English words with Lucene

I'm processing some English texts in a Java application, and I need to stem them. For example, from the text "amenities/amenity" I need to get "amenit".

The function looks like:

String stemTerm(String term) { ... }

I've found the Lucene Analyzer, but it looks far too complicated for what I need: http://lucene.apache.org/java/2_2_0/api/org/apache/lucene/analysis/PorterStemFilter.html

Is there a way to use it to stem words without building an Analyzer? I don't understand how Analyzers are meant to be set up.

EDIT: I actually need stemming plus lemmatization. Can Lucene do this?

asked Mar 22 '11 13:03

Mulone

1 Answer

SnowballAnalyzer is deprecated. You can use the Porter stemmer that ships with Lucene directly instead, without building an Analyzer:

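    // Assumes the Snowball-generated class bundled with Lucene's analyzers module.
    import org.tartarus.snowball.ext.PorterStemmer;

    PorterStemmer stem = new PorterStemmer();
    stem.setCurrent(word);              // word: the term to stem
    stem.stem();                        // runs the stemmer on the current word
    String result = stem.getCurrent();  // the stemmed form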
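To stem a whole text such as "amenities/amenity", one option is to split it into words and pass each word through a stemTerm-style function. The sketch below only shows that wiring and uses nothing beyond the JDK. The class name StemTermSketch, the stemText helper and the toyStem function are hypothetical names introduced here. toyStem is a crude suffix stripper standing in for a real stemmer; it is not the Porter algorithm, and its output will differ from Lucene's.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.function.UnaryOperator;

public class StemTermSketch {

    // Hypothetical stand-in for a real stemmer: strips a few common suffixes.
    // NOT the Porter algorithm; it exists only so the sketch runs on its own.
    public static String toyStem(String word) {
        for (String suffix : new String[] {"ies", "y", "s"}) {
            // Keep at least three characters of the word after stripping.
            if (word.length() > suffix.length() + 2 && word.endsWith(suffix)) {
                return word.substring(0, word.length() - suffix.length());
            }
        }
        return word;
    }

    // Stems a single term: lowercases it, then delegates to the supplied stemmer.
    public static String stemTerm(String term, UnaryOperator<String> stemmer) {
        if (term == null || term.isEmpty()) {
            return term;
        }
        return stemmer.apply(term.toLowerCase(Locale.ROOT));
    }

    // Splits the text on runs of non-letter characters and stems each word.
    public static List<String> stemText(String text, UnaryOperator<String> stemmer) {
        List<String> stems = new ArrayList<>();
        for (String token : text.split("[^\\p{L}]+")) {
            if (!token.isEmpty()) {
                stems.add(stemTerm(token, stemmer));
            }
        }
        return stems;
    }

    public static void main(String[] args) {
        // prints [amenit, amenit]
        System.out.println(stemText("amenities/amenity", StemTermSketch::toyStem));
    }
}
```

In a real application the stemmer argument would wrap the setCurrent / stem / getCurrent calls shown above instead of toyStem. Keeping the stemmer behind a UnaryOperator<String> means the rest of the code does not need to change when the implementation is swapped.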

Hope this helps!

answered Sep 17 '22 18:09

arbc

Tags: java, lucene, stemming, porter-stemmer
