Which word stemmer should I use in nltk?

Tags: nltk, linguistics

My goal is to analyze a corpus (Twitter, for now) for emotional content. Just today I realized it would make more sense to match on word stems than to maintain an exhaustive list of emotional words in every inflected form. So I've been exploring nltk.stem, only to find that it offers several different stemmers. I'd like to ask the Stack Overflow linguists which of LancasterStemmer, PorterStemmer, RegexpStemmer, RSLPStemmer, or WordNetStemmer is best, preferably with some justification.

asked Aug 12 '09 08:08

speciousfool

1 Answer

This may be a bit different from what you are asking, but the NodeBox Linguistics library contains an is_emotive() function, which seems to check whether a word is a recursive hyponym of certain emotional words. From commonsense.py:

    ekman = ["anger", "disgust", "fear", "joy", "sadness", "surprise"]
    other = ["emotion", "feeling", "expression"]

Not a stemmer, but an interesting approach to check out.
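To make the "recursive hyponym" idea concrete, here is a rough sketch of how such a check could work. This is not the NodeBox implementation: the HYPERNYMS table is a tiny made-up taxonomy, and is_emotive_sketch is a hypothetical name. Only the ekman and other lists come from the snippet above.

```python
# Toy hypernym ("is-a") links: each word maps to its more general parents.
# Hypothetical data for illustration only; not taken from NodeBox or WordNet.
HYPERNYMS = {
    "fury": ["rage"],
    "rage": ["anger"],
    "terror": ["fear"],
    "delight": ["joy"],
    "table": ["furniture"],
    "furniture": ["artifact"],
}

# The two root lists quoted from commonsense.py above.
ekman = ["anger", "disgust", "fear", "joy", "sadness", "surprise"]
other = ["emotion", "feeling", "expression"]
EMOTIVE_ROOTS = set(ekman + other)


def is_emotive_sketch(word, hypernyms=HYPERNYMS, roots=EMOTIVE_ROOTS):
    """Return True if word is a root word or a (recursive) hyponym of one."""
    seen = set()            # guards against cycles in the taxonomy
    stack = [word.lower()]  # words whose parents still need checking
    while stack:
        current = stack.pop()
        if current in roots:
            return True
        if current in seen:
            continue
        seen.add(current)
        # Climb one level: queue every more general term of this word.
        stack.extend(hypernyms.get(current, []))
    return False


if __name__ == "__main__":
    for w in ["fury", "terror", "table"]:
        print(w, is_emotive_sketch(w))
```

With a real lexical database the toy table would be replaced by actual hypernym lookups; in NLTK, for instance, wordnet.synsets(word) and Synset.hypernyms() provide that information, though wiring them in is left out here.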

answered Sep 20 '22 06:09

tomcat23
