I've tried PorterStemmer and Snowball but both don't work on all words, missing some very common ones.
My test words are: "cats running ran cactus cactuses cacti community communities", and both get less than half right.
See also:
2021 Jul.07. Stemming and lemmatization are methods used by search engines and chatbots to analyze the meaning behind a word. Stemming uses the stem of the word, while lemmatization uses the context in which the word is being used.
In order to lemmatize, you need to create an instance of the WordNetLemmatizer() and call the lemmatize() function on a single word. Let's lemmatize a simple sentence. We first tokenize the sentence into words using nltk. word_tokenize and then we will call lemmatizer.
I think stemming a lemmatized word is redundant if you get the same result than just stemming it (which is the result I expect). Nevertheless, the decision between stemmer and lemmatizer depends on your need. My intuition said that steamming increses recall and lowers precision and the opposite for a lemmatization.
Lemmatization is the grouping together of different forms of the same word. In search queries, lemmatization allows end users to query any version of a base word and get relevant results.
If you know Python, The Natural Language Toolkit (NLTK) has a very powerful lemmatizer that makes use of WordNet.
Note that if you are using this lemmatizer for the first time, you must download the corpus prior to using it. This can be done by:
>>> import nltk >>> nltk.download('wordnet')
You only have to do this once. Assuming that you have now downloaded the corpus, it works like this:
>>> from nltk.stem.wordnet import WordNetLemmatizer >>> lmtzr = WordNetLemmatizer() >>> lmtzr.lemmatize('cars') 'car' >>> lmtzr.lemmatize('feet') 'foot' >>> lmtzr.lemmatize('people') 'people' >>> lmtzr.lemmatize('fantasized','v') 'fantasize'
There are other lemmatizers in the nltk.stem module, but I haven't tried them myself.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With