I know how to apply SnowballStemmer to a single word (in my case, a Russian one):
from nltk.stem.snowball import SnowballStemmer
stemmer = SnowballStemmer("russian")
stemmer.stem("Василий")
'Васил'
How can I do the following if I have a list of words like ['Василий', 'Геннадий', 'Виталий']?
My attempt with a list comprehension does not seem to work:
l=[stemmer.stem(word) for word in l]
Snowball stemmer: This algorithm is also known as the Porter2 stemming algorithm. It is widely accepted as an improvement over the original Porter stemmer, an assessment shared by Martin Porter, the creator of the Porter stemmer, himself.
Stemming is a technique for extracting the base form of words by removing affixes from them, much like cutting a tree's branches back to its stem. For example, the stem of the words eating, eats, and eaten is eat. Search engines use stemming when indexing words.
The stem (root) is the part of the word to which you add inflectional (changing/deriving) affixes such as -ed, -ize, -s, or the prefixes de- and mis-. Stemming a word or sentence may therefore produce strings that are not actual words. Stems are created by removing the suffixes or prefixes attached to a word.
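To illustrate the point above, here is a naive suffix-stripper (a toy sketch, not the actual Snowball algorithm) that shows how stemming can produce stems that are not real words:

```python
def naive_stem(word):
    # Toy example only: strip a few common English suffixes.
    # Real stemmers like Snowball apply ordered rule sets with
    # measure/length conditions; this is just for illustration.
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

print([naive_stem(w) for w in ["eating", "eats", "studies"]])
# → ['eat', 'eat', 'studi']  -- note 'studi' is not a real word
```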
The tm package in R provides the stemDocument() function to stem a document to its roots. It either takes a character vector and returns a character vector, or takes a PlainTextDocument and returns a PlainTextDocument. For example, stemDocument(c("running", "runs", "ran")) returns c("run", "run", "ran") as the output.
Your variable l is not defined before it is used on the right-hand side of the comprehension, which causes the NameError. See the last two lines below for the fix.
>>> from nltk.stem.snowball import SnowballStemmer
>>> stemmer = SnowballStemmer("russian")
>>> my_words = ['Василий', 'Геннадий', 'Виталий']
>>> l=[stemmer.stem(word) for word in l]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'l' is not defined
>>> l=[stemmer.stem(word) for word in my_words]
>>> l
['васил', 'геннад', 'витал']
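The fix generalizes: a list comprehension applies any function element by element, as long as the input list already exists. A minimal sketch using the built-in str.lower as a stand-in for stemmer.stem (so it runs without NLTK installed):

```python
words = ['Василий', 'Геннадий', 'Виталий']

# str.lower stands in for stemmer.stem here; any callable works the same way
lowered = [w.lower() for w in words]

# An equivalent form using map:
lowered_via_map = list(map(str.lower, words))

print(lowered)  # → ['василий', 'геннадий', 'виталий']
```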