
Why are there different Lemmatizers in NLTK library?

>>> from nltk.stem import WordNetLemmatizer as lm1
>>> from nltk import WordNetLemmatizer as lm2
>>> from nltk.stem.wordnet import WordNetLemmatizer as lm3

For me, all three work the same way, but just to confirm: do they provide anything different?

Abhishek asked Nov 09 '16 18:11



1 Answer

No, they are not different. All three names refer to the same class.

>>> from nltk.stem import WordNetLemmatizer as lm1
>>> from nltk import WordNetLemmatizer as lm2
>>> from nltk.stem.wordnet import WordNetLemmatizer as lm3
>>> lm1 == lm2
True
>>> lm2 == lm3
True
>>> lm1 == lm3
True

As erip pointed out, this happens because the class is defined in one place and imported in the others:

The class (WordNetLemmatizer) is originally written in nltk.stem.wordnet, so you can do from nltk.stem.wordnet import WordNetLemmatizer as lm3

It is also imported in nltk's __init__.py file, so you can do from nltk import WordNetLemmatizer as lm2

And it is also imported in the __init__.py of the nltk.stem module, so you can do from nltk.stem import WordNetLemmatizer as lm1
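
To see the re-export mechanism without depending on NLTK being installed, here is a minimal sketch that builds a toy package in memory. The names toypkg and Lemmatizer are hypothetical stand-ins for nltk and WordNetLemmatizer. The layout only mimics the pattern described above and is not NLTK's actual source.

```python
import sys
import types

# Hypothetical in-memory layout (an assumption for illustration):
#   toypkg/stem/wordnet.py   defines Lemmatizer
#   toypkg/stem/__init__.py  re-exports Lemmatizer
#   toypkg/__init__.py       re-exports Lemmatizer again
toypkg = types.ModuleType("toypkg")
stem = types.ModuleType("toypkg.stem")
wordnet = types.ModuleType("toypkg.stem.wordnet")

# An empty __path__ marks a module as a package for the import system.
toypkg.__path__ = []
stem.__path__ = []

# Define the class once, in the innermost module.
wordnet.Lemmatizer = type(
    "Lemmatizer", (), {"__module__": "toypkg.stem.wordnet"}
)

# A re-export only binds another name to the same class object.
stem.wordnet = wordnet
stem.Lemmatizer = wordnet.Lemmatizer
toypkg.stem = stem
toypkg.Lemmatizer = stem.Lemmatizer

# Register the modules so ordinary import statements can find them.
sys.modules.update({
    "toypkg": toypkg,
    "toypkg.stem": stem,
    "toypkg.stem.wordnet": wordnet,
})

from toypkg.stem import Lemmatizer as lm1
from toypkg import Lemmatizer as lm2
from toypkg.stem.wordnet import Lemmatizer as lm3

# Identity (is) is a stricter check than ==: all three are one object.
same_object = lm1 is lm2 is lm3
# __module__ records where the class was defined, not where it was imported from.
defining_module = lm1.__module__

print(same_object, defining_module)
```

The same two checks (is, and the class's __module__ attribute) can be run on the three NLTK imports to confirm which module actually defines the class.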

harshil9968 answered Oct 23 '22 09:10