Is there a way to correctly remove the tense or plural from a word?

Tags: python, nltk

Is it possible to change words like running, helping, cooks, finds and happily into run, help, cook, find and happy using NLTK?

user3378734 asked Apr 12 '15 at 14:04



2 Answers

NLTK implements several stemming algorithms. The Lancaster stemmer gives the results you want for these words:

>>> from nltk.stem.lancaster import LancasterStemmer
>>> st = LancasterStemmer()
>>> st.stem('happily')
'happy'
>>> st.stem('cooks')
'cook'
>>> st.stem('helping')
'help'
>>> st.stem('running')
'run'
>>> st.stem('finds')
'find'
Irshad Bhat answered Oct 17 '22 at 15:10



>>> from nltk.stem import WordNetLemmatizer
>>> wnl = WordNetLemmatizer()
>>> ls = ['running', 'helping', 'cooks', 'finds']
>>> # With no POS argument, lemmatize() treats every word as a noun.
>>> [wnl.lemmatize(i) for i in ls]
['running', 'helping', 'cook', 'find']
>>> # Tagged as verbs ('v'), the -ing forms reduce to their base form.
>>> ls = [('running', 'v'), ('helping', 'v'), ('cooks', 'v'), ('finds', 'v')]
>>> [wnl.lemmatize(word, pos) for word, pos in ls]
['run', 'help', 'cook', 'find']
>>> # Tagged explicitly as nouns ('n'), the result matches the default.
>>> ls = [('running', 'n'), ('helping', 'n'), ('cooks', 'n'), ('finds', 'n')]
>>> [wnl.lemmatize(word, pos) for word, pos in ls]
['running', 'helping', 'cook', 'find']
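
The session above hard-codes the part-of-speech letter for every word. In practice the tags usually come from a tagger such as nltk.pos_tag, which by default returns Penn Treebank tags like 'VBG' rather than the single letters that lemmatize() accepts. Here is a minimal sketch of a mapping between the two. The helper name penn_to_wordnet is hypothetical (it is not part of NLTK), and the tagged list is written by hand so that the snippet runs without NLTK installed:

```python
# Hypothetical helper (not part of NLTK): map a Penn Treebank tag to the
# single-letter POS code that WordNetLemmatizer.lemmatize() accepts.
def penn_to_wordnet(tag, default="n"):
    """Return 'a', 'v', 'r' or 'n' for a Penn Treebank tag such as 'VBG'."""
    if tag.startswith("JJ"):   # JJ, JJR, JJS: adjectives
        return "a"
    if tag.startswith("VB"):   # VB, VBD, VBG, VBN, VBP, VBZ: verbs
        return "v"
    if tag.startswith("RB"):   # RB, RBR, RBS: adverbs
        return "r"
    if tag.startswith("NN"):   # NN, NNS, NNP, NNPS: nouns
        return "n"
    return default             # anything else falls back to the default


# Hand-written example input (an assumption for illustration); in practice
# it would come from a tagger such as nltk.pos_tag(tokens).
tagged = [("running", "VBG"), ("helping", "VBG"), ("cooks", "VBZ"), ("finds", "VBZ")]

# Build (word, pos) pairs in the shape used by the lemmatizer session.
pairs = [(word, penn_to_wordnet(tag)) for word, tag in tagged]
print(pairs)
```

Each resulting (word, pos) pair can then be passed to wnl.lemmatize(word, pos) exactly as in the verb example. Falling back to 'n' for unknown tags mirrors the lemmatizer's own default, but a tagger can mislabel words out of context, so the results are only as good as the tags.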

See also the related question "Porter Stemming of fried".

alvas answered Oct 17 '22 at 15:10
