
How to remove plurals in a list of nouns?

Tags:

python

I have a list of strings:

['bill', 'simpsons', 'cosbys', 'cosby', 'bills', 'mango', 'mangoes']

What is the best way to remove all the plurals from this list? I want the output to be:

['bill', 'simpsons', 'cosby', 'mango']
Bruce asked Nov 13 '11 04:11



1 Answer

In general, this process is called 'stemming', and there is a Python package, also called 'stemming', that implements it.

Used like so:

from stemming.porter2 import stem

# Reduce a single word to its stem and show the result
print(stem("simpsons"))

Stemming does more than strip plural endings (it also removes suffixes such as "-ed" and "-ing"), but you could modify the stemming package so that it only performs the plural step. Take a look at the source: http://tartarus.org/martin/PorterStemmer/python.txt
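
If you only care about plurals, a rough alternative is to skip the stemmer and strip plural endings yourself. The sketch below is not the stemming package's code and not the Porter algorithm; it is a naive suffix heuristic, and the helper names singular_candidates and remove_plurals are made up for illustration. It assumes a word should be dropped only when one of its guessed singular forms is also in the list, which is what the expected output in the question suggests (note that 'simpsons' is kept).

```python
def singular_candidates(word):
    """Guess possible singular forms with naive suffix rules (hypothetical helper)."""
    candidates = []
    if word.endswith("ies") and len(word) > 3:
        candidates.append(word[:-3] + "y")   # "ponies" -> "pony"
    if word.endswith("es") and len(word) > 2:
        candidates.append(word[:-2])         # "mangoes" -> "mango"
    if word.endswith("s") and not word.endswith("ss") and len(word) > 1:
        candidates.append(word[:-1])         # "bills" -> "bill"
    return candidates


def remove_plurals(words):
    """Drop every word whose guessed singular also appears in the list."""
    present = set(words)
    return [w for w in words
            if not any(c in present for c in singular_candidates(w))]


words = ['bill', 'simpsons', 'cosbys', 'cosby', 'bills', 'mango', 'mangoes']
print(remove_plurals(words))  # ['bill', 'simpsons', 'cosby', 'mango']
```

This will not handle irregular plurals such as "mice" or "children", and it can misfire when a word that merely ends in "s" happens to have its shorter form in the list, so treat it as a starting point rather than a general solution.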

Anthony Blake answered Sep 30 '22 14:09
