I am trying to determine whether a word is singular or plural using NLTK's pos_tag, but the results are not accurate.
So I need another way to tell whether a word is singular or plural. Moreover, I need it to work without using any Python package.
The easiest way to tell if a noun is a singular noun or a plural noun is to look at how much of something it is referring to. If it is only referring to one person or thing, it is a singular noun. If it is referring to more than one person or thing, it is a plural noun.
A plural noun is the form of a noun used to show that there is more than one. Most nouns simply add –s or –es to the end to become plural.
The difference between singular and plural nouns is easy to spot. When a noun indicates one only, it is a singular noun. When a noun indicates more than one, it is plural.
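Since the question asks for something that works without any package, here is a minimal sketch of the "add -s or -es" rule run in reverse. It is only a heuristic: the irregular-plural table and the list of singular-looking endings are illustrative assumptions, not a complete description of English, and the name is_plural is made up for this example.

```python
# Rough, dependency-free heuristic. Both tables are illustrative and far from exhaustive.
IRREGULAR_PLURALS = {
    "geese": "goose", "mice": "mouse", "men": "man", "women": "woman",
    "children": "child", "feet": "foot", "teeth": "tooth", "people": "person",
}
# Endings of nouns that end in "s" but are usually singular (glass, bus, analysis).
SINGULAR_S_ENDINGS = ("ss", "us", "is")


def is_plural(word):
    """Guess whether an English noun is plural.

    Returns a tuple (is_plural, singular_guess).
    """
    w = word.lower()
    if w in IRREGULAR_PLURALS:
        return True, IRREGULAR_PLURALS[w]
    if w in IRREGULAR_PLURALS.values():
        return False, w
    # Anything that does not end in "s", or looks like a singular "s" word, is singular.
    if len(w) < 3 or not w.endswith("s") or w.endswith(SINGULAR_S_ENDINGS):
        return False, w
    # families -> family (short words such as "ties" fall through to the plain "s" rule)
    if w.endswith("ies") and len(w) > 4:
        return True, w[:-3] + "y"
    # boxes -> box, churches -> church, classes -> class
    if w.endswith(("ches", "shes", "sses", "xes", "zes")):
        return True, w[:-2]
    # dogs -> dog
    return True, w[:-1]


for noun in ["geese", "families", "family", "dogs", "dog", "foos", "glass"]:
    print(noun, is_plural(noun))
```

Expect mistakes on words the rules do not cover (for example "gas" is reported as plural), so extend the tables for your own vocabulary.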
In English, nearly every noun has a root form (lemma), and that lemma is singular by default. So a noun that differs from its lemma can be treated as plural.
Assuming that you have only nouns in your list, you can try this:
from nltk.stem import WordNetLemmatizer

wnl = WordNetLemmatizer()

def isplural(word):
    # Lemmatize as a noun; a word that differs from its lemma is treated as plural.
    lemma = wnl.lemmatize(word, 'n')
    # Compare by value (!=), not by identity (is not).
    plural = word != lemma
    return plural, lemma

nounls = ['geese', 'mice', 'bars', 'foos', 'foo',
          'families', 'family', 'dog', 'dogs']

for nn in nounls:
    isp, lemma = isplural(nn)
    print(nn, lemma, isp)
This breaks down when the word is not in WordNet: the lemmatizer returns it unchanged, so it is reported as singular. For such words you will need a more sophisticated classifier or a finite-state machine outside of NLTK.
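One cheap way to soften the out-of-vocabulary problem is to fall back to a suffix rule only when the dictionary does not know the word. The sketch below assumes you pass in two callables: lemmatize (for example wnl.lemmatize with 'n') and in_vocab (for example a check that WordNet returns any noun synsets for the word). The function name, the parameter names and the toy dictionary are all hypothetical, used here so the example runs without NLTK.

```python
def is_plural_with_fallback(word, lemmatize, in_vocab):
    """Return (is_plural, singular_guess) using a dictionary first, suffix rule second."""
    lemma = lemmatize(word)
    if lemma != word:
        # The dictionary reduced the word to a different lemma, so it is plural.
        return True, lemma
    if in_vocab(word):
        # The dictionary knows the word and left it unchanged, so it is singular.
        return False, word
    # Unknown word: crude fallback that strips a trailing "s".
    if len(word) > 2 and word.endswith("s") and not word.endswith(("ss", "us", "is")):
        return True, word[:-1]
    return False, word


# Toy stand-in for a real lemmatizer's dictionary (assumption, for demonstration only).
TOY_LEMMAS = {"geese": "goose", "goose": "goose", "dogs": "dog", "dog": "dog", "gas": "gas"}


def toy_lemmatize(word):
    return TOY_LEMMAS.get(word, word)


def toy_in_vocab(word):
    return word in TOY_LEMMAS


for noun in ["geese", "dogs", "gas", "foos", "foo"]:
    print(noun, is_plural_with_fallback(noun, toy_lemmatize, toy_in_vocab))
```

The fallback is only a guess, so it is worth logging which words reach it and reviewing them by hand.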
Assuming you want an English solution, you can do something similar to 2er0's solution a bit more directly with pattern-en:
from pattern.en import singularize

def isplural(pluralForm):
    # A word that changes when singularized is treated as plural.
    singularForm = singularize(pluralForm)
    # Compare by value (!=), not by identity (is not).
    plural = pluralForm != singularForm
    return plural, singularForm

nounls = ['geese', 'mice', 'bars', 'foos', 'foo',
          'families', 'family', 'dog', 'dogs']

for pluralForm in nounls:
    isp, singularForm = isplural(pluralForm)
    print(pluralForm, singularForm, isp)
which outputs
geese goose True
mice mouse True
bars bar True
foos foo True
foo foo False
families family True
family family False
dog dog False
dogs dog True
The only difference in output between 2er0's solution and this one is
foos foo True
since his solution outputs False which, as he pointed out, is because foos is not in WordNet (and is not an English word at all).