
Where can I find a text list or library that contains a list of common foods? [closed]

I'm writing a Python script that parses emails and searches their text for words that are common food items. I need some way to determine whether a given word is a food item.

I've looked at several natural language processing APIs (such as AlchemyAPI and NLTK 2.0) and they appear to have Named Entity Extraction (which is what I want), but I can't find an entity type for food in particular.

It's quite possible that natural language processing is overkill for what I need; I just want a list of foods that I can match against. Where can I find such a word list? Do I need to write my own scraper to pull it from some online source, or is there an easier way?

abergal asked Oct 28 '13 03:10


3 Answers

It would be really nice to have all food items in one single list, but sadly that's only the ideal case.

You can try accessing the food synset in WordNet. If you are using NLTK, try:

>>> from nltk.corpus import wordnet as wn
>>> food = wn.synset('food.n.02')
>>> list(set([w for s in food.closure(lambda s:s.hyponyms()) for w in s.lemma_names()]))

alvas answered Nov 10 '22 19:11


AFAIK, there is no entity type for common foods in NLTK or similar tools. You will most likely have to construct a list yourself.

But, thankfully, the internet is your friend. Here are a few good sources to start with; they cover a lot of common vegetables and fruits in the English-speaking world:

  • http://vegetablesfruitsgrains.com/list-of-vegetables/
  • http://edis.ifas.ufl.edu/features/fruitvegindex.html
  • http://www.enchantedlearning.com/wordlist/vegetables.shtml

Good luck!
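
If you collect the names by hand into a plain text file, one item per line, loading them could look something like the sketch below. The one-item-per-line format, the `#` comment convention, and the name `load_food_list` are all assumptions for illustration, not something those sites provide.

```python
def load_food_list(lines):
    """Build a set of food terms from an iterable of text lines.

    Blank lines and lines starting with '#' are skipped;
    everything else is stripped and lowercased.
    """
    terms = set()
    for line in lines:
        term = line.strip().lower()
        if term and not term.startswith("#"):
            terms.add(term)
    return terms
```

Since an open file is an iterable of lines, you can pass a file object straight in, e.g. `load_food_list(open("foods.txt", encoding="utf-8"))`, where `foods.txt` is a hypothetical file you maintain yourself.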

jrd1 answered Nov 10 '22 21:11


Since named entities are proper nouns (people, places, companies, and so on), it's unlikely that NLP entity extraction will work for finding common food names. The NLP function that might work is keyword extraction. I ran a few recipes through AlchemyAPI's demo and the ingredients were identified as keywords. So that gets you part of the way there, but you'll still need to compare the keywords to a list of common food items, like jrd1 mentioned.
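
The comparison step itself doesn't need an NLP service. As a rough sketch, assuming you already have a set of lowercase food terms, you can check every run of one to three consecutive words in the text against that set. The name `find_food_terms` and the `max_words` parameter are hypothetical.

```python
import re


def find_food_terms(text, food_terms, max_words=3):
    """Return the entries of food_terms that occur in text as whole words.

    Multi-word terms such as 'ice cream' are matched by checking every
    run of up to max_words consecutive words.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    found = set()
    for n in range(1, max_words + 1):
        for i in range(len(tokens) - n + 1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in food_terms:
                found.add(candidate)
    return found
```

This only does exact matching, so plurals ("apples") and misspellings are missed unless the list contains them. Stemming or lemmatizing both the text and the list would be one way to handle that.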

Steve answered Nov 10 '22 21:11