
Fully parsable dictionary/thesaurus

I'm in the early stages of designing a series of simple word games which I hope will help me learn new words. A crucial part of my idea is a fully parsable dictionary: I want to be able to use regular expressions to search the dictionary for given words and extract other pieces of information (e.g. definition, type (noun/verb...), synonyms, antonyms, quotes demonstrating the word in use, etc). I currently have Wordbook (a Mac app), which I find okay, but I haven't figured out whether I can parse it with a Python script. I'm assuming I can't, and was wondering if anyone knows of a reasonable dictionary that will allow this. Ideally I would do all this without relying on the internet.

Thanks

Paul Patterson asked May 23 '11 22:05


3 Answers

The NLTK WordNet corpus provides a programmatic interface to a "large lexical database of English words". You can navigate the word graph through a variety of relationships. It covers the requested "definition, part-of-speech, synonyms, antonyms, quotes", and the corpus is downloadable, so it can be used offline.

Another option would be to download a recent snapshot of Wiktionary data and parse it into a format you can use, but this may be a bit involved (unless a decent Python Wiktionary parser already exists).
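As a rough idea of what the Wiktionary route involves, here is a minimal sketch that streams pages out of a MediaWiki XML export and keeps the titles that have an English section. It assumes the dump uses the usual `<page>`/`<title>`/`<text>` layout and that English entries are marked with an `==English==` heading; the `iter_pages` and `local` helpers and the tiny `SAMPLE` document are made up for illustration, and turning the wikitext itself into definitions and synonyms is the genuinely involved part that this does not attempt.

```python
import io
import xml.etree.ElementTree as ET

def local(tag):
    """Strip the XML namespace from a tag name."""
    return tag.rsplit('}', 1)[-1]

def iter_pages(source):
    """Yield (title, wikitext) for each <page> in a MediaWiki XML export."""
    title, text = None, ''
    for _event, elem in ET.iterparse(source, events=('end',)):
        name = local(elem.tag)
        if name == 'title':
            title = elem.text
        elif name == 'text':
            text = elem.text or ''
        elif name == 'page':
            yield title, text
            title, text = None, ''
            elem.clear()  # drop the finished page's children to limit memory use

# Made-up two-page sample standing in for a real dump file.
SAMPLE = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>near</title>
    <revision><text>==English==
===Adjective===
# Physically close.</text></revision>
  </page>
  <page>
    <title>proche</title>
    <revision><text>==French==
===Adjective===
# near</text></revision>
  </page>
</mediawiki>"""

english = [t for t, body in iter_pages(io.BytesIO(SAMPLE)) if '==English==' in body]
print(english)
```

For a real snapshot you would pass an open file (or a `bz2.open(...)` handle) instead of the `BytesIO` object.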

Here is an example of printing out some attributes using WordNet (written for Python 3 and NLTK 3, where the Synset and Lemma accessors are methods):

import textwrap
from nltk.corpus import wordnet as wn  # needs the corpus: nltk.download('wordnet')

# WordNet part-of-speech tags mapped to readable names
POS = {
    'v': 'verb', 'a': 'adjective', 's': 'satellite adjective',
    'n': 'noun', 'r': 'adverb'}

def info(word, pos=None):
    """Print definition, synonyms, antonyms and examples for each sense."""
    ind = ' ' * 12  # aligns continuation lines under the first value
    for i, syn in enumerate(wn.synsets(word, pos)):
        syns = [n.replace('_', ' ') for n in syn.lemma_names()]
        ants = [a for m in syn.lemmas() for a in m.antonyms()]
        defn = textwrap.wrap(syn.definition(), 64)
        print('sense %d (%s)' % (i + 1, POS[syn.pos()]))
        print('definition: ' + ('\n' + ind).join(defn))
        print('  synonyms:', ', '.join(syns))
        if ants:
            print('  antonyms:', ', '.join(a.name() for a in ants))
        if syn.examples():
            print('  examples: ' + ('\n' + ind).join(syn.examples()))
        print()

info('near')

Output:

sense 1 (verb)
definition: move towards
  synonyms: approach, near, come on, go up, draw near, draw close, come near
  examples: We were approaching our destination
            They are drawing near
            The enemy army came nearer and nearer

sense 2 (adjective)
definition: not far distant in time or space or degree or circumstances
  synonyms: near, close, nigh
  antonyms: far
  examples: near neighbors
            in the near future
            they are near equals
...
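Since the question asks about regular expression searches, here is a minimal sketch of filtering a word list by pattern. The `search_words` helper and the `sample` list are hypothetical stand-ins so the snippet runs without any corpus installed; with the WordNet data available, `wn.all_lemma_names()` could be passed as the word list instead.

```python
import re

def search_words(pattern, words):
    """Return the words that match `pattern` in full, sorted alphabetically."""
    rx = re.compile(pattern)
    return sorted(w for w in set(words) if rx.fullmatch(w))

# Tiny made-up word list; WordNet joins multiword entries with underscores.
sample = ['near', 'nearby', 'nearly', 'draw_near', 'far', 'approach']

print(search_words(r'near.*', sample))   # words starting with "near"
print(search_words(r'.*_near', sample))  # multiword entries ending in "near"
```

Each match can then be handed to a lookup function to pull out its definition, synonyms and so on.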
samplebias answered Oct 12 '22 22:10


Wordnik has a Python API.

JiminyCricket answered Oct 12 '22 22:10


To my knowledge, dictionary.com offers a free API for noncommercial use. You might be able to pull some of the data off the internet.

Aaa answered Oct 12 '22 23:10
