
How to create an English-language dictionary application with Python (Django)?

I would like to create an online dictionary application using Python (or Django).

It will be similar to http://dictionary.reference.com/.

PS: the dictionary is not stored in a database. It is stored in a text file or a gzip file. Free English dictionary files can be downloaded from this URL: dicts.info/dictionaries.php.

The simplest free dictionary files use this format:

word1 explanation for word1 

word2 explanation for word2 

There are some other formats as well, but all of them are stored in either a text file or a text.gz file.

My questions are:

(1) Is there an existing open-source Python package, module, or application that implements this functionality, which I can use or learn from?

(2) If the answer to the first question is no, which algorithm should I follow to build such a web application? Can I simply use Python's built-in dictionary object, so that the key is the English word and the value is the explanation? Is this OK in terms of performance? Or do I have to create my own tree object to speed up the search? Or is there an existing package that handles this properly?

Thank you very much.

SSS asked May 20 '10 07:05




2 Answers

You might want to check out NLTK (http://www.nltk.org/). You can get lots of words and their definitions without having to worry about the implementation details of a database. If you're new to all this, it would at the very least get you up and running; once you have a working version, you can start putting in a database.

Here's a quick snippet of how to get all the available meanings of "dog" from that package:

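# Requires the WordNet corpus; fetch it once with nltk.download('wordnet')
from nltk.corpus import wordnet
for word_meaning in wordnet.synsets('dog'):
    # definition() is a method in NLTK 3; print() is a function in Python 3
    print(word_meaning.definition())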

brainysmurf answered Oct 13 '22 17:10


I'm not sure what functionality you are talking about. If you mean "searching for keywords in a dictionary that is stored in your database", then a Python dictionary is not a practical solution, because you would have to load your whole database into memory in order to run a search.

You should rather look at the Django 'search' applications. A lot of people recommend Haystack:

What's the best Django search app?

and use this search engine to look up keywords in your database.

If you don't need to support sophisticated searches, you could also query your database for an exact keyword:

DictEntry.objects.get(keyword='something').definition

I guess it all depends on the level of sophistication you want to achieve, but there can be extremely simple solutions.

EDIT:

If the dictionaries come from files, then it's hard to say; you have plenty of options.

If the file is small, you could indeed deserialize it into a dictionary when the server starts, and then always search the same instance (so you don't have to deserialize it again for each request).
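A minimal sketch of that approach, assuming the simple one-entry-per-line format from the question, where the first whitespace-separated token is the word and the rest of the line is the explanation. The names `parse_entries`, `load_dictionary`, and `lookup` are hypothetical, and treating lookups as case-insensitive is an assumption, not something the dictionary files require.

```python
import gzip


def parse_entries(lines):
    """Build {word: explanation} from lines shaped like 'word explanation'."""
    entries = {}
    for line in lines:
        parts = line.strip().split(None, 1)  # split on the first run of whitespace
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        word, explanation = parts
        entries[word.lower()] = explanation  # assumption: case-insensitive keys
    return entries


def load_dictionary(path):
    """Read a plain-text or gzip-compressed dictionary file into a dict."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as handle:
        return parse_entries(handle)


def lookup(entries, word):
    """Return the explanation for a word, or None if it is not present."""
    return entries.get(word.strip().lower())
```

Calling `load_dictionary` once at module level (rather than inside a view) would keep a single dict per server process, so each request only pays for a hash lookup.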

If the files are really big, you could consider migrating them into your database.

1) First create your Django models, so that you know what data you need, the names of your fields, and so on. For example:

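from django.db import models

class DictEntry(models.Model):
    keyword = models.CharField(max_length=100, db_index=True)  # indexed for exact lookups
    definition = models.TextField()  # definitions can easily exceed 100 characters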

2) It seems that some of the files at the link you gave are in CSV format (and it seems you can also get them as XML). With the csv module from the standard library, you can read these files into Python.

3) Then, with the json or yaml Python libraries, dump the data to a different format (JSON or YAML), as described in the Django documentation on providing initial data for models. And like magic, your initial data is ready!
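Steps 2 and 3 could be sketched roughly as follows. This assumes a two-column CSV (keyword, then definition) and a Django app named `dictionary` containing the `DictEntry` model; both the column layout and the `dictionary.dictentry` label are assumptions, so adjust them to the actual file and project.

```python
import csv
import io
import json


def csv_to_fixture(csv_file, model_label="dictionary.dictentry"):
    """Convert CSV rows of (keyword, definition) into Django fixture records."""
    records = []
    for row in csv.reader(csv_file):
        if len(row) < 2:
            continue  # skip empty rows and rows missing a definition
        records.append({
            "model": model_label,          # hypothetical "app_label.modelname"
            "pk": len(records) + 1,        # sequential primary keys
            "fields": {
                "keyword": row[0].strip(),
                "definition": row[1].strip(),
            },
        })
    return records


# Small in-memory sample standing in for a downloaded CSV file.
sample = io.StringIO("dog,a domesticated canine\ncat,a small domesticated feline\n")
fixture = csv_to_fixture(sample)
print(json.dumps(fixture, indent=2))
```

Writing that JSON to a file inside the app's fixtures directory should let `python manage.py loaddata` import it into the database.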

PS: the good thing about Python is that if you google 'python json', you will find the official docs, because a library for reading and writing JSON is part of the Python standard library. The same goes for XML and CSV.

sebpiq answered Oct 13 '22 17:10