
English dictionary as txt or xml file with support of synonyms [closed]

Can someone point me to where I can download an English dictionary as a txt or XML file? I am building a simple app for myself and am looking for something that I could start using immediately without learning a complex API.

Support for synonyms would be great; that is, it should be easy to retrieve all the synonyms for a particular word.

It would be absolutely fantastic if the dictionary listed both the British and American spellings of words where they differ.

Even a small dictionary (a few thousand words) would be OK; I only need it for a small project.

I would even be willing to buy one if the price is reasonable and the dictionary is easy to use; simple XML would be great.

Any pointers, please?

Simon asked Apr 19 '10 11:04



3 Answers

WordNet is what you want. It's big, containing over a hundred thousand entries, and it's freely available.

However, it's not stored as XML. To access the data, you'll want to use one of the existing WordNet APIs for your language of choice.

Using the APIs is generally pretty straightforward, so I don't think you have to worry much about "learning (a) complex API". For example, borrowing from the WordNet HOWTO for the Python-based Natural Language Toolkit (NLTK):

 >>> from nltk.corpus import wordnet
 >>>
 >>> # Get all synsets for 'dog'
 >>> # This is essentially all senses of the word in the db
 >>> wordnet.synsets('dog')
 [Synset('dog.n.01'), Synset('frump.n.01'), Synset('dog.n.03'),
  Synset('cad.n.01'), Synset('frank.n.02'), Synset('pawl.n.01'),
  Synset('andiron.n.01'), Synset('chase.v.01')]

 >>> # Get the definition and usage for the first synset
 >>> # (definition(), examples() and lemmas() are methods in NLTK 3;
 >>> # older releases exposed them as plain attributes)
 >>> wordnet.synset('dog.n.01').definition()
 'a member of the genus Canis (probably descended from the common
 wolf) that has been domesticated by man since prehistoric times;
 occurs in many breeds'
 >>> wordnet.synset('dog.n.01').examples()
 ['the dog barked all night']

 >>> # Get antonyms for 'good'
 >>> wordnet.synset('good.a.01').lemmas()[0].antonyms()
 [Lemma('bad.a.01.bad')]

 >>> # Get synonyms for the first noun sense of 'dog'
 >>> wordnet.synset('dog.n.01').lemmas()
 [Lemma('dog.n.01.dog'), Lemma('dog.n.01.domestic_dog'),
 Lemma('dog.n.01.Canis_familiaris')]

 >>> # Get synonyms for all senses of 'dog'
 >>> for synset in wordnet.synsets('dog'): print(synset.lemmas())
 [Lemma('dog.n.01.dog'), Lemma('dog.n.01.domestic_dog'),
 Lemma('dog.n.01.Canis_familiaris')]
 ...
 [Lemma('frank.n.02.frank'), Lemma('frank.n.02.frankfurter'),
 ...

While there is an American English bias in WordNet, it supports British spellings and usage. For example, you can look up 'colour' and one of the synsets for 'lift' is 'elevator.n.01'.
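If all you need is a flat list of synonyms for a word, the per-synset lookups can be folded into one small helper. This is only a minimal sketch: the names `synonyms` and `lookup` are made up for illustration, and it assumes NLTK 3, where `lemma_names()` is a method on each synset. The NLTK import is deferred so the helper itself works with any objects that offer `lemma_names()`.

```python
def synonyms(word, synsets):
    """Collect distinct lemma names across all given synsets, skipping the word itself.

    `synsets` is any iterable of objects with a lemma_names() method,
    e.g. the list returned by nltk.corpus.wordnet.synsets(word).
    """
    found = []
    for synset in synsets:
        for name in synset.lemma_names():
            # WordNet joins multi-word lemmas with underscores
            readable = name.replace('_', ' ')
            if readable.lower() != word.lower() and readable not in found:
                found.append(readable)
    return found


def lookup(word):
    """Hypothetical convenience wrapper: synonyms for `word` across all WordNet senses."""
    # Imported lazily so synonyms() can be used and tested without NLTK installed
    from nltk.corpus import wordnet
    return synonyms(word, wordnet.synsets(word))
```

Calling `lookup('colour')` or `lookup('lift')` should then return the lemma names from every sense in one list, assuming the WordNet corpus has been fetched with `nltk.download('wordnet')`.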

Notes on XML

If having the data represented as XML is essential, you could easily use one of the APIs to access the WordNet database and convert it into XML, e.g. see Thinking XML: Querying WordNet as XML.
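As a rough illustration of that conversion, here is a sketch that turns the synsets for one word into a small XML fragment using Python's standard `xml.etree.ElementTree`. The element and attribute names (`entry`, `sense`, `definition`, `synonym`) and the function names are invented for this example and are not a standard WordNet XML schema; the synset methods assume NLTK 3.

```python
import xml.etree.ElementTree as ET


def synsets_to_xml(word, synsets):
    """Serialize synset-like objects (name(), definition(), lemma_names()) to an XML string."""
    entry = ET.Element('entry', {'word': word})
    for synset in synsets:
        # One <sense> element per synset, keyed by its WordNet name (e.g. dog.n.01)
        sense = ET.SubElement(entry, 'sense', {'id': synset.name()})
        ET.SubElement(sense, 'definition').text = synset.definition()
        for name in synset.lemma_names():
            # Multi-word lemmas use underscores in WordNet; show them as spaces
            ET.SubElement(sense, 'synonym').text = name.replace('_', ' ')
    return ET.tostring(entry, encoding='unicode')


def export_word(word):
    """Hypothetical wrapper that pulls the synsets from NLTK's WordNet corpus."""
    # Imported lazily so synsets_to_xml() works without NLTK installed
    from nltk.corpus import wordnet
    return synsets_to_xml(word, wordnet.synsets(word))
```

Looping `export_word` over a word list of your own and writing the results to a file would give you a simple XML dictionary of whatever size you need.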

dmcer answered Oct 14 '22 06:10


I know this question is quite old, but I had trouble finding this as a txt file myself. If anyone is looking for a synonyms and antonyms database as a txt file, the simplest yet very detailed one I found is https://ia801407.us.archive.org/10/items/synonymsantonyms00ordwiala/synonymsantonyms00ordwiala_djvu.txt .

pc_ answered Oct 14 '22 06:10


I have used Roget's thesaurus in the past. It has the synonymy information in plain text files. There is also some Java code to help you parse the text.

These pages provide links to a bunch of thesauri/lexical resources, some of which are freely downloadable.

http://www.w3.org/2001/sw/Europe/reports/thes/thes_links.html

http://www-a2k.is.tokushima-u.ac.jp/member/kita/NLP/lex.html

hashable answered Oct 14 '22 05:10