
Is there a dictionary I can download for Java?

Tags:

java

Is there a dictionary I can download for Java? I want to have a program that takes a few random letters and sees if they can be rearranged into a real word by checking them against the dictionary.

David asked Dec 09 '22 16:12


2 Answers

Is there a dictionary I can download for Java?

Others have already answered this... Maybe you weren't simply talking about a dictionary file but about a spellchecker?

I want to have a program that takes a few random letters and sees if they can be rearranged into a real word by checking them against the dictionary

That is different. How fast do you want this to be? How many words are in the dictionary, and how many words, up to what length, do you want to check?

In case you want a spellchecker (which is not entirely clear from your question), Jazzy is a spellchecker for Java that has links to a lot of dictionaries. It's not bad, but the various implementations are horribly inefficient (it's OK for small dictionaries, but it's an amazing waste when you have several hundred thousand words).

Now if you just want to solve the specific problem you describe, you can:

  • parse the dictionary file and create a map : (letters in sorted order, set of matching words)

  • then for any number of random letters: sort them and see if you have an entry in the map (if you do, the entry's value contains all the words that you can make with these letters).

    abracadabra : (aaaaabbcdrr, (abracadabra))

    carthorse : (acehorrst, (carthorse) )

    orchestra : (acehorrst, (carthorse,orchestra) )

etc...

Now you take, say, nine random letters and get "hsotrerca". You sort them to get "acehorrst", and using that as a key you get all the (valid) anagrams...

This works because what you described is a special (easy) case: all you need to do is sort your letters and then use an O(1) map lookup.
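The approach above could be sketched in Java roughly as follows. This is only a minimal illustration, not a definitive implementation: the class name `AnagramIndex` and its methods are made up for this example, and the three-word list stands in for a real dictionary file.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper class; the name and methods are illustrative only.
class AnagramIndex {
    // Maps "letters in sorted order" to the set of words made of exactly those letters.
    private final Map<String, Set<String>> wordsByKey = new HashMap<>();

    AnagramIndex(Iterable<String> words) {
        for (String word : words) {
            String cleaned = word.trim().toLowerCase(Locale.ROOT);
            if (!cleaned.isEmpty()) {
                wordsByKey.computeIfAbsent(sortLetters(cleaned), k -> new TreeSet<>())
                          .add(cleaned);
            }
        }
    }

    // "carthorse" becomes "acehorrst"
    static String sortLetters(String letters) {
        char[] chars = letters.toCharArray();
        Arrays.sort(chars);
        return new String(chars);
    }

    // Returns every known word that uses exactly the given letters, or an empty set.
    Set<String> lookup(String letters) {
        String key = sortLetters(letters.toLowerCase(Locale.ROOT));
        return wordsByKey.getOrDefault(key, Collections.emptySet());
    }

    public static void main(String[] args) {
        // Assumption: a tiny in-memory word list instead of a downloaded dictionary.
        AnagramIndex index =
                new AnagramIndex(List.of("abracadabra", "carthorse", "orchestra"));
        System.out.println(index.lookup("hsotrerca")); // prints [carthorse, orchestra]
        System.out.println(index.lookup("zzz"));       // prints []
    }
}
```

With a real word list (one word per line), the constructor could presumably be fed the result of `Files.readAllLines(Path.of("words.txt"))`, where `words.txt` is a placeholder file name. Note that this only finds words that use all of the given letters; finding shorter words hidden in the letters would need extra work.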

To handle more complicated spell checking, where the input may contain errors, you need something that comes up with "candidates" (known words that the misspelled input may have been meant to be) [for example using the Soundex, Metaphone or Double Metaphone algorithms], and then something like the Levenshtein edit-distance algorithm to check candidates against known good words (or the much more complicated tree built on Levenshtein edit distance that Google uses for its "find as you type"):

http://en.wikipedia.org/wiki/Levenshtein_distance
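For reference, here is a minimal sketch of the plain Levenshtein edit distance in Java, using the textbook dynamic-programming formulation with two rows. The class name `EditDistance` is made up for this example, and this is not how Jazzy or Google implement it; it only shows the basic distance computation that a candidate check could build on.

```java
// Hypothetical helper class; a textbook dynamic-programming edit distance.
class EditDistance {
    // Number of single-character insertions, deletions and substitutions
    // needed to turn a into b.
    static int levenshtein(String a, String b) {
        int[] previous = new int[b.length() + 1];
        int[] current = new int[b.length() + 1];
        // Turning the empty prefix of a into the first j characters of b takes j insertions.
        for (int j = 0; j <= b.length(); j++) {
            previous[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            current[0] = i; // deleting the first i characters of a
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int substitution = previous[j - 1] + cost;
                int deletion = previous[j] + 1;
                int insertion = current[j - 1] + 1;
                current[j] = Math.min(substitution, Math.min(deletion, insertion));
            }
            int[] swap = previous;
            previous = current;
            current = swap;
        }
        return previous[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("kitten", "sitting"));       // prints 3
        System.out.println(levenshtein("mispelled", "misspelled")); // prints 1
    }
}
```

Running this against every word of a large dictionary is slow, which is why a cheaper step for picking candidates first, as described above, matters.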

As a funny sidenote, optimized dictionary representations can store hundreds of thousands and even millions of words in less than 10 bits per word (yup, you read that correctly: less than 10 bits per word) and yet allow very fast lookup.

SyntaxT3rr0r answered Dec 12 '22 05:12


Dictionaries are usually programming-language agnostic. If you google for one without the keyword "java", you may get better results. E.g. searching for "free dictionary download" gives, among others, dicts.info.

BalusC answered Dec 12 '22 05:12