Source : Microsoft Interview Question
We are given a File containing words.We need to determine all the Anagrams Present in it .
Can someone suggest most optimal algorithm to do this.
Only way i know is Sorting all the words,then checking .
It would be good to know more about data before suggesting an algorithm, but lets just assume that the words are in English in the single case.
Lets assign each letter a prime number from 2 to 101. For each word we can count it's "anagram number" by multiplying its letter corresponding numbers.
Lets declare a dictionary of {number, list} pairs. And one list to collect resulting anagrams into.
Then we can collect anagrams in two steps: simply traverse through the file and put each word to a dictionary's list according to its "anagram number"; traverce the map and for every pairs list with length more then 1 store it's contents in a single big anagram list.
UPDATE:
import operator
words = ["thore", "ganamar", "notanagram", "anagram", "other"]
letter_code = {'a':2, 'b':3, 'c':5, 'd':7, 'e':11, 'f':13, 'g':17, 'h':19, 'i':23, 'j':29, 'k':31, 'l':37, 'm':41, 'n':43,
'o':47, 'p':53, 'q':59, 'r':61, 's':67, 't':71, 'u':73, 'v':79, 'w':83, 'x':89, 'y':97, 'z':101}
def evaluate(word):
return reduce( operator.mul, [letter_code[letter] for letter in word] )
anagram_map = {}
anagram_list = []
for word in words:
anagram_number = evaluate(word)
if anagram_number in anagram_map:
anagram_map[ anagram_number ] += [word]
else:
anagram_map[ anagram_number ] = [word]
if len(anagram_map[ anagram_number ]) == 2:
anagram_list += anagram_map[ anagram_number ]
elif len(anagram_map[ anagram_number ]) > 2:
anagram_list += [ word ]
print anagram_list
Of course the implementation can be optimized further. For instance, you don't really need a map of anagrams, just a counters would do fine. But I guess the code illustrates the idea best as it is.
You can use "Tries".A trie (derived from retrieval) is a multi way search tree. Tries use pattern matching algorithms. It's basic use is to create spell check programs, but I think it can help your case.. Have a look at this link http://ww0.java4.datastructures.net/handouts/Tries.pdf
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With