The brute force way can solve the problem in O(n!), basically calculating all the permutations and checking the results in a dictionary. I am looking for ways to improve the complexity. I can think of building a tree out of the dictionary but still checking all letters permutations is O(n!). Are there better ways to solve this problem?
Letters can have duplicates.
The api for the function looks like this:
List<String> findValidWords(Dict dict, char letters[])
My English is not good so try to understand.
My approach is using bit/bitwise to increase speed. Still bruteforce, though.
FIRST STEP
We only consider distinct character in each word and mark its existence. English has 26 characters, so we need 26 bits. Integer is 32 bits. That's enough.
Now encode each words in dictionary to an integer number.
abcdddffg -> 123444667 -> 123467 (only distinct characters) -> 1111011 (bits) -> 123 (decimal number)
So 2,000,000 words will be converted into 2,000,000 integer numbers.
Now let say you have this set of letters: a,b,c,d,e
abcde -> 12345 -> 1111100 (bits)
Do AND operation and we have:
1111100 (abcde)
&
1111011 (abcdddffg, no e)
=
1111000 (result) => result != abcdddffg => word cannot be created
Other example with a,b,c,d,e,f,g,h:
11111111 (abcdefgh)
&
11110110 (abcdddffg, no e and h)
=
11110110 (result) => result == abcdddffg => word can be created
SECOND STEP
While converting word to number, store the letter count also. If we found a match in first step, we continue to check if the number of letters is enough too.
Depend on the requirement, you might not need this second step.
COMPLEXITY
You can sort each word in your dictionary so that the letters appear in the same order as they do in the alphabet, and then build a trie out of your sorted words. (where each node contains a list of all words that can be made out of the letters). (linear time in total letter length of dictionary) Then, given a set of query letters, sort the letters the same way and proceed through the trie using depth first search in all possible directions that use a subset of your letters from left to right. Any time you reach a node in the trie that contains words, output those words. Each path you explore can be charged to at least one word in the dictionary, so the worst case complexity to find all nodes that contain words you can make is O(kn) where n is the number of words in the dictionary and k is the maximum number of letters in a word. However for somewhat restricted sets of query letters, the running time should be much faster per query.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With