Given
a dictionary full of words {in, july, den, dentist, best, ...} with some C++ API to access it: boolean findWord(string word), or string getNextWord(void) to iterate through it,
some input string with no spaces, e.g.: bestdentistinjuly...
Output
best dentist in july is...
(basically, separate the unspaced string with spaces, given a dictionary)
What would be the best algorithm to solve it?
A subtle but important question: is there any fancy way to solve the unreachable dead-end problem? E.g., den and dentist are both valid words with which to start dissecting the rest of the string, but one of them may just be a dead end.
To me it seems like a greedy problem, or something solvable by dynamic programming.
Use a Trie to store the dictionary. You can see a simple implementation (C#) at How to create a trie in c#
You're going to need to do a search because you don't know if you are on the right track until you have considered the whole input string. You'll need to iterate through the input string, at the same time as descending into the trie. When you get to a terminal node of the trie, you have a branch in your search process: you need to both treat that as the end of a word and treat it as the first part of a longer word.
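The search described above can be sketched in C++ roughly as follows. This is only a minimal illustration, not the linked C# implementation: the names TrieNode, insertWord and segmentFrom are made up for this example, and the trie is assumed to be built up front from the dictionary words (for instance by iterating with getNextWord).

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical minimal trie node: children keyed by character,
// plus a flag marking the end of a dictionary word (a "terminal node").
struct TrieNode {
    std::map<char, std::unique_ptr<TrieNode>> children;
    bool isWord = false;
};

// Adds one dictionary word to the trie.
void insertWord(TrieNode& root, const std::string& word) {
    TrieNode* node = &root;
    for (char c : word) {
        auto& child = node->children[c];
        if (!child) child = std::make_unique<TrieNode>();
        node = child.get();
    }
    node->isWord = true;
}

// Tries to split input[pos..] into dictionary words, appending them to out.
// Returns true as soon as one complete segmentation is found.
bool segmentFrom(const TrieNode& root, const std::string& input,
                 std::size_t pos, std::vector<std::string>& out) {
    if (pos == input.size()) return true;  // whole input consumed
    const TrieNode* node = &root;
    for (std::size_t i = pos; i < input.size(); ++i) {
        auto it = node->children.find(input[i]);
        if (it == node->children.end()) return false;  // no word continues this way
        node = it->second.get();
        if (node->isWord) {
            // Branch 1: treat input[pos..i] as a complete word.
            out.push_back(input.substr(pos, i - pos + 1));
            if (segmentFrom(root, input, i + 1, out)) return true;
            // Dead end. Branch 2: undo and keep extending into a longer word.
            out.pop_back();
        }
    }
    return false;
}
```

With the words in, july, den, dentist and best inserted, calling segmentFrom(root, "bestdentistinjuly", 0, out) first tries den, fails on the remaining "tistinjuly", backs up, and then succeeds with dentist. Descending the trie while scanning means a prefix that cannot start any word is abandoned immediately instead of being looked up letter by letter.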
You can create a kind of word tree:
You can go through the string with no spaces. Once you find a word in your list, you add a space and you continue... until you cannot go any further.
Then you go back to the previous word and see whether, by adding more letters, you can make a longer word; if you can, you continue from there.
You repeat this until you have tried all the possibilities.
If you get back to the starting word and still don't find any solution => no solution.
Here is the algorithm (my pseudocode syntax is not good, but you can get the general idea; I believe you will have to correct it a little):
TreeWordResult  // tree that keeps the results in memory
Go through the InputString:
    If you find a word in the InputDictionary:
        add this word under the last node of TreeWordResult
    Otherwise:
        while (no word found and TreeWordResult can still go back):
            go back one node in TreeWordResult
            try to find a word in InputDictionary different from the one tried before (it goes on another node)
        endwhile
        if no word found:
            return NO SOLUTION
        otherwise:
            continue going through InputString
        endif
    endif
return Leaf
The algorithm ends when you find there is no solution, or when you are at a "leaf" (you went through the whole string).
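For what it's worth, the same backtracking idea might look like this in C++. It is a sketch under assumptions: findWord here is a stand-in for the API in the question, backed by a small hard-coded set, and splitFrom, segment and the deadEnd table are names invented for this example. Instead of keeping an explicit result tree, the recursion stack plays that role, and the deadEnd table remembers positions from which no split works, which is one way to handle the den/dentist dead-end concern without retrying the same suffix.

```cpp
#include <string>
#include <unordered_set>
#include <vector>

// Stand-in for the dictionary API from the question (assumed to be a set lookup).
static const std::unordered_set<std::string> kDictionary = {
    "in", "july", "den", "dentist", "best"};

bool findWord(const std::string& word) {
    return kDictionary.count(word) > 0;
}

// Backtracking search over split points. deadEnd[p] records that no
// segmentation of input[p..] exists, so that suffix is never retried.
bool splitFrom(const std::string& input, std::size_t pos,
               std::vector<bool>& deadEnd, std::vector<std::string>& words) {
    if (pos == input.size()) return true;  // reached a "leaf": whole string consumed
    if (deadEnd[pos]) return false;
    for (std::size_t len = 1; pos + len <= input.size(); ++len) {
        std::string candidate = input.substr(pos, len);
        if (!findWord(candidate)) continue;
        words.push_back(candidate);
        if (splitFrom(input, pos + len, deadEnd, words)) return true;
        words.pop_back();  // go back and try a longer word instead
    }
    deadEnd[pos] = true;   // nothing starting at pos leads to a solution
    return false;
}

// Returns the words joined by spaces, or an empty string if there is no solution.
std::string segment(const std::string& input) {
    std::vector<bool> deadEnd(input.size() + 1, false);
    std::vector<std::string> words;
    if (!splitFrom(input, 0, deadEnd, words)) return "";
    std::string result;
    for (const std::string& w : words) {
        if (!result.empty()) result += ' ';
        result += w;
    }
    return result;
}
```

Calling segment("bestdentistinjuly") walks through best, tries den, marks the position after den as a dead end, then continues with dentist, in and july. Because each start position is marked as a dead end at most once, the number of findWord calls grows roughly with the square of the input length rather than exponentially.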
Here is an illustration using your example:
Hope my explanation is clear.