This is an interview question: find all substrings of a given string that are English words. For example, "every" contains "every", "ever", and "very".
Obviously, we can loop over all substrings and check each one against an English dictionary organized as a set. I believe the dictionary is small enough to fit in RAM. How should the dictionary be organized? As far as I remember, the original spell command loaded the words file into a bitmap that represented the set of the words' hash values. I would start from that.
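The brute-force idea can be sketched roughly as follows. This is only a minimal illustration: it uses a plain Python set rather than a hash bitmap, and the function name `find_words_bruteforce` and the tiny sample dictionary are made up for the example.

```python
def find_words_bruteforce(text, dictionary):
    """Return every substring of text that is in dictionary (a set of words)."""
    # No substring longer than the longest dictionary word can match,
    # so the inner loop is capped at that length.
    max_len = max((len(word) for word in dictionary), default=0)
    found = []
    for start in range(len(text)):
        last_end = min(len(text), start + max_len)
        for end in range(start + 1, last_end + 1):
            candidate = text[start:end]
            if candidate in dictionary:  # average O(len(candidate)) set lookup
                found.append(candidate)
    return found


# Hypothetical three-word dictionary, just for the demonstration.
sample_words = {"every", "ever", "very"}
print(sorted(find_words_bruteforce("every", sample_words)))
```

Capping the substring length at the longest dictionary word keeps the number of lookups proportional to the string length times that maximum, instead of quadratic in the string length.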
Another solution is a trie built from the dictionary. Using the trie, we can loop over the characters of the string and search the trie starting from each one. I guess the complexity of this solution would be the same in the worst case, O(n^2).
Does it make sense? Would you suggest other solutions?
If the length of a string is N, then there are N – K + 1 substrings of length K. Generating these substrings takes O(N) time and checking each one takes O(K) time, for an overall complexity of O(N*K). Efficient approach: the idea is to use the sliding window technique.
The function strstr returns the first occurrence of a string in another string. This means that strstr can be used to detect whether a string contains another string. In other words, whether a string is a substring of another string.
A substring is a contiguous sequence of characters within another string.
Consider the Aho-Corasick string matching algorithm, which "constructs a finite state machine that resembles a trie with additional links between the various internal nodes."
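For reference, here is a compact sketch of an Aho-Corasick style matcher. It is an illustration under simplifying assumptions (list-of-dicts state tables, non-empty words), and the names `build_automaton` and `find_words_aho_corasick` are invented for this example.

```python
from collections import deque


def build_automaton(words):
    """Build goto, failure, and output tables for the given words."""
    goto = [{}]   # goto[state][char] -> next state; state 0 is the root
    fail = [0]    # fail[state] -> state for the longest proper suffix in the trie
    out = [[]]    # out[state] -> words that end when this state is reached
    for word in words:
        state = 0
        for ch in word:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(word)
    # Breadth-first pass: depth-1 states keep fail = 0, deeper ones are derived.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit the matches reachable through the failure link.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_words_aho_corasick(text, words):
    """Return (start_index, word) pairs for every occurrence, in one pass."""
    goto, fail, out = build_automaton(words)
    found = []
    state = 0
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for word in out[state]:
            found.append((i - len(word) + 1, word))
    return found


print(find_words_aho_corasick("every", ["every", "ever", "very"]))
```

The text is scanned once, so after the automaton is built the running time is linear in the length of the text plus the number of matches reported.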
But, everything considered, the "build a trie from the English dictionary and do a simultaneous search on it for all suffixes of the given string" approach should be pretty good for an interview.
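A minimal sketch of that trie approach might look like the following. It assumes a nested-dict trie with a private end-of-word marker, and the names `build_trie` and `find_words_trie` are hypothetical. For simplicity it walks the suffixes one after another rather than truly simultaneously.

```python
_END = object()  # end-of-word marker; can never collide with a character key


def build_trie(words):
    """Build a nested-dict trie from an iterable of words."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[_END] = True
    return root


def find_words_trie(text, trie):
    """Walk the trie from every start position, i.e. along every suffix."""
    found = []
    for start in range(len(text)):
        node = trie
        for end in range(start, len(text)):
            node = node.get(text[end])
            if node is None:
                break  # no dictionary word continues with this prefix
            if _END in node:
                found.append(text[start:end + 1])
    return found


# Hypothetical three-word dictionary, just for the demonstration.
trie = build_trie(["every", "ever", "very"])
print(find_words_trie("every", trie))
```

Each walk stops as soon as the current prefix leaves the trie, so the work per start position is bounded by the length of the longest dictionary word rather than by the length of the string.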